[tex-live] TL expl3 update broke a mwe for me

Ingo Krabbe ikrabbe.ask at gmail.com
Sun Jan 3 08:34:17 CET 2016


> ! Undefined control sequence.
> <argument> ...or: for example, they allow "MASSE" and "Maß
>                                                   e" to match. 
> l.4195   \__unicode_map_inline:n { CaseFolding.txt }

This looks like an encoding error. It would help if you piped the strange output through od or xxd, for example, so that the actual bytes are visible.

Your non-ASCII sequence seems to be C3 83 C2 9F, which looks like text that was UTF-8 encoded twice, or something similar. I would bet that the encoding of your mail, of your system, or of the CaseFolding.txt file is wrong.
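As a rough sketch of how such a sequence can arise (Python, not taken from the original report): if the UTF-8 bytes of ß are misread as Latin-1 and then encoded as UTF-8 again, the result is exactly these four bytes. The Latin-1 step is an assumption about what went wrong, and hexdump is a hypothetical helper name.

```python
def hexdump(data: bytes) -> str:
    """Return the bytes as space-separated upper-case hex, similar to xxd."""
    return " ".join(f"{b:02X}" for b in data)

sharp_s = "\u00df"                      # ß, LATIN SMALL LETTER SHARP S
once = sharp_s.encode("utf-8")          # correct UTF-8 encoding
# Assumption: the bytes were misinterpreted as Latin-1, then encoded again.
twice = once.decode("latin-1").encode("utf-8")

print(hexdump(once))    # C3 9F
print(hexdump(twice))   # C3 83 C2 9F
```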

Written in binary, the bytes above are:

	11000011 10000011

and

	11000010 10011111

These are quickly converted into ASCII / Unicode code points under the guessed UTF-8 encoding:

          01.   x in [000000.00000000.0bbbbbbb] → 0bbbbbbb
          10.   x in [000000.00000bbb.bbbbbbbb] → 110bbbbb, 10bbbbbb
          11.   x in [000000.bbbbbbbb.bbbbbbbb] → 1110bbbb, 10bbbbbb, 10bbbbbb
          100.  x in [0bbbbb.bbbbbbbb.bbbbbbbb] → 11110bbb, 10bbbbbb, 10bbbbbb, 10bbbbbb

where we only need the second rule (10) here.
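For illustration, the four rules can be written out as a small Python sketch. The function name utf8_encode is hypothetical, and the sketch ignores the surrogate range, which a strict encoder would reject.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point following the four bit-layout rules."""
    if cp < 0x80:            # rule 01: 7 bits, one byte
        return bytes([cp])
    if cp < 0x800:           # rule 10: 11 bits, two bytes
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:         # rule 11: 16 bits, three bytes
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    if cp < 0x110000:        # rule 100: 21 bits, four bytes
        return bytes([0xF0 | (cp >> 18),
                      0x80 | ((cp >> 12) & 0x3F),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    raise ValueError("code point out of range")

# ß (U+00DF) falls under the second rule.
print(utf8_encode(0xDF).hex(" "))   # c3 9f
```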

	decode_utf8(11000011 10000011) = 000 1100 0011 = C3
	decode_utf8(11000010 10011111) = 000 1001 1111 = 9F

The result, C3 9F, is again a UTF-8 sequence (another guess).

	decode_utf8(11000011 10011111) = 000 1101 1111 = DF

	Unicode U+00DF = ß	(LATIN SMALL LETTER SHARP S)

So the text in the error message should read: they allow "MASSE" and "Maße" to match.
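The two decoding steps can be checked with a short Python sketch. decode_two_byte is a hypothetical helper that implements only the two-byte rule; it is not a general UTF-8 decoder.

```python
def decode_two_byte(b1: int, b2: int) -> int:
    """Decode a two-byte UTF-8 sequence (110bbbbb 10bbbbbb) to a code point."""
    if (b1 & 0b11100000) != 0b11000000 or (b2 & 0b11000000) != 0b10000000:
        raise ValueError("not a two-byte UTF-8 sequence")
    # Five payload bits from the lead byte, six from the continuation byte.
    return ((b1 & 0b00011111) << 6) | (b2 & 0b00111111)

first = decode_two_byte(0xC3, 0x83)     # first pass, first pair
second = decode_two_byte(0xC2, 0x9F)    # first pass, second pair
final = decode_two_byte(first, second)  # second pass over the result

print(f"{first:02X} {second:02X} -> {final:02X} = {chr(final)}")
# C3 9F -> DF = ß
```

The same result comes out of the built-in codecs, assuming the intermediate misreading was Latin-1: bytes.fromhex("C383C29F").decode("utf-8").encode("latin-1").decode("utf-8") gives "ß".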

First shot: what is your system encoding? Most systems now use UTF-8. Check your locale by typing locale. This is the output on my system:

	# locale
	LANG=en_US.UTF-8
	LC_CTYPE=de_DE.UTF-8
	LC_NUMERIC=de_DE.UTF-8
	LC_TIME=de_DE.UTF-8
	LC_COLLATE=de_DE.UTF-8
	LC_MONETARY=de_DE.UTF-8
	LC_MESSAGES="en_US.UTF-8"
	LC_PAPER="en_US.UTF-8"
	LC_NAME="en_US.UTF-8"
	LC_ADDRESS="en_US.UTF-8"
	LC_TELEPHONE="en_US.UTF-8"
	LC_MEASUREMENT="en_US.UTF-8"
	LC_IDENTIFICATION="en_US.UTF-8"
	LC_ALL=

Try your example with a UTF-8 system encoding.
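If Python is at hand, the encoding derived from the locale can also be checked with a few lines. This is only a sketch: is_utf8 is a hypothetical helper, and what the TeX engine itself assumes about file encodings may differ from what Python reports.

```python
import codecs
import locale

def is_utf8(encoding_name: str) -> bool:
    """True if the name is a known alias of UTF-8 (e.g. 'UTF-8', 'utf8')."""
    try:
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        return False

# Encoding that Python derives from the current locale settings.
preferred = locale.getpreferredencoding(False)
print(preferred, "is UTF-8" if is_utf8(preferred) else "is NOT UTF-8")
```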

regards

ingo


