[tex-live] [luatex] [lltx] Location of recorder file

Philipp Stephani st_philipp at yahoo.de
Thu May 19 07:55:30 CEST 2011


On 19.05.2011 at 01:19, Reinhard Kotucha wrote:

> texlua doesn't
> support locales, for the sake of portability.

I think it does support locales; it simply doesn't fetch them from the environment by default. Calling os.setlocale("") should work.
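To illustrate what I mean at the C level (Lua's os.setlocale is a thin layer over C's setlocale): a C program starts in the "C" locale and only picks up the user's settings once it asks for them with an empty locale name. The following is just a minimal sketch; the helper names in_c_locale and adopt_environment_locale are made up for this example.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Returns 1 if the program is currently running in the default "C" locale.
   Passing NULL only queries the current setting, it changes nothing. */
int in_c_locale(void)
{
    const char *current = setlocale(LC_ALL, NULL);
    assert(current != NULL);
    return strcmp(current, "C") == 0;
}

/* Hypothetical helper: adopt whatever locale the environment specifies
   (LANG, LC_ALL, ... on Unix-like systems). The empty string means
   "use the environment". Returns the resulting locale name, or NULL
   if the environment names a locale the system does not know. */
const char *adopt_environment_locale(void)
{
    return setlocale(LC_ALL, "");
}
```

Before adopt_environment_locale() is called, in_c_locale() reports 1 regardless of the user's settings. What it reports afterwards depends entirely on the environment.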

>> The problem remains that unless you wrap CreateProcessW or at least
>> wspawn* directly, you never get Unicode support on
>> Windows. Therefore you have to add platform-specific code
>> anyway. It's a bit less if you call spawn/wspawn, but it's still
>> there.  If you really want to make Lua truly platform-independent,
>> you have to add those Unicode wrapper functions for Windows. No
>> Unicode support on a Unicode operating system is almost
>> unacceptable.
> 
> In principle, yes.  But I'm sceptical in respect of Unicode file
> names.  Sounds non-trivial.

It is. :-(
The coding itself is trivial (e.g. using _wfopen instead of fopen on Windows), but the conceptual problem lies deeper.

On Unix-like systems, C strings (char*) are just byte sequences without any direct textual interpretation; the interpretation as a text string is locale-based. Therefore you can pass any byte sequence to the kernel's open syscall, regardless of whether it has a useful interpretation in the current locale. In general you can't tell what encoding a char* uses.

On Windows, however, strings always use UTF-16: the kernel expects UTF-16 for file names, the console works with UTF-16 strings (actually UCS-2), UI controls return their textual content in UTF-16, etc. Therefore you always know the encoding of a given text string on Windows. If you use char* arrays, you can use UTF-8, no problem, but then *everybody* has to agree on this.

This is the non-trivial part: you have to follow each string back to its source, and there you have to make sure that you get the right encoding. But there are many such sources: the command line, configuration files, \input commands, ...
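To make the "just byte sequences" point concrete: the only thing a program can do with a char* of unknown origin is check whether it happens to be well-formed UTF-8. It cannot find out what the author intended. Here is a rough sketch of such a check; the function name is_valid_utf8 is my own invention, not something from Lua or LuaTeX.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the NUL-terminated byte string is well-formed UTF-8, else 0.
   A Unix kernel would accept either kind of string as a file name. */
int is_valid_utf8(const char *str)
{
    /* Smallest code point that legitimately needs 1, 2, 3 or 4 bytes. */
    static const unsigned long min_cp[] = { 0x0, 0x80, 0x800, 0x10000 };
    const unsigned char *s = (const unsigned char *)str;

    assert(str != NULL);
    while (*s != 0) {
        unsigned long cp;
        int extra, i;

        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else return 0;  /* stray continuation byte or invalid lead byte */
        s++;

        for (i = 0; i < extra; i++, s++) {
            /* Also catches a string that ends in the middle of a sequence. */
            if ((*s & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (unsigned long)(*s & 0x3F);
        }

        if (cp < min_cp[extra]) return 0;             /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* UTF-16 surrogate  */
        if (cp > 0x10FFFF) return 0;                  /* beyond Unicode    */
    }
    return 1;
}
```

For example, "t\xc3\xa4st.tex" (an a-umlaut encoded as UTF-8) passes the check, while "t\xe4st.tex" (the same letter in Latin-1) does not, although both are perfectly good file names on a Unix-like system.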

> Does io.open() in stock Lua convert UTF-8
> encoded file names to UTF-16 on Windows?

I haven't looked at the code, but I'm sure it doesn't. Lua uses only standard C functions (except for dynamic linking), and io.open is just fopen, which wraps CreateFileA, which expects a legacy-encoded string (Windows-1252 on "western" systems). So changing io.open to use _wfopen instead would break compatibility with stock Lua.
I think this is something worth pursuing, but certainly not for TeX Live 2011.
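
For the record, the kind of wrapper I have in mind would look roughly like this. It is only a sketch under the assumption that everybody agrees on UTF-8 for char* strings; fopen_utf8 and utf8_to_utf16 are hypothetical names, and neither stock Lua nor LuaTeX contains such a function as far as I know.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <wchar.h>
#include <windows.h>

/* Convert a NUL-terminated UTF-8 string to a newly allocated UTF-16 string.
   Returns NULL if the input is not valid UTF-8 or memory runs out. */
static wchar_t *utf8_to_utf16(const char *s)
{
    wchar_t *w;
    /* First call: ask for the required length, including the final NUL. */
    int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, NULL, 0);
    if (n <= 0) return NULL;
    w = malloc((size_t)n * sizeof *w);
    if (w != NULL &&
        MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, w, n) <= 0) {
        free(w);
        w = NULL;
    }
    return w;
}
#endif

/* Hypothetical helper: open a file whose name is given in UTF-8. */
FILE *fopen_utf8(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
#ifdef _WIN32
    {
        /* Windows: hand the name to the wide-character API as UTF-16. */
        wchar_t *wpath = utf8_to_utf16(path);
        wchar_t *wmode = utf8_to_utf16(mode);
        FILE *f = (wpath != NULL && wmode != NULL) ? _wfopen(wpath, wmode) : NULL;
        free(wpath);
        free(wmode);
        return f;
    }
#else
    /* Unix-like systems: the kernel takes the bytes as they are. */
    return fopen(path, mode);
#endif
}
```

The platform-specific part is small, as I said, but it changes what a char* file name means on Windows: a name in the legacy code page that is not also valid UTF-8 would now be rejected, which is exactly the compatibility break with stock Lua.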

