[tex-live] Luatex formats and hyphenation patterns

Heiko Oberdiek oberdiek at uni-freiburg.de
Tue Feb 23 22:22:57 CET 2010


On Tue, Feb 23, 2010 at 12:35:50PM +0200, Élie Roux wrote:

> 2010/2/23 Karl Berry <karl at freefriends.org>:
> >
> > Aside from anything else, it's certainly inadvisable to duplicate the
> > entire file.  Instead, a conditional should be used that we can
> > set/unset in the different formats.  I realize you were just doing the
> > expedient thing for an experiment, but if we want to ship a change like
> > this, it should be done in a clean way.
> >
> > Indeed.  If you, or someone, can do that, we can consider adopting the
> > change for plain luatex.
> 
> thank you Khaled for taking care of this issue! Here is a new
> attempt, taking into account Karl's remarks... the changes are now
> integrated into etex.src, the patterns are loaded only once, I added a
> log message and I also changed \fmtversion not to have the languages
> in it (in order for it to be coherent during all the compilation)...
> but I'm not 100% sure it's a good idea, especially since I'm not
> familiar with Plain logs. Wdyt?
> 
> Thank you,
> -- 
> Elie

> --- etexold.src	2010-02-23 09:46:04.401120400 +0200
> +++ etex.src	2010-02-23 10:31:53.271872000 +0200
> @@ -275,6 +275,18 @@
>          \language=\csname lang@#1\endcsname
>          \lefthyphenmin=\csname lhm@#1\endcsname
>          \righthyphenmin=\csname rhm@#1\endcsname
> +        % special case of LuaTeX: we load patterns at run-time
> +        \ifcsname directlua\endcsname
> +            % loading patterns if not loaded yet
> +            \ifcsname plded@#1\endcsname\else
> +                \input \csname pat@#1\endcsname
> +                \ifcsname exp@#1\endcsname
> +                    \input \csname exp@#1\endcsname
> +                \fi
> +                \expandafter \def \csname plded@#1\endcsname {1}

* \patterns (and \hyphenation) are *global* assignments,
  therefore \gdef should be used to define "\plded@#1".

* Loading at runtime means that the catcodes are no longer known.
  Before loading a file, the correct catcodes must be set,
  and they must be restored after loading. If there is resource
  management for catcode tables, then a catcode table can be used
  for this purpose.
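
The two points above can be sketched as follows. This is only a rough,
untested illustration, not a proposed patch: the macro name
\loadpatternsonce is made up, and only a handful of catcodes are reset
here. A real version would have to cover every character the pattern
files rely on (or switch to a catcode table instead).

```tex
% \loadpatternsonce is a hypothetical helper, not part of etex.src.
% It relies on \pat@<lang> and \exp@<lang> as used in the patch above.
\def\loadpatternsonce#1{%
  \ifcsname plded@#1\endcsname\else
    \begingroup
      % Assumed minimal set of catcodes; the group restores the
      % previous values once the files have been read.
      \catcode`\\=0 \catcode`\{=1 \catcode`\}=2
      \catcode`\%=14 \catcode`\ =10 \catcode`\^^M=5
      \input \csname pat@#1\endcsname\relax
      \ifcsname exp@#1\endcsname
        \input \csname exp@#1\endcsname\relax
      \fi
    \endgroup
    % \patterns and \hyphenation are global, so the flag is global too.
    \expandafter\gdef\csname plded@#1\endcsname{1}%
  \fi}
```

Because \patterns and \hyphenation assign globally, the group only
undoes the catcode changes (and anything else the pattern file sets
locally), not the loaded patterns themselves.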

LuaTeX also provides the library `lang'. If I remember Taco correctly,
patterns passed as a plain Lua string load faster than via \patterns.
Loading the patterns via Lua also solves the catcode problem.
It is probably not too difficult for the hyph-utf8 project
to provide Lua pattern input files that are generated automatically
from the project data?
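
A rough sketch of what loading via the `lang' library could look like,
assuming a file that contains nothing but whitespace-separated UTF-8
patterns. The file name "hyph-en-us.pat.txt" is only an assumption for
illustration, and I have not tested this:

```tex
% Sketch only: the file name below is assumed, not an existing
% convention. No Lua comments are used inside \directlua, because
% line ends become spaces there and "--" would swallow the rest.
\directlua{
  local path = kpse.find_file("hyph-en-us.pat.txt")
  if path then
    local f = io.open(path, "r")
    local l = lang.new()
    lang.patterns(l, f:read("*a"))
    f:close()
    tex.language = lang.id(l)
  end
}
```

Since the file content is handed over as a Lua string, the TeX
catcodes in effect at that moment play no role for the patterns.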

A strict naming convention can help to detect whether the
patterns for a language are available as Lua code and/or TeX code.

And not to forget: hyphenation depends on the font encoding ...

Unfortunately, I don't have time in the coming weeks for implementing
and experimenting.

Yours sincerely
  Heiko <oberdiek at uni-freiburg.de>

