[XeTeX] Table of contents
Ross Moore
ross at ics.mq.edu.au
Sat May 1 00:51:46 CEST 2010
Hi Jonathan,
On 01/05/2010, at 4:06 AM, Jonathan Kew wrote:
> The problem is that at this point, the .aux file is read *with*
> your \XeTeXdefaultencoding declaration in force, so the individual
> utf-8 bytes that were written to it now get interpreted as cp1252
> characters and mapped to their Unicode values, instead of the byte
> sequences being interpreted as utf-8. That's the source of the
> "junk" you're getting. Those utf-8-bytes-interpreted-as-cp1252 then
> get re-encoded to utf-8 sequences as the .toc is written, so in
> effect the original characters have been "doubly encoded".
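To see it concretely with a single character (my sketch,
taking "é" = U+00E9; the byte values are from the Unicode
and cp1252 code charts):

    written to the .aux as utf-8:       0xC3 0xA9
    re-read as cp1252:                  0xC3 -> "Ã" (U+00C3),  0xA9 -> "©" (U+00A9)
    re-encoded to utf-8 for the .toc:   0xC3 0x83  0xC2 0xA9   i.e. "Ã©"

which is why "Général" comes out as "GÃ©nÃ©ral".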
This sounds like a pretty generic kind of problem, ...
>
> In this particular case, at least, you can work around the problem
> by resetting the default encoding immediately before the end of the
> document, so that when LaTeX reads in the .aux file at the end of
> the run, it reads it correctly as utf-8. In other words, if you
> modify this example to become:
>
> \documentclass[10pt,a4paper]{book}
> \usepackage[frenchb]{babel}
> \usepackage{fontspec}
> \usepackage{xunicode}
> \usepackage{xltxtra}
> \begin{document}
> \frontmatter
> \tableofcontents
> \XeTeXinputencoding "cp1252"
> \XeTeXdefaultencoding "cp1252"
> \mainmatter\setcounter{secnumdepth}{2}
> \chapter{Général de Gaulle}
> Il était français.
> \XeTeXdefaultencoding "utf-8"
> \end{document}
>
> then your table of contents should correctly show "Général".
... so that the best solution might be to include
a command such as:
\AtEndDocument{\XeTeXdefaultencoding "utf-8"}
in the xltxtra package, so that this is always done,
and authors do not need to worry about it.
Note that the \@enddocumenthook is expanded almost
immediately after the \end{document} has been encountered,
and certainly before the .aux file is closed for writing
and re-opened for reading.
viz. (from latex.ltx):
>>> \def\enddocument{%
>>>   \let\AtEndDocument\@firstofone
>>>   \@enddocumenthook
>>>   \@checkend{document}%
>>>   \clearpage
>>>   \begingroup
>>>     \if@filesw
>>>       \immediate\closeout\@mainaux
>>>       \let\@setckpt\@gobbletwo
>>>       \let\@newl@bel\@testdef
>>>       \@tempswafalse
>>>       \makeatletter \@@input\jobname.aux
>>>     \fi
> However, there may be other situations where auxiliary files are
> written and read at unpredictable times during the processing of
> the document, making it more difficult to control the encodings at
> the right moments.
True. That is another advantage to having the solution
recorded in a standard place such as xltxtra.sty,
preferably with some comments about why it is useful.
Then it can be found, and the solution patched into
the code wherever other kinds of auxiliary files are
written and read back in.
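For instance, the package could carry something like the
following (a sketch only; the comments are mine, and no
released xltxtra actually contains this yet):

\AtEndDocument{%
  % \enddocument re-reads the .aux file shortly after expanding
  % this hook; reset the default encoding first, so that utf-8
  % bytes written under some other \XeTeXdefaultencoding are not
  % re-interpreted (and hence doubly encoded) on the way to the .toc
  \XeTeXdefaultencoding "utf-8"%
}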
> In general, moving to an entirely utf-8 environment is a better and
> more robust way forward.
True again, for new documents.
It is still desirable, though, to provide solutions that cope
with the technicalities that arise in other situations.
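For example, since \XeTeXdefaultencoding affects only files
opened after it is issued, a mostly utf-8 document can confine
the legacy encoding to just the file that needs it
(a sketch; legacy-chapter.tex is a made-up name):

\XeTeXdefaultencoding "cp1252"  % applies to files opened from here on
\input{legacy-chapter}          % this one file is read as cp1252
\XeTeXdefaultencoding "utf-8"   % restore before any auxiliary file is opened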
>
> HTH,
>
> Jonathan
All the best,
Ross
------------------------------------------------------------------------
Ross Moore                                ross at maths.mq.edu.au
Mathematics Department                    office: E7A-419
Macquarie University                      tel: +61 (0)2 9850 8955
Sydney, Australia  2109                   fax: +61 (0)2 9850 8114
------------------------------------------------------------------------