[XeTeX] anti-xunicode ;-)
Ross Moore
ross at ics.mq.edu.au
Mon Jul 24 05:18:34 CEST 2006
Hi Ralf,
On 23/07/2006, at 10:21 PM, Ralf Stubner wrote:
> Ross Moore <ross at ics.mq.edu.au> writes:
>> xunicode.sty could (and perhaps should) be modified to also
>> include
>> a check for the precomposed glyph, then use the fall-back if not
>> found.
>
> I don't think this is necessary, since XeTeX actually does this
> checking already. For example, the eogonek in
>
> \documentclass[a4paper]{article}
> \usepackage{fontspec,xunicode}
> \setromanfont[Mapping=tex-text]{Gentium}
> \begin{document}
> \k{e}
> \UndeclareUTFcomposite[U]{x0119}{\k}{e}
> \k{e}
> e\char"0328
> \end{document}
>
> uses the precomposed form in all three cases, even though Gentium does
> not contain any smart features like mark or ccmp. ('mark' wouldn't
> help here much anyway).
This is a property of the font, surely,
so cannot be relied upon in general.
> I am not sure, but maybe this means that many of the
> '\DeclareUTFcomposite' in xunicode.sty are not necessary.
I don't see this as a consequence at all.
This is the mechanism whereby LaTeX is told that a particular
combination of accent and base-letter is available as a
single character, within the declared encoding.
>
> BTW, XeTeX also seems to decompose characters while looking for the
> best way to render them. Example: The current development version of
> FPL Neu does not contain an Obreve U+014E. It does contain a combining
> breve U+0306 with suitable 'mark' features, though. This is correctly
> used regardless of whether I input Obreve as precomposed character or
> in decomposed form.
Again, a property of the font surely?
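One way to see what a given font offers is to probe it directly with \iffontchar. The following is only a sketch: \probeglyph is a hypothetical helper name, not part of xunicode or fontspec, and the code points are the ones mentioned in this thread. Note that \iffontchar only reports whether the font has a glyph for a code point; it says nothing about 'ccmp' or 'mark' features.

```latex
\documentclass{article}
\usepackage{fontspec}
% Optional: uncomment to probe a specific installed font instead of the default.
% \setromanfont{Gentium}

% Hypothetical helper: print whether the current font has a glyph
% for the code point given as hexadecimal digits.
\newcommand*\probeglyph[1]{%
  U+#1: \iffontchar\font"#1 present\else missing\fi\par}

\begin{document}
\probeglyph{0119}% eogonek, precomposed
\probeglyph{0328}% combining ogonek
\probeglyph{014E}% Obreve, precomposed
\probeglyph{0306}% combining breve
\end{document}
```

Running this once per font under XeLaTeX shows which of the forms each font can supply by itself.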
>> The above examples show that XeLaTeX already has the ability to do
>> what the OP requested; namely to have a sequence of fall-backs
>> available for accented characters, according to what is available
>> in the font, or font-encoding.
>
> I am not sure. I think the OP wanted to use Unicode input, not TeX
> commands. Adam has explained the different possibilities that exist
> when
> one starts from decomposed input, which can be assumed without loss of
> generality:
>
> a) look for a composed form
> b) look for a matching 'ccmp' feature
> c) look for an applicable 'mark' (or 'mkmk') feature
> d) some fallback
Sure. But one needs a way of achieving this within the context
of multiple fonts in the same document, perhaps with multiple
different encodings, and only wanting the fallback to be used
in some (but not all) situations.
Thus with declarations such as:
\catcode `ḍ = \active
\DeclareRobustCommand{ḍ}{%
\iffontchar\font"1E0D\char"1E0D%
\else
... fall back expansion ...
\fi}
the problem is to define the "fallback expansion" appropriately.
I'm asserting that, with appropriate commands in the header
to support a (pseudo-)encoding 'UX' say, then this should
be something like:
{\changeencoding{UX}\d{d}}
to temporarily (note the extra braces) change the encoding,
so as to make use of TeX's default handling of accents by
a box-like construction.
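For readers unfamiliar with that box-like construction, here is a minimal sketch of the idea, using the plain TeX \oalign and \hidewidth macros that the LaTeX kernel also provides. The name \dotunder is hypothetical and chosen only for this illustration; the real OT1 definition of \d differs in detail.

```latex
\documentclass{article}

% Hypothetical macro: stack a full stop underneath the argument.
% \oalign builds a small \vtop alignment; \hidewidth centres the dot
% without letting it affect the width of the base letter.
\newcommand*\dotunder[1]{{\oalign{#1\crcr\hidewidth.\hidewidth}}}

\begin{document}
\dotunder{d} compared with the kernel's \d{d}
\end{document}
```

Because the dot is an ordinary full stop taken from the current font, this works even when the font has neither the precomposed glyph nor a combining dot below.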
Perhaps better is to first check for a combining character,
and use this --- if it actually produces acceptable results:
\iffontchar\font"0323\relax d\char"0323\else
{\changeencoding{UX}\d{d}}\fi
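The successive checks can be combined into a single macro. This is only a sketch under stated assumptions: \glyphorfallback is a hypothetical name, U+0323 is taken to be the combining dot below, and the last argument here is a plain \d{d} as a stand-in; with the header declarations from this message in place you would pass {\changeencoding{UX}\d{d}} instead.

```latex
\documentclass{article}
\usepackage{fontspec}

% Hypothetical helper:
%   #1 = hex code of the precomposed character
%   #2 = base letter
%   #3 = hex code of the combining character
%   #4 = fall-back material, used if the font has neither glyph
\DeclareRobustCommand*\glyphorfallback[4]{%
  \iffontchar\font"#1 %
    \char"#1 %
  \else
    \iffontchar\font"#3 %
      #2\char"#3 %
    \else
      #4%
    \fi
  \fi}

\begin{document}
\glyphorfallback{1E0D}{d}{0323}{\d{d}}
\end{document}
```

The space after each hexadecimal number ends the number explicitly, so the digits cannot run into whatever follows. Whether the combining-character branch looks acceptable still depends on the font, so you may prefer to drop that branch for particular fonts.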
Alternatively, you might wish to force the fall-back method
with a particular character in a particular font; e.g.,
where ḍ *is* available, but you don't like its appearance.
Here's how to do it, without upsetting how ḍ is handled
as a normal character with other fonts in the same document.
{\catcode `ḍ = \active
\newcommand{ḍ}{{\changeencoding{UX}\d{d}}}
\global\let\composeddot ḍ
\gdef\activateddot{\catcode `ḍ = \active\let ḍ\composeddot}
}
Now whenever you switch to that particular font, you need
to also \activateddot ; e.g.
\newcommand\selectGentium{%
\setromanfont[Mapping=tex-text]{Gentium}%
\activateddot}
and only use \selectGentium within a grouping or environment,
so that the \catcode change remains contained.
For a robust version of this use instead:
\catcode `ḍ = \active
\DeclareRobustCommand{\ddotu}{{\changeencoding{UX}\d{d}}}
\def\activateddotu{\catcode `ḍ = \active\let ḍ\ddotu}
\catcode `ḍ = 12
so that \section{All about ḍ.}
will write out the aux-file string as: All about \ddotu .
Make sure that, when the T-of-C is constructed, there is an
expansion for \ddotu appropriate to the font being used there.
The extra header declarations required to support this technique
were in a previous email, but I repeat them here for completeness.
% Define a macro to change encoding flag, without requiring
% the support of an <encoding>enc.def file.
\makeatletter
\def\changeencoding#1{\def\cf@encoding{#1}}
\makeatother
% Declare an accent with UX encoding, to fall-back
% to using the OT1 (or T1) \add@accent method.
% Make sure this *precedes* loading any packages
% that use \DeclareTextAccent for the same accent;
% e.g. hyperref.sty for PD1-encoding.
% umlaut using the "00A8 character
\let\realaccent\"
\DeclareTextAccent{\"}{UX}{"00A8}
\let\"\realaccent
% ogonek using the "02DB character
\let\realaccent\k
\DeclareTextAccent{\k}{UX}{"02DB}
\let\k\realaccent
% dot-above using the "02D9 character
\let\realaccent\.
\DeclareTextAccent{\.}{UX}{"02D9}
\let\.\realaccent
% dot-under using the "002E character (full-stop)
% or we could use the combining-char at "0323
% for a lower, larger dot in some fonts
\let\realaccent\d
\DeclareTextAccent{\d}{UX}{"002E}
% this next line is because \accent is *not* used for this
\expandafter\let\csname UX\string\d\expandafter\endcsname\csname OT1\string\d\endcsname
\let\d\realaccent
% It seems to be immaterial whether these are loaded before or after:
\usepackage{fontspec}
\usepackage{xunicode}
> I have the impression that XeTeX implements a)-c), which is really
> great, since that way XeTeX tries to use everything that the font
> has to
> offer. The current fallback d) is very simple, though. The combining
> accent is just printed as it is in the font. Depending on the precise
> position of this accent, this may or may not look good.
TeX can do better than this, so why not let it?
> If the font
> doesn't even have combining accents (eg Minion Pro), the '.notdef'
> glyph
> is used.
>
> Here it would be useful if one were able to define a TeX command
> which is used if none of a)-c) has succeeded to find a suitable glyph.
> I think the OP tried to implement something like this, but at a too
> early stage, ie, before b) and c) have been tried, since AFAIK only a)
> can be done directly using TeX commands.
I think the above methods cater for all the possibilities,
and do so in the most compatible way with alternative uses
of the same characters in different fonts, within the same
LaTeX document.
>
> cheerio
> ralf
Hope this helps,
Ross
------------------------------------------------------------------------
Ross Moore ross at maths.mq.edu.au
Mathematics Department office: E7A-419
Macquarie University tel: +61 +2 9850 8955
Sydney, Australia 2109 fax: +61 +2 9850 8114
------------------------------------------------------------------------