[XeTeX] anti-xunicode ;-)
Ralf Stubner
ralf.stubner at physik.uni-erlangen.de
Tue Jul 25 00:45:44 CEST 2006
Hi Ross,
Ross Moore <ross at ics.mq.edu.au> writes:
>> uses the precomposed form in all three cases, even though Gentium does
>> not contain any smart features like mark or ccmp. ('mark' wouldn't help
>> here much anyway).
>
> This is a property of the font, surely,
> so cannot be relied upon in general.
I think the only font property involved here is that the precomposed
glyph exists and is advertised via the cmap table. Gentium does not
contain any smart rendering features. I can't tell if that works when
fonts are accessed via ATSUI, though.
>> I am not sure, but maybe this means that many of the
>> '\DeclareUTFcomposite' in xunicode.sty are not necessary.
>
> I don't see this as a consequence at all.
> This is the mechanism whereby LaTeX is told that a particular
> combination of accent and base-letter is available as a
> single character, within the declared encoding.
I am just not sure if LaTeX needs to know that a certain accented
character is available, since XeTeX seems to find that character when
presented with suitable decomposed input.
>> BTW, XeTeX also seems to decompose characters while looking for the best
>> way to render them. Example: The current development version of FPL Neu
>> does not contain an Obreve U+014E. It does contain a combining breve
>> U+0306 with suitable 'mark' features, though. This is correctly used
>> regardless of whether I input Obreve as precomposed character or in
>> decomposed form.
>
> Again, a property of the font surely?
It is a property of the font that <O><brevecomb> is rendered correctly
even though there is no <Obreve> in the font. It is a property of XeTeX
that this decomposed form is tried even though the input contains
precomposed characters.
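
For illustration only, the relation between the two input forms can be
checked outside of TeX. The following is a minimal sketch in Python (an
assumption of this example, not something XeTeX uses), relying only on the
standard unicodedata module. It shows that U+014E canonically decomposes
to <O><brevecomb>, i.e. U+004F U+0306, and that the decomposed form
composes back. It says nothing about what a given font or XeTeX does with
either form.

```python
import unicodedata

precomposed = "\u014E"   # LATIN CAPITAL LETTER O WITH BREVE (Obreve)
decomposed = "O\u0306"   # O followed by COMBINING BREVE

# Canonical decomposition (NFD) and canonical composition (NFC)
nfd = unicodedata.normalize("NFD", precomposed)
nfc = unicodedata.normalize("NFC", decomposed)

print([f"U+{ord(c):04X}" for c in nfd])
print([f"U+{ord(c):04X}" for c in nfc])
```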
>> a) look for a composed form
>> b) look for a matching 'ccmp' feature
>> c) look for an applicable 'mark' (or 'mkmk') feature
>> d) some fallback
>
> Sure. But one needs a way of achieving this within the context
> of multiple fonts in the same document, perhaps with multiple
> different encodings, and only wanting the fallback to be used
> in some (but not all) situations.
I think we are discussing different topics here. You are explaining what
is possible *now*, which is very valuable, while I am brainstorming
about possible extensions. Right now we have to make
characters active in order to provide suitable fallbacks. I see at
least two problems with this approach, though:
* With XeTeX one can test for a glyph for the precomposed character. One
can also test for a combining accent. One cannot test for the
existence of suitable smart font features that make the latter work
properly, though. Imagine the case of U+1E0B ḋ together with a font
that contains a combining dot accent but no prebuilt ḋ. The ideal
rendering for this character might have the dot to the left of the
ascender of the d. Such behaviour can be implemented via a 'mark'
feature, but we can't be sure the font provides one. Just letting XeTeX
render <d><dotcomb> might also produce results where TeX's fallback of
centering the dot above the <d> would be preferable.
* It is not extensible to characters without a precomposed form in
Unicode. For example, the E with dot below and acute above mentioned
in this thread does not exist in Unicode in precomposed form. One has
to input it via combining accents. And I currently don't see any way
to define fallback operations on the TeX-level for combining accents.
I think this is the place where an additional hook would be useful,
but that requires an extension of XeTeX.
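
The difference between the two cases in the list above can be made
concrete with a small sketch, again in Python with only the standard
unicodedata module (the helper name nfc_codepoints is made up for this
example). It assumes the capital letter E for the second case, since the
case of the letter is not stated here. Composing <d><dotcomb> yields a
single code point, whereas E with dot below and acute above still needs
a combining accent after composition.

```python
import unicodedata

def nfc_codepoints(text):
    """Return the code points of the NFC (composed) form of text."""
    composed = unicodedata.normalize("NFC", text)
    return [f"U+{ord(c):04X}" for c in composed]

# d + COMBINING DOT ABOVE: a precomposed form exists in Unicode
d_dot = nfc_codepoints("d\u0307")

# E + COMBINING DOT BELOW + COMBINING ACUTE ACCENT (capital E is an
# assumption of this example): no single precomposed code point exists
e_dot_acute = nfc_codepoints("E\u0323\u0301")

print(d_dot)
print(e_dot_acute)
```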
The TeX coding for such a fallback would probably be along the lines
you explained. I just think that such a fallback should be used as late
as possible.
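
The lookup order a) to d) quoted earlier, with the fallback tried last,
could be sketched roughly as follows. This is a toy model in Python and
not XeTeX's implementation: the ToyFont class, its three sets, and the
function choose_rendering are all hypothetical names invented for this
example, and 'mkmk' is left out for brevity.

```python
import unicodedata
from dataclasses import dataclass, field

@dataclass
class ToyFont:
    """Hypothetical stand-in for a font; not a real font API."""
    cmap: set = field(default_factory=set)  # characters that have a glyph
    ccmp: set = field(default_factory=set)  # sequences 'ccmp' can substitute
    mark: set = field(default_factory=set)  # (base, accent) pairs 'mark' can position

def choose_rendering(text, font):
    """Pick a rendering strategy, trying the fallback as late as possible."""
    composed = unicodedata.normalize("NFC", text)
    decomposed = unicodedata.normalize("NFD", text)
    # a) a precomposed glyph advertised via cmap
    if len(composed) == 1 and composed in font.cmap:
        return "precomposed"
    # b) and c) only make sense if every piece has a glyph
    if all(c in font.cmap for c in decomposed):
        # b) a matching 'ccmp' substitution
        if decomposed in font.ccmp:
            return "ccmp"
        # c) 'mark' positioning for every accent on the base
        base, accents = decomposed[0], decomposed[1:]
        if accents and all((base, a) in font.mark for a in accents):
            return "mark"
    # d) some fallback, e.g. TeX's accent placement
    return "fallback"

# Toy fonts with made-up contents, for illustration only
with_glyph = ToyFont(cmap={"\u1E0B", "d", "\u0307"})
with_mark = ToyFont(cmap={"O", "\u0306"}, mark={("O", "\u0306")})
bare = ToyFont(cmap={"d", "\u0307"})

print(choose_rendering("d\u0307", with_glyph))
print(choose_rendering("\u014E", with_mark))
print(choose_rendering("\u1E0B", bare))
```

In this model the input form does not matter: both precomposed and
decomposed input are normalised before the font is consulted, which
mirrors the behaviour described above for Gentium and FPL Neu.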
cheerio
ralf