[XeTeX] anti-xunicode ;-)
Adam Twardoch
list.adam at twardoch.com
Sun Jul 23 15:09:56 CEST 2006
Ralf Stubner wrote:
> I am not sure if it were legal to
> reorder such a decomposed sequence.
>
Unicode assigns different combining classes to different diacritical
marks, and prescribes a canonical order of marks. For example, the
canonical order for the Yoruba character we’ve been discussing is
\u0045\u0323\u0301 and not \u0045\u0301\u0323.
However, both sequences are canonically equivalent, and Unicode
recommends: "Rendering systems should handle any of the canonically
equivalent orders of combining marks."
(See Unicode sections 3.11 and 5.13, PDFs available from
http://www.unicode.org/versions/Unicode4.1.0/ ).
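To make the combining classes concrete, here is a minimal sketch using Python's standard unicodedata module. This is only an illustration and says nothing about what XeTeX does internally; the variable names are my own. It looks up the combining class of each mark and checks that both orders normalize (NFD) to the same canonical sequence.

```python
import unicodedata

DOT_BELOW = "\u0323"  # COMBINING DOT BELOW
ACUTE = "\u0301"      # COMBINING ACUTE ACCENT

canonical = "E" + DOT_BELOW + ACUTE  # \u0045\u0323\u0301
swapped = "E" + ACUTE + DOT_BELOW    # \u0045\u0301\u0323

# Canonical combining classes: marks below the base sort before marks above.
ccc_dot_below = unicodedata.combining(DOT_BELOW)
ccc_acute = unicodedata.combining(ACUTE)

# NFD applies canonical reordering, so both inputs end up identical.
nfd_of_swapped = unicodedata.normalize("NFD", swapped)
same_after_nfd = unicodedata.normalize("NFD", canonical) == nfd_of_swapped

print(ccc_dot_below, ccc_acute, same_after_nfd)
```

The dot below has the lower combining class, which is why it comes first in the canonical order.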
A rendering system can reorder marks to their canonical form or can just
attempt to render them as they are. Well-made fonts should not rely on
the application doing the reordering, and should have provisions for all
mark combinations necessary. Note that for marks that are either all
above or all below (in Unicode terms, typically marks with the same
combining class), the sequence in which they are typed is significant.
For example, E followed by acute followed by grave should be rendered so
that the grave is above the acute, but E followed by grave followed by
acute should be rendered so that the acute is above the grave. So these
two sequences are not canonically equivalent. But combinations of marks
that have
different locations, such as one below and one above, as in the example
I used, would be. \u0045\u0323\u0301 and \u0045\u0301\u0323 should be
processed and rendered the same.
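The contrast with two marks above can be checked the same way. Again a small sketch with Python's unicodedata module, with names chosen by me for illustration: acute and grave share a combining class, so normalization leaves their typed order alone and the two sequences stay distinct.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT
GRAVE = "\u0300"  # COMBINING GRAVE ACCENT

acute_then_grave = "E" + ACUTE + GRAVE
grave_then_acute = "E" + GRAVE + ACUTE

# Both marks sit above the base and have the same combining class.
same_class = unicodedata.combining(ACUTE) == unicodedata.combining(GRAVE)

# Marks of equal class are never swapped, so NFD keeps each order as typed.
nfd_a = unicodedata.normalize("NFD", acute_then_grave)
nfd_b = unicodedata.normalize("NFD", grave_then_acute)

print(same_class, nfd_a == nfd_b)
```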
Of course XeTeX would do well to perform canonical reordering of marks.
As I’ve written, *well-made* fonts should not rely on marks being
canonically ordered, but some fonts will only contain rendering rules
for canonically ordered marks. Canonical reordering surely would
minimize the risk of bad renderings.
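For what it is worth, the reordering step itself is small. Below is a rough sketch in Python of what I mean by canonical reordering; canonical_reorder is a hypothetical helper of my own, not anything that exists in XeTeX, and it assumes the input is already decomposed (it does no decomposition or composition). Each run of combining marks after a base character is sorted by combining class, and the sort is stable so that marks of equal class keep their typed order.

```python
import unicodedata


def canonical_reorder(text: str) -> str:
    """Stable-sort each run of combining marks by canonical combining class.

    Hypothetical helper for illustration; assumes already-decomposed input.
    """
    out = []
    run = []  # pending marks with a non-zero combining class
    for ch in text:
        if unicodedata.combining(ch) == 0:
            # A starter ends the current run: flush the marks in sorted order.
            # sorted() is stable, so equal-class marks are not swapped.
            out.extend(sorted(run, key=unicodedata.combining))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=unicodedata.combining))
    return "".join(out)


# Dot below moves in front of the acute; acute and grave stay as typed.
reordered = canonical_reorder("E\u0301\u0323")
unchanged = canonical_reorder("E\u0301\u0300")
```

A renderer that ran something like this before shaping would hand the font the same mark order whichever way the user typed a below/above pair.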
Regards,
Adam
--
Adam Twardoch
http://www.twardoch.com/