[XeTeX] Re: Unicode/font mixing
Jonathan Kew
jonathan_kew at sil.org
Tue Jan 25 12:44:16 CET 2005
On 25 Jan 2005, at 11:08 am, Simon Spiegel wrote:
> Thank you for your answer. Maybe this is a naive question, but how are
> people supposed to deal with this situation? I thought one of the main
> points of Unicode is that you don't have to change encoding all the
> time.
You don't have to change *encoding*; but you may still need to change
*font*, as there is no font that supports every character in Unicode.
> If I have to manually change the font every time I use a
> non-Latin glyph, I don't see much advantage?
Yes, for non-Latin characters, you may well need to change fonts
(depending on your choice of typefaces). Some typefaces may cover
several scripts (e.g., Latin/Greek/Cyrillic), so if you have a document
that mixes these scripts, it may be appropriate to choose such a
typeface.
The "advantage" of Unicode isn't so much that it lets you forget about
changing fonts as that it means your actual text data can be in a
standard, documented, interoperable encoding, rather than in a mixture
of more- or less-well-understood legacy encodings and/or special
control sequences. In such legacy data, a given byte value may mean one
thing in one word and an entirely different thing in another, so the
data cannot be reliably understood without knowledge of the specific
fonts used to render it. With Unicode, the meaning of the characters is
unambiguous, even in the absence of any font that can render them!
> Or do I just have to restrict myself to the few fonts which support
> a great many Unicode glyphs if I want XeLaTeX to behave this way? One of
> the reasons I'm asking this is because BibDesk is getting Unicode
> support, which would allow me to enter non-Latin characters directly
> into BibTeX files (BibTeX is kind of encoding agnostic). How are other
> people handling this situation?
At the moment, I typically mark fragments in non-Latin languages using
shorthand commands created to suit the occasion; e.g., {\ar some Arabic
text} or {\dn Devanagari}. But I'm not a LaTeX user (much), nor a
BibDesk user (at all), so I'm not the one to comment on the best
approach there.
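A minimal sketch of such shorthand commands in plain XeTeX might look
like the following. The font names and sizes are only assumptions
(substitute whatever fonts are installed on your system), and \ar and
\dn are simply names chosen for the occasion, not anything predefined:

```tex
% Plain XeTeX sketch. The font names below are assumptions; replace
% them with fonts that are actually installed on your system.
\font\latinfont="Times New Roman" at 11pt
\font\arfont="Geeza Pro" at 11pt
\font\dnfont="Devanagari MT" at 11pt

% Shorthand font switches, meant to be used inside a group so that the
% font change ends at the closing brace: {\ar ...} or {\dn ...}
\def\ar{\arfont}
\def\dn{\dnfont}

\latinfont
Some Latin text, then {\ar العربية} and {\dn देवनागरी}, then Latin again.

% Note: this sketch only switches fonts. It does nothing about text
% direction for longer right-to-left runs.
\bye
```

Because each switch is used inside braces, the surrounding Latin font
is restored automatically when the group ends.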
There have been requests in the past for a way to, in effect, declare
several "current fonts" each covering a different Unicode range, so
that mixed-script text wouldn't require explicit font changes. This is
an interesting possibility, but coming up with a design that would
reliably do "the right thing", especially with characters such as
punctuation or numerals that may be "shared" between scripts, is not a
trivial thing.
JK