[XeTeX] Re: XeTeX & Unicode vs. standard LaTeX
christopher ciotti
chris_ciotti at yahoo.com
Mon Oct 11 15:53:52 CEST 2004
On Oct 10, 2004, at 6:10 PM, Ross Moore wrote:
Hi -
Thanks for the info. My approach was meant to get me through one long
document rather than to work for every document :-)
I almost never have to set math or computer code, but I'll look into the
commands you describe in your message. Thanks again.
> This is already incorporated into utf8accents.sty, along with a lot
> more commands from the various other font-encoding (.enc) files.
>
> Just simply making \renewcommand definitions isn't really the right
> approach. It's fine for a single document using just one kind of font.
> But if you are mixing different fonts (e.g. because of mathematics,
> computer code, multiple languages, etc.) then you may need a
> high-level macro such as \textdollar to result in a different
> character depending upon the font being used in the particular
> context. (Is it a tfm-based CM or Euler, or an AAT or OTF font?)
>
>
> Thus you want the high-level definition to be done in such a way that
> the current \fontencoding is taken into account.
>
> LaTeX provides commands for this:
>   \DeclareTextCommand  \DeclareTextSymbol  \DeclareTextAccent
> and 'Default' versions:
>   \DeclareTextCommandDefault  \DeclareTextSymbolDefault
>   \DeclareTextAccentDefault
> as well as
>   \DeclareTextComposite and \DeclareTextCompositeCommand
> and
>   \DeclareTextFontCommand for defining font-switching macros.
>
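
For readers who have not used these declarations, here is a minimal
sketch of the encoding-aware approach. It is only an illustration:
\textsample is a made-up command name, and the bracketed replacement
texts are placeholders rather than real glyph assignments. The point is
that one command expands differently depending on the font encoding in
force where it is used.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}

% \textsample is a hypothetical name used only for this illustration.
% Fallback, used in any encoding that has no definition of its own:
\DeclareTextCommandDefault{\textsample}{[default]}
% Encoding-specific definitions, chosen by the current \fontencoding:
\DeclareTextCommand{\textsample}{OT1}{[OT1]}
\DeclareTextCommand{\textsample}{T1}{[T1]}

\begin{document}
% T1 is the document encoding here, so the T1 definition is used.
\textsample

% Switching the encoding locally selects the OT1 definition instead.
{\fontencoding{OT1}\selectfont \textsample}
\end{document}
```

A real definition would usually point at a glyph slot with
\DeclareTextSymbol rather than expand to placeholder text. A plain
\renewcommand, by contrast, would give the same result in both places.
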
>
> These are the commands that should be used, wherever possible.
> Alternatively study the innards of how these work, and mimic that.
>
> The latter is what is done in utf8accents.sty with its commands
>
> \DeclareUTFcharacter
> (for a Unicode version of \DeclareTextCharacter)
>
> and
>
> \DeclareEncodedCompositeCharacter
> \DeclareEncodedCompositeAccents
>
> for handling accents and other composite-pair constructions.
>
>
> Thus many issues of backwards-compatibility with existing (La)TeX
> practices are solved for XeTeX simply by loading utf8accents.sty.
>
> As there have been quite a few requests for this lately,
> here it is again (in version v0.4).
>
> <utf8accents.sty>
>
>
>
> However, utf8accents.sty doesn't solve the ligature problems,
> which are of a quite different character (sic).
> That's why the following is such great news ...
>
>>> However, we obviously cannot expect mainstream font vendors to add
>>> support for TeX's unique keying conventions to their font tables.
>>> Therefore, I have just implemented a "font mapping" scheme (this was
>>> first suggested on the XeTeX list by Ross Moore, IIRC), which allows
>>> an arbitrary mapping of Unicode character sequences to be associated
>>> with a particular font. So having defined a mapping "tex-text" that
>>> includes entries such as:
>>>
>>> U+002D U+002D > U+2013 ; endash
>>> U+002D U+002D U+002D > U+2014 ; emdash
>>> U+0060 U+0060 > U+201C ; opening double quote
>>> ; etc....
>>>
>>> I can then load a font with a command like
>>>
>>> \font\pal = "Palatino:mapping=tex-text" at 12pt
>>>
>>> and whenever this font is used, XeTeX will pass the Unicode
>>> character sequence to be typeset (at the lowest level, after all
>>> macro expansion, etc.) through this mapping, and the standard TeX
>>> ligatures will work as expected.
>>>
>>> This was just implemented on Friday, and seems to be working well.
>>> It will be present in the next release of XeTeX (along with that
>>> OpenType ligature bug-fix, and perhaps another feature or two). Stay
>>> tuned! :-)
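
A minimal test file for the mapping described above might look like
this. It assumes a XeTeX release that includes the feature, an
installed Palatino, and a "tex-text" mapping that XeTeX can find. The
\font line is copied from the message; the sample text is made up.

```latex
\documentclass{article}
\begin{document}
% Font request copied from the quoted message. Assumes Palatino is
% installed and that the tex-text mapping is available to XeTeX.
\font\pal = "Palatino:mapping=tex-text" at 12pt

% With the mapping active, the usual TeX input conventions should be
% turned into the Unicode characters listed in the mapping:
%   ``   opening double quote (U+201C)
%   --   en dash (U+2013)
%   ---  em dash (U+2014)
{\pal ``Quoted text, pages 1--10, and a break---like this.}
\end{document}
```

Loading the same font without ":mapping=tex-text" and comparing the two
outputs is a quick way to check whether the mapping is being applied.
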
>
>
> With this, and Will's new .fd files, and utf8accents.sty,
> we should be very close to having full backward compatibility
> with legacy LaTeX documents.
>
> By this I mean that it should be possible to apply a new selection
> of (Macintosh) fonts to old LaTeX documents, just by making
> minimal changes to which packages are loaded in the preamble.
>
> I'd urge everyone to try this with some of your old documents,
> and report back to the list on special cases that are not being
> processed correctly.
>
>
>>>> this sounds fantastic. Is this substitution scheme going to have a
>>>> syntax permitting the use of character ranges and maybe even
>>>> replacement patterns? So that one might be able to reorder
>>>> character positions saying something like (assuming syntax
>>>> resembling grep):
>>>>
>>>> ([U+0915-U+0939]) (U+0930) > \2\1
>>>>
>>>> I suppose one could spell out these substitutions for each case,
>>>> but it would save time...
>>>
>>> Yes. For more on the mapping engine (primarily focused on
>>> byte<->Unicode encoding conversion, but being used here to do
>>> transformations of a Unicode text stream), see:
>>>
>>> http://scripts.sil.org/teckit
>>>
>>> The software currently there is primarily for Windows, but I'll post
>>> OS X versions too.
>>>
>
> ... and this aspect should open up a whole new ball-game
> for handling transliterations.
>
>
>
>
> All the best,
>
> Ross
>
>
>
>>>
>>> Jonathan
>>>
>>> _______________________________________________
>>> XeTeX mailing list
>>> postmaster at tug.org
>>> http://tug.org/mailman/listinfo/xetex
>>>
>>>
> ------------------------------------------------------------------------
> Ross Moore                                  ross at maths.mq.edu.au
> Mathematics Department                      office: E7A-419
> Macquarie University                        tel: +61 +2 9850 8955
> Sydney, Australia                           fax: +61 +2 9850 8114
> ------------------------------------------------------------------------
--
chris ciotti <chris_ciotti at yahoo.com>
http://www.keyserver.net/en/
Key ID: 0x0BD2B97A