[XeTeX] Converting legacy encodings to utf-8
Firmicus
firmicus at ankabut.net
Wed Jul 12 21:27:33 CEST 2006
Will Robertson wrote:
> But perhaps it's too weighed down with Aleph assumptions/dependence.
> Note that I suspect a sort of equivalence between OCPs and TECkit
> mappings...
I'd be delighted if someone could confirm that! Up to now I have only
written a few very simple TECkit mappings, and my initial impression was
that TECkit's functionality is not as ambitious as that of Omega
translation processes (OTPs). But perhaps I should just read the TECkit
documentation... ;-)
Last year I wrote a set of OTPs to convert ArabTeX input to UTF-8,
admittedly not a simple task. The results were not yet perfect, but
pretty decent. Since then I have more or less abandoned Aleph/Omega, at
least for my own practical purposes: too many bugs and headaches.
Now if it indeed turns out that TECkit provides functionality equivalent
to that of OTPs, I would be willing to rewrite the ArabTeX -> UTF-8
conversion as TECkit mappings for the benefit of XeTeX's users. (Despite the
availability of Unicode bidi editors nowadays, there are still
compelling reasons why one -- in particular linguists, orientalists, or
historians of science like myself -- would prefer to input a language
such as Arabic by means of an intelligent ASCII encoding convention. But
this is another story.)
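To give an idea of what I have in mind, such a mapping would contain rules of
roughly the following shape. This is only an untested sketch: the ArabTeX
transliterations are quoted from memory, and a real mapping would of course
need many more rules (vowels, hamza carriers, contextual cases, etc.):

    ; illustrative fragment only -- not a working ArabTeX mapping
    EncodingName    "ArabTeX-Arabic-sketch"
    DescriptiveName "ArabTeX transliteration to Unicode Arabic (sketch)"

    pass(Byte_Unicode)

    ; two-character transliterations
    "_t"  <>  U+062B    ; thaa'
    ".h"  <>  U+062D    ; Haa'
    "^s"  <>  U+0634    ; shiin

    ; single letters
    "b"   <>  U+0628    ; baa'
    "t"   <>  U+062A    ; taa'
    "s"   <>  U+0633    ; siin

If I understand the TECkit documentation correctly, the <> rules would even
allow such a mapping to be applied in reverse, which the OTPs could not do.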
BTW I have here a quick-and-dirty Perl script for converting traditional
LaTeX input to UTF-8 and covering about 650 glyphs. It is based on the
data in the utf2any program, except of course that the conversion is
done in reverse. I can provide it to anyone who might be interested. I
guess such a tool, once extended and improved, could be "shipped" along
with XeTeX eventually.
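To make concrete what kind of conversion is involved: the script essentially
applies a long list of substitutions like the following (shown here, purely
for illustration and untested, as TECkit-style rules rather than as the
actual Perl code; these are three entries out of roughly 650):

    ; illustrative only
    EncodingName    "LaTeX-legacy-to-UTF8-sketch"
    DescriptiveName "Traditional LaTeX input to Unicode (sketch)"

    pass(Byte_Unicode)

    ; the byte values spell out \'e, \"o and \ss
    0x5C 0x27 0x65   >  U+00E9    ; \'e -> e with acute
    0x5C 0x22 0x6F   >  U+00F6    ; \"o -> o with diaeresis
    0x5C 0x73 0x73   >  U+00DF    ; \ss -> sharp s (a real rule would need a boundary check)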
Regards,
François Charette