[XeTeX] Unicode space characters
maxwell
maxwell at umiacs.umd.edu
Wed Feb 3 21:22:40 CET 2010
I have been inserting zero width space characters (U+200B) into text at
punctuation marks where I wanted an optional break. (Examples: "foo/bar"
or "preposition+pronoun"--and no, the author does not want a real ASCII
space character there.) To my surprise, this did not make any difference
in the rendering in XeLaTeX; I still get an "overfull box."
It turns out there was a thread here just under a year ago with the same
subject line ("Unicode space characters"), where Jonathan Kew clarified
that "XeTeX has no special built-in knowledge about U+00A0 or the various
other Unicode space-like characters; it will simply 'print' them in
the current font." Which explains my problem. Tomáš Janoušek was going
to make a package to handle those appropriately. AFAIK, the package does
not yet exist (I'd be happy to find out I'm wrong).
In the meantime, is there a snippet of XeTeX code that I can insert into
my preamble that makes U+200B act as an optional line break? (I guess
U+0082 would also work, but I'm already using ZWSP.) The thread I
mentioned may answer this (see in particular the message at
http://www.tug.org/pipermail/xetex/2009-March/012480.html), but if it does,
I'm afraid I'm not enough of a techie to understand it. The original '\z@skip'
in that message got interpreted as an email address and turned into '\z at
skip'. That I fixed (no guarantees this won't get munged as well!), giving
----------
\catcode`^^^^200b=\active
\def^^^^200b{\hskip\z@skip}
----------
which I copied into my preamble. But xelatex still complains:
-----------
! Undefined control sequence.
​->\hskip \z
@skip
l.10664 ...nd \urdu{تو} /tū/; these pronoun+​
postposition
-----------
so apparently I'm still doing something wrong.
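A possible explanation, offered as a sketch rather than a tested fix: the
log shows TeX stopping right after \z and leaving "@skip" as ordinary
characters, which is what happens when @ is not a letter at the point of the
definition. Assuming the definition sits in an ordinary document preamble,
wrapping it in \makeatletter ... \makeatother should let it go through
(writing \hskip 0pt instead would avoid the @ name altogether):
----------
% Sketch only; assumes XeLaTeX and that U+200B is never needed as a glyph.
\makeatletter                  % let @ be part of control sequence names
\catcode`^^^^200b=\active      % make U+200B an active character
\def^^^^200b{\hskip\z@skip}    % zero-width glue: a legal break point, prints nothing
\makeatother                   % restore the usual meaning of @
----------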
BTW, I can't insert \discretionary in my input text in place of the
U+200B, because my texts are actually in XML, and get converted to xetex
using dblatex (and I don't want to put LaTeX commands in my XML, since
that's only one possible way of rendering the XML). I guess I could
convert ZWSP to \discretionary during the conversion from XML, but it seems
like defining the char in the preamble would be cleaner. Whatever way I do
this, I guess the ZWSP itself should not be preserved, since a given font
may not have a (null) glyph assigned to it.
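If the break should be a penalty or a discretionary rather than glue, the
same active-character approach could carry a different body. A sketch under
the same assumptions (XeLaTeX, definitions placed in the preamble); which
body suits a given text is a judgment call:
----------
\catcode`^^^^200b=\active
% Pick one body. Neither prints a glyph, so the font is never asked for U+200B.
\def^^^^200b{\allowbreak}            % LaTeX's \penalty0: a break allowed at no cost
%\def^^^^200b{\discretionary{}{}{}}  % empty discretionary: a break with no hyphen
----------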
Mike Maxwell