[XeTeX] help with hyphenation
Jonathan Kew
jonathan_kew at sil.org
Sat Feb 2 19:03:33 CET 2008
On 2 Feb 2008, at 5:45 pm, ashinpan at gmail.com wrote:
> Hi! all
>
> I have been using XeTeX as part of TeX Live 2007 on Ubuntu (Gutsy
> Gibbon). My documents are mainly in English but Pali and Sanskrit are
> often embedded in them.
>
> The problem I am facing is that some words are getting randomly
> hyphenated before line breaks. I tried changing the language to Welsh,
> for which I have no language file, to force manual hyphenation, but to
> no avail. I also tried removing the Babel package, but the problem persists.
>
> Below is the tex source that I use. I have also attached a PDF file
> that XeTeX produces on my machine from that source. In that PDF file,
> I see the following unnatural mid-line hyphenations, provided together
> with respective line numbers:
>
> design-ing (1)
> ope-rat-ing (2)
> pro-ducts (5)
> fly-ing (6)
> pos-sesses (8)
> under-stand (9)
> mak-ing (18)
> natu-ral (32)
>
> I hope someone can kindly help me out.
Looking at the TeX source, I see that these are present in the input
text as "soft hyphen" characters, U+00AD. Perhaps your editor is
inserting these automatically, and you don't see them on screen while
editing?
XeTeX doesn't inherently "know" anything special about the U+00AD
character, so each one is simply being printed as a glyph in the current font.
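If you want to confirm where the soft hyphens are, one possible diagnostic (a sketch of my own, not something XeTeX provides) is to temporarily make U+00AD active and print a visible marker in its place. The document class, the boxed "SHY" marker, and the sample words are assumptions for illustration; the soft hyphens are written as ^^ad here so that they show up in the source.

```latex
% Diagnostic sketch: compile with xelatex.
\documentclass{article}

% Make U+00AD an active character and print a boxed marker wherever
% one occurs. The marker is an arbitrary choice for this illustration.
\catcode"AD=\active
\def^^ad{\fbox{\tiny SHY}}

\begin{document}
% ^^ad stands for a literal U+00AD in the input text.
design^^ading natu^^adral pro^^adducts
\end{document}
```

Each marker in the output corresponds to one soft hyphen in the input.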
I think you want to do one of three possible things:
(a) remove the "soft hyphens" from the input text, and prevent your
editor inserting them; just rely on TeX's hyphenation patterns where
necessary
(b) if that's difficult, you could cause TeX to ignore them by
"defining them away":
\catcode"AD=\active \def^^ad{}
(c) if you want them to act as discretionary hyphens, overriding
whatever hyphenation points TeX might find automatically in those
words, then define them as TeX discretionaries:
\catcode"AD=\active \def^^ad{\-}
I might add (c) as a default definition to the xetex and xelatex
formats, as it seems like the most logical thing to do with U+00AD.
But in general, you shouldn't need these in your text at all.
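For concreteness, here is a minimal sketch of where the definition from (c) might go in a XeLaTeX document. The document class, the narrow \parbox, and the sample words are assumptions for illustration; the soft hyphens are written as ^^ad so that they are visible in the source, whereas in a real file they would be the literal U+00AD characters.

```latex
% Minimal sketch of option (c): compile with xelatex.
\documentclass{article}

% Make U+00AD an active character, then define it as a
% discretionary hyphen. The \catcode change must come first so
% that ^^ad in the \def is read as an active character.
\catcode"AD=\active
\def^^ad{\-}

\begin{document}
% A narrow box makes line breaks more likely. A hyphen is printed
% only if a line actually breaks at a ^^ad position.
\parbox{2cm}{design^^ading natu^^adral pro^^adducts}
\end{document}
```

For option (b), the same preamble lines apply with an empty replacement text, i.e. \def^^ad{} in place of \def^^ad{\-}.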
HTH,
JK