[XeTeX] New Hyphenation for Phonemic Orthographies for English?
Jonathan Kew
jonathan_kew at sil.org
Fri Apr 20 00:00:57 CEST 2007
On 19 Apr 2007, at 8:04 pm, Kenneth Reid Beesley wrote:
>
> My current project sets Deseret Alphabet and International
> Phonetic Alphabet in parallel columns, and I need to define
> hyphenation for both. Too many long words are splaying over
> the right margins. The language is English, but written in
> two separate phonemic-alphabet orthographies. In the future,
> I might need hyphenation for the Shavian Alphabet as well.
>
> Q1: What's the best approach to defining possible hyphenation
> points for English words written in these orthographies? The
> Deseret Alphabet and Shavian characters lie in the supplementary
> Unicode space, so whatever scheme I adopt would need to handle
> supplementary characters.
>
> Q2: Is it possible to specify globally that the end of an
> em-dash is a possible hyphenation point in text?
I'm not sure if anyone has yet tried to define hyphenation patterns
for languages using supplementary-plane characters in XeTeX, but
offhand I can't think of any reason it shouldn't work. Note that
defining hyphenation patterns in general is a fairly technical job,
though. In any case, the first question would be what the rules are
supposed to be for hyphenation of English in these various
orthographies; is there any established practice to follow, or will
you be making up your own rules? Based on syllabic or morphemic
boundaries (or something else)?
Re Q2, there's a simple answer: just include the setting

    \XeTeXdashbreakstate = 1

in your document. This tells XeTeX to allow line-breaks after the
Unicode en-dash and em-dash characters, just as standard TeX normally
does with "--" and "---" ligatures.
(I just noticed that Will's "XeTeX reference guide" claims that this
parameter is set to 1 by default, but I don't think that's true, at
least in the standard TeX Live 2007 configuration.)
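For illustration, here is a minimal sketch of a complete document with that setting. It assumes you compile with xelatex and load fontspec with its default font; the document class, the narrow minipage, and the sample text are only there to make a break likely to show up, and none of them come from your project.

```tex
% Minimal sketch; compile with xelatex.
% Assumption: fontspec's default font supplies the Unicode dash glyphs.
\documentclass{article}
\usepackage{fontspec}

% Allow a line break after literal U+2013 (en-dash) and U+2014 (em-dash).
% Setting it explicitly avoids relying on whatever the default happens to be.
\XeTeXdashbreakstate=1

\begin{document}
% A narrow column, so that a break after a dash is likely to be needed.
\begin{minipage}{5cm}
The columns run in parallel—Deseret on the left—phonetic on the
right—and the long words between the dashes need somewhere to break.
\end{minipage}
\end{document}
```

Setting the parameter to 0 in the same file should let you compare the two behaviours directly.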
JK