[XeTeX] babel

Zdenek Wagner zdenek.wagner at gmail.com
Wed Mar 23 22:29:51 CET 2016


Hi Javier,

I am copying my reply to the cstex list because I am not authoritative for
Slovak and may not be precise enough. My comments concern Czech (cs.ini),
Slovak (sk.ini), and Hindi (hi.ini). Some comments apply to all three.

I do not understand the meaning of the encoding field. T1 and OT1 are font
encodings for use with 8-bit TeX; XeTeX is able to use UTF-8 or UTF-16, and
such fonts are available. IL2 (in Czech) was historically used in cslatex.
It is preserved for legacy documents but is deprecated and unsupported in
babel, so it should be deleted. I know nothing about LY1. Before Unicode
there existed many private encodings for Devanagari; many web pages used
them and it was necessary to install a special font. Such fonts can still
be found, but IMO there is no point in supporting them.

I understand hyphenchar (should be the same as in English in all mentioned
languages) but do not understand the other hyphen* fields.

The minus sign in both Czech and Slovak should be –

The quotes in both Czech and Slovak are „ and “ (the closing quote has its
codepoint in Unicode but is rarely present in fonts; it is better to use
the English opening quote, which has the same shape).
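
As a quick cross-check of the characters mentioned above, here is a minimal
Python sketch that looks them up in the Unicode database. The dictionary
labels and the helper name `describe` are illustrative only; they are not
babel ini keys.

```python
import unicodedata

# Characters quoted in the message for Czech and Slovak typography.
# The labels are illustrative, not names taken from any ini file.
CS_SK_MARKS = {
    "minus sign as given in the message": "\u2013",
    "opening quote": "\u201e",
    "closing quote": "\u201c",
}


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for label, ch in CS_SK_MARKS.items():
    print(f"{label}: {ch}  {describe(ch)}")
```

The closing quote resolves to LEFT DOUBLE QUOTATION MARK, i.e. the same
character that English uses as its opening quote.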

In Czech (and maybe also in Slovak) the time separator is a period; in
sports results and timetables a colon is used.

Slovak: the characters Ä Ď Ô Ť in the index look strange to me; this should
be verified by a native Slovak speaker.

Hindi
=====

See the note on the encoding above

A few misprints and missing items in the captions
bib = संदर्भ-ग्रन्थ (or संदर्भ-ग्रंथ)
contents - the version you have is one of the alternatives suggested by
Anshuman Pandey but most books I have bought in India contain अनुक्रम
part = खण्ड (or खंड)
page = पृष्ठ
proof = प्रमाण
glossary = शब्दार्थ सूची

cc, encl, and headto make no sense, I am probably the only man who writes
business e-mails in Hindi...

I have never seen abbreviated months (a native Hindi speaker should help).
The only abbreviations for days of week I have seen at the Aligarh railway
station are:
Monday = सो॰, Tuesday = मं॰, Wednesday = बु॰, Thursday = बृह॰, Friday =
शुक॰ (or शुक्र॰, the plate was not clearly readable), Saturday = शनि॰,
Sunday = रवि॰. I would not be surprised if the ॰ punctuation were omitted.
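
The day abbreviations above can be kept as stems plus an optional
abbreviation sign, so that both variants (with and without ॰) come from one
table. This is only a sketch in Python: the names `DAY_STEMS` and
`abbreviate` are hypothetical, and the Friday stem follows the first reading
given above (the alternative is शुक्र).

```python
# DEVANAGARI ABBREVIATION SIGN, the small circle written after each stem.
ABBREVIATION_SIGN = "\u0970"

# Stems as read from the station plate described in the message.
DAY_STEMS = {
    "Monday": "सो",
    "Tuesday": "मं",
    "Wednesday": "बु",
    "Thursday": "बृह",
    "Friday": "शुक",  # possibly शुक्र; the plate was not clearly readable
    "Saturday": "शनि",
    "Sunday": "रवि",
}


def abbreviate(day: str, with_sign: bool = True) -> str:
    """Return the abbreviated day name, optionally without the sign."""
    stem = DAY_STEMS[day]
    return stem + ABBREVIATION_SIGN if with_sign else stem


for day in DAY_STEMS:
    print(day, abbreviate(day), abbreviate(day, with_sign=False))
```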

[characters] ङ and ञ are not used in Hindi; they should be removed from the
index

frenchspacing – I am afraid that it makes no sense in Hindi or in other
Indic languages. The proper spacing was implemented in GNU Freefont (at
least for Hindi) and is activated automatically by language switching. The
rules are explained (in Hindi only; the links to other languages lead to a
different text) at
https://hi.wikipedia.org/wiki/%E0%A4%B5%E0%A4%BF%E0%A4%95%E0%A4%BF%E0%A4%AA%E0%A5%80%E0%A4%A1%E0%A4%BF%E0%A4%AF%E0%A4%BE:%E0%A4%B9%E0%A4%BF%E0%A4%A8%E0%A5%8D%E0%A4%A6%E0%A5%80_%E0%A4%AE%E0%A5%87%E0%A4%82_%E0%A4%B8%E0%A4%BE%E0%A4%AE%E0%A4%BE%E0%A4%A8%E0%A5%8D%E0%A4%AF_%E0%A4%97%E0%A4%B2%E0%A4%A4%E0%A4%BF%E0%A4%AF%E0%A4%BE%E0%A4%81

punctuation: danda । and double danda ॥ should be listed as the most
important punctuation
quotes: either English double quotes or English single quotes are used
(depends on the preference of an author and/or a publisher)

number: Both Devanagari and Arabic digits are used; it is hard to say which
should be the default
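
Since either digit set may be wanted, the choice can be left as a switch.
A minimal Python sketch, assuming a plain one-to-one mapping between the
digits 0-9 and the Devanagari digits U+0966..U+096F; the function name
`format_number` and its `devanagari` flag are hypothetical and do not
correspond to any babel option.

```python
ARABIC_DIGITS = "0123456789"
# Devanagari digits occupy U+0966 (zero) through U+096F (nine).
DEVANAGARI_DIGITS = "".join(chr(0x0966 + i) for i in range(10))

TO_DEVANAGARI = str.maketrans(ARABIC_DIGITS, DEVANAGARI_DIGITS)
TO_ARABIC = str.maketrans(DEVANAGARI_DIGITS, ARABIC_DIGITS)


def format_number(n: int, devanagari: bool = True) -> str:
    """Render an integer with Devanagari or Arabic digits."""
    text = str(n)
    return text.translate(TO_DEVANAGARI) if devanagari else text


print(format_number(2016))
print(format_number(2016, devanagari=False))
```

Keeping both translation tables makes it possible to switch the default
later without touching the callers.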

counters: the way list items are numbered does not conform to the LaTeX
system. I have a normative document describing how it should be done; it is
written in Marathi, and I probably also have a Hindi version. Unfortunately,
I have not found time to implement it so far.



Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz

2016-03-23 19:31 GMT+01:00 Javier Bezos <listas at tex-tipografia.com>:

> Hi all,
>
> I'm working on a new version of babel, with a new way to define
> languages in a descriptive way, more than in a programmatic one (of
> course, the latter won't be excluded because it's still necessary).
>
> The idea is to create a set of ini file like those you can find on
>
>
> https://latex-project.org/svnroot/latex2e-public/trunk/required/babel/locales/
>
> They are tentative and some of them are incomplete. I'm working on the
> code to read and 'transform' their data, but in the meanwhile I'd like
> to improve the ini files. The first step in the roadmap is to provide
> real utf-8 strings for captions and dates with current styles so
> that they can be useable even without fontenc.
>
> Any help or comments would be greatly appreciated.
>
> [Crossposted to xetex and luatex lists.]
>
> Javier
>
>