[XeTeX] An (almost) complete cyrunicode.tex

Sat Jun 30 10:55:30 CEST 2007

On Saturday 30 June 2007 03:49, Nikola Lecic wrote:
>
> Delete "specifical" and put "nowadays in use in ...". The division I
> proposed is purely mechanical and could be generated by a computer.
> "Ы" is not used in Balkans today, so it is specific for the East
> Slavic group _today_; that says nothing about its history. This
> division's aim is to easen corrections and checking, not to expose
> the history of Cyrillic.

Even if your division can be generated by a computer (which is also
not so obvious to me), it surely requires the user to be an expert in
Slavic languages in order to understand where (s)he should look for a
particular character.

> > - Historical letters, needed to support the Russian and Bulgarian
> > old orthographies;
>
> Only these two?

Well, I mentioned Bulgarian just because it used one letter (namely
BIG YUS) which otherwise would be placed in another category. Of
course I can also add Serbian/Montenegrin, as it is covered by the
same set of characters (i.e. cp1251 + Russian pre-1918 letters). But
that's really all, because the other modern Slavic languages which use
the Cyrillic script got their contemporary alphabets together with
their identity; so, no old orthographies here.

> Well, here I can't see any logic :) Despite your own explanation of
> the origin of Cyrillic (which is well known to all us Cyrillic
> users), you argue that Russian+Slavic (isn't Russian Slavic?) is the
> most accurate division because (modern) Russian character set
> happened to be the first in Unicode.

This is because my classification is based mainly on purely technical
reasons, so that any historic/semantic considerations only come
second. If they help to clarify subdivisions based on Unicode and
legacy standards, they are good; otherwise they are evil. But Unicode
separates 32 Russian letters from other Slavic letters, and, whether
you like it or not (BTW, some Russian users also don't like the fact
that YO is encoded separately), we have to live with it. Otherwise
imagine, say, a German user who would argue that umlauts should be
listed among the ASCII Latin letters rather than in the separate
Latin-1 Supplement block...
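
To make the encoding point concrete, here is a minimal Python sketch. It
is only an illustration and not part of cyrunicode.tex; the helper name
in_basic_run is made up for this example. It assumes nothing beyond the
standard Unicode code points: the 32 capital letters sit in one
contiguous run at U+0410..U+042F, while YO (named "IO" in Unicode) sits
outside it at U+0401.

```python
import unicodedata

# The 32 basic Russian capitals form one contiguous run in Unicode.
BASIC_FIRST, BASIC_LAST = 0x0410, 0x042F
basic_upper = [chr(cp) for cp in range(BASIC_FIRST, BASIC_LAST + 1)]

# YO is encoded separately, before the basic run.
yo = "\u0401"
# BIG YUS, a historic letter, lies after the basic run.
big_yus = "\u046A"


def in_basic_run(ch):
    """Return True if ch lies in the contiguous U+0410..U+042F run."""
    return BASIC_FIRST <= ord(ch) <= BASIC_LAST


print(len(basic_upper))          # number of letters in the run
print(unicodedata.name(yo))      # official Unicode name of YO
print(in_basic_run(yo), in_basic_run(big_yus))
```

A classification that follows the code charts would therefore put YO
with the other additions, however odd that looks from the Russian side.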

> And as of this discussion itself... :) Maybe the better solution
> would be to have just Slavic/non-Slavic/Historic (out of use)
> letters.

Good point. I would just list Historic letters before non-Slavic
characters, because, after all, they are also Slavic, and even now
they are actually much more widely used (at least in TeX) than
any non-Slavic Cyrillic letters. BTW, that was the initial mistake of
the whole X2/T2 series, which tends to include every possible
non-Slavic character but provides no support for historic
Slavic orthographies.

-- 
Regards,
Alexey Kryukov <anagnost {at} yandex {dot} ru>

Moscow State University
Historical Faculty