Line lengths with polyglossia in Cantonese and Japanese?

Ken Moffat zarniwhoop at ntlworld.com
Fri May 31 21:22:15 CEST 2024


On Thu, May 30, 2024 at 11:52:39PM -0600, Max Chernoff wrote:
> Hi Ken,
> 
> (unrelated to your question, but hopefully still helpful for the other
> parts of your project)
> 
> On Fri, 2024-05-31 at 00:54 +0100, zarniwhoop at ntlworld.com wrote:
> > I've intermittently been documenting various
> > TTF and OTF fonts for use on Linux: how they look, which codepoints
> > they contain, and from that which modern languages are covered -
> > showing at a minimum the alphabet [...]
> >
> > To list all the possible codepoints I care about for the various
> > languages I use XeLaTeX.  [...] Adding one codepoint at a
> > time from yet another font which had patchy coverage convinced me
> > that XeLaTeX, not LuaLaTeX, was the way to go (much quicker, and
> > did not fail if nothing found in the font).
> 
> I have two TeX Stack Exchange answers that might be helpful to you, with
> code at:
> 
>     (1) https://tex.stackexchange.com/a/707031
> 
>     (2) https://tex.stackexchange.com/a/715598
> 
> and sample output at:
> 
>     (1) https://tug.org/~mseven/files/all-characters.pdf
> 
>     (2) https://tug.org/~mseven/files/georgian-fonts.pdf
> 
> These don't help with the linebreaking at all (they both just typeset a
> grid of characters), but they might be helpful for the alphabet and
> codepoint coverage portions of the project.
> 
> Both answers use LuaTeX (or LuaMetaTeX), but they're pretty quick to run
> after the caches are built: (1) loads all 231 Noto fonts and outputs
> 83 020 distinct characters in under 5 seconds, and (2) loads literally
> every single font on the system and outputs ~8 000 characters in under
> 30 seconds. The first run is quite painful though -- up to 30 minutes
> for (2) -- but after that, the caches will be available for all future
> runs/documents.
> 
> Thanks,
> -- Max
Hi Max,

(sorry to Werner, I tend to omit people's names when replying)

Thanks, that is interesting but some years too late.  I started the
project in, I think, 2011 (maybe it was earlier).

My installed fonts on this one system will keep changing until I
finish the CJK updates I'm thinking about.  At the moment I have only
486 font files - I avoid installing variable fonts and weights I do
not need.

At first, I think, it was all done with just bash and LibreOffice (in
those days LibreOffice indicated that a glyph was missing, instead
of picking it from another font on the system).

Major additions in 2016, 2017 and it looks as if I had switched to
using xelatex by 2018.  Some small revisions in 2019, 2020, 2021.
Then I left it.  Last year I returned to it - I'd come across some
new fonts I wanted to look at, and discovered several other fonts
had newer versions and/or changed locations.

Added some new or updated fonts last year, then started working
through my example L/C/G Serif fonts to summarise their weights etc
in example lipsum PDFs and to add details of weights and small caps to
the files showing which languages I think a font supports.

Came back to it a few weeks ago to do the L/C/G Sans fonts.
Discovered more typos and more lack of capitals for proper nouns.
Did that, eventually started using polyglossia where it helped, and
have now started to revise the L/C/G Serif fonts to do the same.

With hindsight, there is a lot of unnecessary standard text in my
files, but it's too late to change them all now.

Meanwhile:

· Thinking ahead to CJK (I found some draft W3C guidelines but was
  hoping to mainly use polyglossia).

· Documenting the Sans small caps and producing the updated Sans
  lipsum file(s).

· Possibly revising the Greek dummy text (cut-down transliterations
  of lorem ipsum, but hacked around to be similar shortish lengths
  and with at least each lowercase letter of unaccented monotonic
  Greek, and some examples of tonos and dialytika, in each item).

· Thinking about getting a new hosted site and provider, with https.

· Wondering if I should rework my current presentation (the main
  table is very wide).

So I'm unlikely to look at those links at the moment.

ĸen
-- 
When one person suffers from a delusion, it is called insanity.
When many people suffer from a delusion it is called a Religion.
 -- Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance.

