[XeTeX] Roman Numerals as stylistic alternatives
Peter Baker
psb6m at virginia.edu
Mon Jun 20 00:03:26 CEST 2011
It seems to me (having worked with OpenType fonts for some years) that
while it might be possible to make an Arabic-to-Roman converter at the
font level, that's going to be one of the least efficient ways to
handle it. With OpenType (OT) you can make a set of rules that says
  Here's a 1 followed by two digits; substitute a C;
  Here's a 2 followed by one digit; substitute XX;
  Here's a 3 followed by something other than a digit; substitute III.
But OT can't understand numbers the way a programming language can.
If you want to be able to write XC for 90, the task gets somewhat more
complex, because OT definitely can't say
  For a number in the range 90-99, do the following . . .
Surely a programmatic solution would be better, and (La)TeX has an
understanding of roman numerals built in. With a little Googling I was
able to come up with this file, which works:
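%&program=xelatex
%&encoding=UTF-8 Unicode
\documentclass[11pt,letterpaper,twoside,openany]{book}
\usepackage{fontspec}
\makeatletter
% \rmnum: lowercase roman numeral via the TeX primitive \romannumeral
\newcommand{\rmnum}[1]{\romannumeral #1}
% \Rmnum: uppercase form; \@slowromancap (a LaTeX kernel macro) capitalizes
% the letters one at a time until it reaches the terminating @
\newcommand{\Rmnum}[1]{\expandafter\@slowromancap\romannumeral #1@}
\makeatother
\begin{document}
There are \rmnum{123}\ fish in the sea.
And there are \Rmnum{5123}\ leaves on the tree.
\end{document}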
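The built-in conversion already copes with subtractive forms such as XC, which is the case that is awkward to express as font rules. A minimal check, using only the \romannumeral primitive and the LaTeX kernel macro \@Roman (the file layout here is just an illustration, not part of the file above):

```latex
\documentclass{article}
\begin{document}
% \romannumeral is a TeX primitive; \relax ends the number it reads.
\romannumeral 90\relax, \romannumeral 99\relax, \romannumeral 1994\relax.
% typesets: xc, xcix, mcmxciv.

\makeatletter
% \@Roman is the kernel macro behind \Roman; it gives the uppercase form.
\@Roman{90}, \@Roman{1994}.
% typesets: XC, MCMXCIV.
\makeatother
\end{document}
```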
But I don't know how to make a file that will use /ActualText. Maybe
someone here can explain that.
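
One possible route to /ActualText from XeLaTeX is the accsupp package, which wraps material in a marked-content span. What follows is only a sketch under stated assumptions: that accsupp picks a working dvipdfm(x) driver when run under XeLaTeX, and that the PDF viewer honours /ActualText when copying. The macro name \TaggedRmnum is hypothetical and not part of any package. The idea is to typeset the Roman form while storing the Arabic digits as the text to be copied:

```latex
\documentclass{article}
\usepackage{fontspec}
\usepackage{accsupp}% assumed to detect the XeTeX driver by itself
\makeatletter
% \TaggedRmnum (hypothetical name): prints #1 as an uppercase roman
% numeral and records the Arabic digits #1 as /ActualText for the span.
\newcommand{\TaggedRmnum}[1]{%
  \BeginAccSupp{method=plain,ActualText={#1}}%
  \@Roman{#1}%
  \EndAccSupp{}%
}
\makeatother
\begin{document}
There are \TaggedRmnum{115} leaves on the tree.
\end{document}
```

If this behaves as intended, selecting any part of "CXV" in a conformant viewer and copying it should yield "115"; whether a given viewer does so would need to be checked.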
Peter Baker
On 6/19/11 4:43 PM, Ross Moore wrote:
> Hello Enrico,
>
> On 20/06/2011, at 5:42 AM, enrico.gregorio at univr.it wrote:
>
>> What the OP wants is that "CXV" is stored as a unique glyph representing 115.
>> Maybe this can be done by reserving, say, five thousand slots in Unicode to
>> contain the numbers from 1 to 5000 in Roman form that are built from the basic
>> digits, embedding in the font (or in the typesetting engine) the algorithm for building
>> them from the Western/Arabic representation.
> No.
> In the PDF ISO standard, you have the option of using /ActualText tagging.
> The PDF would contain a portion of the page contents stream, such as:
>
> /Span<</ActualText(115)>>BDC .... (graphics to position and produce
> the letters 'C' 'X' and 'V' ) ... EMC
>
> Now *any* attempt to select any portion of the visible string "CXV"
> is supposed to result in the whole string being included when copying.
>
> The problem is that not all PDF browsers are fully conformant, so this
> behaviour may not be what you actually get with a particular piece of
> software. (BTW, Apple is one of the biggest offenders.)
>
>> This might be done in two passes:
>> represent the number using the codes for Roman numerals and start a ligaturing
>> process.
> Trying to do it character by character at the font level doesn't seem
> overly practical to me. The concept is the number "123" but represented
> in a non-standard way. The use of /ActualText tagging seems to be much
> more helpful to readers, and also to other software that tries to
> extract the meaning being represented with a PDF, for whatever purpose.
>
> Note that ISO PDF also has an alternative method of tagging.
> E.g.
> /Span<</Alt(123)>>BDC .... EMC
> Screen-reading software is meant to use the /Alt tagging.
>
> And both /Alt and /ActualText allow multiple values having been preceded
> by a /Lang tag, so that the actual vocalization generated by the
> screen-reader can be adjusted for different languages --- the document
> author normally would provide this, but a sophisticated PDF browser
> plug-in might be programmed to produce a translation on-the-fly.
>
>> Actually, Roman numerals are mostly used when the numerical information is
>> almost irrelevant as such. Nobody uses the "XIV" in "Louis XIV" to perform
>> calculations. That's just a different way of writing "quatorze".
> Right. So /ActualText tagging can support this distinction in meaning.
> It is *not* intended to support calculations --- that is the domain
> of "Content Tagging" using MathML.
>
>> I see it just as the ability to copy "quatorze" from a text and paste it into a
>> worksheet cell accepting numbers to get 14. In the case of Roman numerals
>> it may be simpler, of course. But is it useful?
> Most certainly it is useful.
> It is part of the way of the future for smart PDF documents.
>
>
>> Ciao
>> Enrico
>
> ------------------------------------------------------------------------
> Ross Moore ross.moore at mq.edu.au
> Mathematics Department office: E7A-419
> Macquarie University tel: +61 (0)2 9850 8955
> Sydney, Australia 2109 fax: +61 (0)2 9850 8114
> ------------------------------------------------------------------------