[tex-live] texdoc in luatex

Norbert Preining preining at logic.at
Mon Jul 2 19:21:56 CEST 2007


Hi all,

Phew, many things are floating around. In fact it would be nice to
discuss all this in person, which would make it much easier!

On Mon, 02 Jul 2007, Florent Rougon wrote:
> for each doc file and the corresponding package. I'd very much prefer be
> provided with a simple file that tells me, for each CTAN package, the
> path of each doc file as installed in TL. And I think Norbert can
> provide me with that.

Well, this should work already now more or less, at least for those
packages where we know the TeX Live name <-> TeX Catalogue name
translation. Let's put it this way, the Catalogue has a field
	<texlive location='foobar'>
which gives the respective TLPOBJ (ex tpm) to which it corresponds.
Unfortunately we don't have a back mapping built into TeX Live, or at
least only partly, in the ctan2tl script. All this could be simplified
in some way (but who is going to write all the code?).

So in principle we can do:
	CTAN package -> get texlive location ->
	-> get TLPOBJ from texlive.tlpdb -> get docfiles from this
I can write you a Perl script in 3 minutes that does this. (It spits
out a list of files, nothing else.)
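
A minimal sketch of the chain above, in Python rather than the Perl
mentioned here. The sample Catalogue entry, the simplified tlpdb
paragraph layout ("name" line, category lines, indented file lines) and
all function names are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; real Catalogue entries and tlpdb records
# carry many more fields than shown here.
CATALOGUE_ENTRY = """<entry id='foo'>
  <texlive location='foobar'/>
</entry>"""

TLPDB_TEXT = """name foobar
docfiles
 texmf-dist/doc/latex/foobar/foobar.pdf
 texmf-dist/doc/latex/foobar/README
runfiles
 texmf-dist/tex/latex/foobar/foobar.sty

name other
runfiles
 texmf-dist/tex/latex/other/other.sty
"""


def texlive_location(entry_xml):
    """Return the location attribute of the <texlive> element, or None."""
    node = ET.fromstring(entry_xml).find("texlive")
    return None if node is None else node.get("location")


def docfiles_for(tlpdb_text, package):
    """Return the docfiles of the paragraph whose name is `package`."""
    for paragraph in tlpdb_text.strip().split("\n\n"):
        lines = paragraph.splitlines()
        if not lines or lines[0] != "name " + package:
            continue
        files, in_doc = [], False
        for line in lines[1:]:
            if line.startswith(" "):
                # Indented lines belong to the most recent category line.
                if in_doc:
                    files.append(line.strip())
            else:
                in_doc = line.split()[:1] == ["docfiles"]
        return files
    return []


def docfiles_for_ctan_package(entry_xml, tlpdb_text):
    """CTAN package -> texlive location -> TLPOBJ -> docfiles."""
    location = texlive_location(entry_xml)
    return [] if location is None else docfiles_for(tlpdb_text, location)


if __name__ == "__main__":
    for path in docfiles_for_ctan_package(CATALOGUE_ENTRY, TLPDB_TEXT):
        print(path)
```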

> Even better would be if this file could carry with it the metadata about
> each doc file, so that I could easily take advantage of the 'language'

Well, I guess I have to write a Perl module for accessing the TeX
Catalogue data. That is much needed anyway, since our TLP source files
now contain *absolutely* the minimum, no descriptions etc.; all this
should be taken from the Catalogue at TeX Live Database update time.

Maybe tomorrow ;-) With this access module it should be easy to extract
information from the catalogue.

Florent, can you send me what you want? I.e., the format of the output,
something like
	$ get-florent-stuff "ctan package name"
	...
	format of return stuff to be specified
	...
and I can write a Perl script doing this (after I have written the Perl
module for the Catalogue. Ohhh, how I HOPED to get rid of XML when I
changed the infrastructure of TeX Live to use plain text files; they
are sooo much simpler to handle).

> I have the impression that this ML is the right place for TeX Live and
> that some people from the CTAN team are already participating, but maybe
> that's not enough.

I was most scared that we would discuss something without the input of
the CTAN people. Doing anything without CTAN is completely useless,
because they are so involved in the whole procedure ...

> We could invite people on c.t.t. do join the discussion, but that could
> become messy, dunno. Another possibility is to devise a proposal here
> and then post to c.t.t. for comments, objections, etc.

No, you are right, it will become a mess. Those interested are already
here. Maybe the Catalogue and the CTAN people can invite some more.

> > This is already done for many packages, see ctan2tl script, and below
> > ;-)
> 
> Grrrmpf. OK. I can propose a solution anyway.

Shouldn't you be happy ;-)

>     <tag>
>       field::mathematics
>     </tag>
> 
>     <tag>
>       macropackage::latex
>     </tag>
> 
>     <documentation details='Foo User Guide' language='en'
>                    href='ctan:/macros/latex/contrib/foo/doc/index.html'/>
>   
>     <documentation details='Foo Frequently Asked Questions' language='en'
>                    href='ctan:/macros/latex/contrib/foo/doc/foo-faq.txt'/>
> 
>   </entry>
> 
> From this metadata, your ctan2tl script would generate for me a file
> containing something like that:

Actually I would prefer to keep ctan2tl free of this; it is already
TOO complicated to actually be read.

All the information *is* present in the texlive.tlpdb, the TeX Live
Database. From this and the Catalogue data we can generate whatever you
want, see above.

> > inclusion. We are working on this to update it for new packages, but
> > of course it still does not work for all.
> 
> A lot of work, for sure...

And in need of a *good* programmer like you to help a bit ;-))))

On Mon, 02 Jul 2007, Florent Rougon wrote:
> The downside when it is in the catalogue is that it isn't accessible to
> packages not yet in CTAN: packages that you would install in /usr/local
> for instance. OTOH, if there is a known place in the TDS to put the
> metadata of each package, then such third-party packages can be
> trivially found by the tool I'm thinking about if installed properly.
> 
> There would be for instance
> 
>   /usr/share/texmf/metadata/g/geometry.xml
>   /usr/share/texmf/metadata/h/hyperref.xml
>   etc.

Well, we discussed this for quite some time when we rewrote the TeX Live
infrastructure. We had the TPM files, and of course those could be
enriched with other information. For various reasons we wanted to
separate real content from generated content, and NOT to have 2000+
separate files (the installers had problems because they needed to read
all those files). So the new infrastructure has package source files
providing the absolute minimum of information, and from these one can
generate package object files containing: metadata like installation
instructions, descriptions (currently missing, should be taken from the
Catalogue), the list of files in 4 categories (run, bin, src, doc) and
the respective sizes. All these textual representations are concatenated
into texlive.tlpdb.

For you it is easy: the Packages file is *exactly* the same, every
package consists of one "paragraph" etc. I guess you know what I mean
(in fact it was modelled after that).
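
To make the "one paragraph per package" idea concrete, here is a rough
Python sketch that turns package objects into paragraphs and
concatenates them into one database text. The class name, the field
layout and the "size=" notation are assumptions for illustration, not a
specification of the actual texlive.tlpdb format.

```python
from dataclasses import dataclass, field

# The four file categories named above.
CATEGORIES = ("run", "bin", "src", "doc")


@dataclass
class PackageObject:
    """Hypothetical stand-in for a package object (TLPOBJ)."""
    name: str
    # Maps a category to a list of (path, size) pairs.
    files: dict = field(default_factory=dict)

    def to_paragraph(self):
        """Render this package as one blank-line-free text paragraph."""
        lines = ["name " + self.name]
        for category in CATEGORIES:
            entries = self.files.get(category, [])
            if not entries:
                continue
            total = sum(size for _, size in entries)
            lines.append(f"{category}files size={total}")
            lines.extend(" " + path for path, _ in entries)
        return "\n".join(lines)


def build_database(packages):
    """Concatenate all paragraphs, separated by one empty line."""
    return "\n\n".join(p.to_paragraph() for p in packages) + "\n"


if __name__ == "__main__":
    demo = [
        PackageObject("foobar", {
            "run": [("texmf-dist/tex/latex/foobar/foobar.sty", 3)],
            "doc": [("texmf-dist/doc/latex/foobar/foobar.pdf", 10)],
        }),
        PackageObject("other", {
            "run": [("texmf-dist/tex/latex/other/other.sty", 2)],
        }),
    ]
    print(build_database(demo), end="")
```

An installer then has to read a single file instead of one file per
package.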

> and each of each of these files would contain relative paths from the
> base of the TEXMF tree (here, /usr/share/texmf/) for each documentation
> file belonging to the package.

This is already there, but *including* the "texmf name", i.e.
texmf/texmf-dist/texmf-doc.

> containing paths relative to /usr/local/share/texmf, and that would
> allow mypackage to automatically register itself with the
> cataloguing/documentation system I'm proposing here.

Optimal would be that these files can be *generated* from what is there,
so the Catalogue and the installed files.

You might again ask why: package authors contact CTAN and update the
Catalogue, so these are the central points. ANYTHING which is not
generated from this information will eventually be ignored, outdated,
wrong, etc.

So any *additional* structure/files/etc we create should either contain
*really* new stuff or should be generated.

Example: Before we had in the tpm files:
- list of files      generated from the svn repository with some scripts
- descriptions       now and then updated from the catalogue or by hand
- licenses           now and then updated from the catalogue
- patterns for files        manually maintained
- installation instruction: manually maintained
- ...
Now all the stuff that should be updated was always out of date, wrong,
conflicting etc. So it is better to:
- change the *source* of the information (catalogue mostly)
- generate the information if necessary.
This is done at the TLP Source -> TLP Object transition. A typical
TLPSRC file now contains nothing but
	name foobar
(even the name could be generated from the file name, but that was too
crazy), which means: include all files under directories named "foobar"
in the texmf-dist tree. Of course you can override the default behaviour
and specify which files you actually want (of the roughly 1500 source
files, 173 contain patterns, and most of them are bin-* files which
NEED the patterns anyway, so the automatic generation works quite well).
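
The default rule can be sketched in a few lines of Python. This is only
an illustration of "all files under directories named foobar"; the
function name and the sample paths are made up, and the real TLPSRC to
TLPOBJ generation may well differ in its details.

```python
from pathlib import PurePosixPath


def default_files(name, tree_files):
    """Select the files that sit below any directory called `name`."""
    selected = []
    for path in tree_files:
        # parts[:-1] are the directory components; the last part is
        # the file name itself and must not count as a directory.
        if name in PurePosixPath(path).parts[:-1]:
            selected.append(path)
    return selected


if __name__ == "__main__":
    # Hypothetical listing of a texmf-dist tree.
    tree = [
        "texmf-dist/tex/latex/foobar/foobar.sty",
        "texmf-dist/doc/latex/foobar/sub/manual.pdf",
        "texmf-dist/tex/latex/other/foobar.sty",
        "texmf-dist/tex/latex/foobarx/a.sty",
    ]
    for path in default_files("foobar", tree):
        print(path)
```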

If we now create another place which can become out of date, this is
counterproductive. Therefore I proposed to somehow include the
information in the Catalogue.

===========================

Since I hate long discussions and like to come up with solutions: what
do we actually have to do to get these things working? Let's call it a
work plan:

This is preliminary, please extend it with your local knowledge, i.e.,
if you have suggestions for CTAN, add them, or for the Catalogue, or
whatever.


0. Preliminaries
1. CTAN side
2. Catalogue side
3. TeX Live side
4. Various

ad 0. Preliminaries
-------------------

Create a tag system, or vocabulary, or whatever. Any specification
should probably be placed into the Subversion repository of the
Catalogue. Or CTAN.

ad 1. CTAN side
---------------

Decide how we want to get the information from the package authors:
either via an XML file in the upload, or via the experimental web upload
feature. Or (best) both.

Write support for the tagging system.

ad 2. Catalogue side
--------------------

After we have this and the tag system specified, extend the Catalogue
DTD to include the missing bits (if any) for the metadata.

Write some override mechanism?

ad 3. TeX Live side
-------------------

Merge the Catalogue information with the information about installed
docfiles for a package.

Generate some doc index file in whatever format.

Write some documentation browsing program.
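
The first two items could look roughly like the Python sketch below.
It pairs the <documentation> elements of a Catalogue entry with the
installed docfiles and writes a tab-separated index. Matching by file
basename is purely an assumption of this sketch, as are the sample
data, the function names and the output format; none of this has been
agreed on.

```python
import posixpath
import xml.etree.ElementTree as ET

# Hypothetical sample data modelled on the quoted Catalogue snippet.
CATALOGUE_ENTRY = """<entry id='foo'>
  <documentation details='Foo User Guide' language='en'
                 href='ctan:/macros/latex/contrib/foo/doc/index.html'/>
  <documentation details='Foo Frequently Asked Questions' language='en'
                 href='ctan:/macros/latex/contrib/foo/doc/foo-faq.txt'/>
</entry>"""

INSTALLED_DOCFILES = [
    "texmf-dist/doc/latex/foo/index.html",
    "texmf-dist/doc/latex/foo/foo-faq.txt",
    "texmf-dist/doc/latex/foo/README",
]


def merge_doc_metadata(entry_xml, docfiles):
    """Return (path, language, details) for every installed docfile."""
    by_basename = {}
    for node in ET.fromstring(entry_xml).iter("documentation"):
        # Assumption: a CTAN href and an installed file describe the
        # same document when their basenames are equal.
        key = posixpath.basename(node.get("href", ""))
        by_basename[key] = (node.get("language", ""),
                            node.get("details", ""))
    merged = []
    for path in docfiles:
        language, details = by_basename.get(posixpath.basename(path),
                                            ("", ""))
        merged.append((path, language, details))
    return merged


def format_index(rows):
    """Render the merged rows as one tab-separated line per docfile."""
    return "\n".join("\t".join(row) for row in rows)


if __name__ == "__main__":
    print(format_index(merge_doc_metadata(CATALOGUE_ENTRY,
                                          INSTALLED_DOCFILES)))
```

Docfiles without Catalogue metadata stay in the index with empty
language and details fields, so a browsing program can still list them.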

ad 4. Various
-------------

Tag the most important packages by hand to get the system started.

Convince people to use the system ;-)



Best wishes

Norbert

-------------------------------------------------------------------------------
Dr. Norbert Preining <preining at logic.at>        Vienna University of Technology
Debian Developer <preining at debian.org>                         Debian TeX Group
gpg DSA: 0x09C5B094      fp: 14DF 2E6C 0307 BE6D AD76  A9C0 D2BF 4AA3 09C5 B094
-------------------------------------------------------------------------------
HALIFAX (n.)
The green synthetic astroturf on which greengrocers display their
vegetables.
			--- Douglas Adams, The Meaning of Liff