[tex-live] kpathsea performance
Karl Berry
karl at freefriends.org
Sun Aug 26 23:50:09 CEST 2018
we do use .FIG and .AUX, I'm not
sure if those are our custom includes or not
Those should be fine (.aux/.AUX is already a TEXINPUTS extension). It's
only using some predefined extension for another file type that could
cause trouble. Seems that is not at issue.
What might be interesting (and it makes sense), we have 3 directories that
we pass to tex to search through i.e. texinputs=dir1;dir2;dir3. In our case
Ah. I think that explains it. In 2017, kpse just looked for dir1/FOO.FIG
(and failed), dir2/FOO.FIG (and failed), and then succeeded with
dir3/FOO.FIG. Now, it fails for dir1/FOO.FIG and then readdir()s through
dir1 looking for "foo.fig" (strcasecmp-wise).
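To make the difference concrete, here is a rough sketch in Python of the two lookup orders described above. It is only an illustration, not the kpathsea code (which is C); the function names are made up for this example, and it assumes a case-sensitive filesystem.

```python
import os

def find_exact_only(dirs, name):
    """2017-style lookup: try the exact name in each directory in turn."""
    for d in dirs:
        candidate = os.path.join(d, name)
        if os.path.isfile(candidate):
            return candidate
    return None

def find_with_casefold_fallback(dirs, name):
    """2018-style lookup as described above: if the exact name is not in a
    directory, read that whole directory looking for a case-insensitive
    match before moving on to the next directory."""
    wanted = name.lower()
    for d in dirs:
        candidate = os.path.join(d, name)
        if os.path.isfile(candidate):
            return candidate
        # This full directory read is the expensive step when the
        # directory holds many thousands of files.
        for entry in os.listdir(d):
            if entry.lower() == wanted:
                return os.path.join(d, entry)
    return None
```

With FOO.FIG present only in dir3, the first function does three cheap existence checks, while the second also reads all of dir1 and dir2 before it gets there.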
I considered keeping the old behavior, but a couple things militated
against it:
1) the current behavior is how it's always worked on Windows (because
Windows operates case-insensitively);
2) it seems at least as sensible to prefer an imperfect match in an
earlier directory as the converse.
(https://www.tug.org/texinfohtml/kpathsea.html#Casefolding-examples)
I admit there was also:
3) it was easier to fit the new feature into the existing code that way.
have many thousands of files in these directories,
It will take time to read such huge directories, yes. So although the
outcome here is unfortunate, I'm not sure there is anything to do to
improve it :(.
I'm not sure if it's feasible, but it occurs to me that you might be
able to speed things up (to a constant) by creating/maintaining an ls-R
file for the "tree" containing these huge directories.
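The idea behind a filename database can be sketched like this: walk the tree once, remember every name, and answer later lookups from memory instead of re-reading the directories. This is only a Python illustration of the principle, with made-up function names; it is not the ls-R file format or kpathsea's own lookup code. In TeX Live the real database is normally generated with mktexlsr.

```python
import os

def build_name_index(root):
    """Walk the tree once and map each lowercased file name to its paths."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            index.setdefault(fname.lower(), []).append(
                os.path.join(dirpath, fname))
    return index

def lookup(index, name):
    """Dictionary lookup; prefer an exact-case match over a casefolded one."""
    hits = index.get(name.lower(), [])
    for path in hits:
        if os.path.basename(path) == name:
            return path
    return hits[0] if hits else None
```

The cost of reading the huge directories is then paid once when the index is built, and the index has to be rebuilt whenever files are added or removed.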
It's also true that when I wrote the new bit of code, I assumed that
disk caching would pretty much take care of repeated searches (as it did
on all the systems I could check). With such huge directories, it's
certainly possible that the disk caching gets overloaded. The filesystem
type, available ram, etc., etc., are all going to be factors.
Best,
Karl