[tex-live] extractbb can not read the pdf file generated by mutool

ABE Noriyuki abenori at math.sci.hokudai.ac.jp
Tue Jan 5 19:39:35 CET 2016


Dear all,

Maybe I don't understand the problems...

> First, the mfgets() function cannot be used to read a line anymore if you
> are willing to allow an arbitrary amount of white-space characters to
> appear; it returns null-terminated strings, while the PDF spec allows
> null characters to appear anywhere within a line, since they are
> simply a kind of white-space character in PDF.
Do you mean that, for example, "x ref" or "xre\0f" should also be
accepted? If not, I think mfgets works.
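
To make the distinction concrete: PDF treats NUL (0x00) as white space,
alongside tab, LF, FF, CR and space, so a NUL may surround a keyword but
cannot be dropped from inside one. The following is only a sketch under
that assumption, not code from dvipdfmx or from the patch;
`is_pdf_space` and `line_is_keyword` are hypothetical names, and the line
is assumed to be held as a buffer plus an explicit length rather than as
a C string.

```c
#include <stddef.h>
#include <string.h>

/* PDF white-space characters: NUL, TAB, LF, FF, CR, SPACE. */
static int is_pdf_space(int c)
{
    return c == 0x00 || c == 0x09 || c == 0x0A ||
           c == 0x0C || c == 0x0D || c == 0x20;
}

/* Hypothetical helper: return 1 if the length-delimited line consists of
 * the keyword `kw` surrounded only by PDF white space (NUL included). */
static int line_is_keyword(const char *buf, size_t len, const char *kw)
{
    size_t klen = strlen(kw);
    size_t i = 0;

    /* Skip leading white space without stopping at NUL bytes. */
    while (i < len && is_pdf_space((unsigned char)buf[i]))
        i++;
    /* The keyword itself must appear as contiguous bytes. */
    if (len - i < klen || memcmp(buf + i, kw, klen) != 0)
        return 0;
    i += klen;
    /* Only white space may follow. */
    while (i < len && is_pdf_space((unsigned char)buf[i]))
        i++;
    return i == len;
}
```

With this, the 9-byte line "\0 xref \0\n" matches "xref" even though
strlen() on it returns 0, while "x ref" and "xre\0f" do not match.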

> Next, garbage after valid data is not detected, since the code simply
> calls sscanf() and does not check the unread data.
Do you want extractbb to check that the data is valid?
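
One common way to notice unread data after a successful sscanf() is the
`%n` conversion, which stores the number of characters consumed so far.
The sketch below is an illustration only, not the dvipdfmx code:
`parse_subsection_header` is a hypothetical name, the line is assumed to
be an ordinary NUL-terminated string, and comments after the numbers are
not handled here.

```c
#include <stdio.h>

/* Hypothetical helper: parse "<first> <size>" from a NUL-terminated
 * line. Returns 1 only if both numbers were read and nothing except
 * white space follows them. */
static int parse_subsection_header(const char *line,
                                   unsigned long *first,
                                   unsigned long *size)
{
    int consumed = 0;

    /* The space before %n skips trailing white space; %n then records
     * how many characters sscanf() has consumed. */
    if (sscanf(line, "%lu %lu %n", first, size, &consumed) != 2)
        return 0;

    /* Anything left at this position is unread garbage. */
    return line[consumed] == '\0';
}
```

Under these assumptions "0 6" and "0 6 \r\n" are accepted, while
"0 6 garbage" and a line holding only "0" are rejected.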

Could you tell me what extractbb should do? Currently (if I'm not
mistaken), with my patch, extractbb allows:

* empty lines (lines containing only spaces and comments)
* comments or spaces after an xref entry and after "xref"
* spaces before "xref" or "trailer"
* spaces before and after <first> <size>, and comments after them

and it may (though not always) give a warning if the input does not
follow these rules.
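
The rules listed above can be sketched as a single trimming step applied
to each line before it is interpreted. This is not the patch itself:
`trim_line` is a hypothetical name, the line is assumed to be a buffer
with an explicit length, and `%` is assumed always to start a comment,
which is reasonable inside an xref table because no string objects
appear there.

```c
#include <stddef.h>

/* PDF white-space characters: NUL, TAB, LF, FF, CR, SPACE. */
static int is_pdf_space(int c)
{
    return c == 0x00 || c == 0x09 || c == 0x0A ||
           c == 0x0C || c == 0x0D || c == 0x20;
}

/* Hypothetical helper: locate the meaningful part of one line.
 * Stores the offset of the first non-space byte in *start and returns
 * the payload length, excluding a trailing '%' comment and trailing
 * white space. A return value of 0 means the line is "empty". */
static size_t trim_line(const char *buf, size_t len, size_t *start)
{
    size_t b = 0, e;

    /* Skip leading white space. */
    while (b < len && is_pdf_space((unsigned char)buf[b]))
        b++;
    /* A '%' starts a comment that runs to the end of the line. */
    for (e = b; e < len && buf[e] != '%'; e++)
        ;
    /* Drop white space between the payload and the comment or EOL. */
    while (e > b && is_pdf_space((unsigned char)buf[e - 1]))
        e--;

    *start = b;
    return e - b;
}
```

A comment-only line then yields length 0, "  xref  % table" yields the
4-byte payload "xref" at offset 2, and an entry such as
"0000000017 00000 n " yields its 18 significant bytes.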


2016/01/05(Tue) 00:06:19, Shunsaku Hirata <shunsaku.hirata74 at gmail.com>:
> Hi,
> 
> I've looked at the code and the proposed patch, and found two problems.
> 
> First, the mfgets() function cannot be used to read a line anymore if you
> are willing to allow an arbitrary amount of white-space characters to
> appear; it returns null-terminated strings, while the PDF spec allows
> null characters to appear anywhere within a line, since they are
> simply a kind of white-space character in PDF.
> 
> Next, garbage after valid data is not detected, since the code simply
> calls sscanf() and does not check the unread data.
> 
> The second problem is also in the original code. But if you think
> that an arbitrary amount of white-space characters (including empty
> lines) can be inserted between the pieces of xref data and the xref
> subsections, as opposed to the current implementation, you need to do
> more work to be consistent.
> 
> Thanks,
> 
> Shunsaku Hirata
> 
> 2016-01-04 11:21 GMT+09:00 Shunsaku Hirata <shunsaku.hirata74 at gmail.com>:
> > Hi,
> >
> > A more permissive approach would make the
> > implementation simple and clear.
> >
> > I will try some other implementation later.
> >
> > Shunsaku Hirata
> >
> > 2016-01-04 7:40 GMT+09:00 Akira Kakuto <kakuto at fuk.kindai.ac.jp>:
> >> Hi Noriyuki, Karl,
> >>
> >>> Akira, wdyt?  Would you like to install it if it seems ok to you too?
> >>
> >>
> >> S. Hirata, the author of dvipdfmx, says that he will provide a
> >> patch that slightly modifies Noriyuki's.
> >> I'm waiting for the patch.
> >>
> >> Thank you,
> >> Akira
> >>
> >
> >


-- 
Department of Mathematics, Faculty of Science, Hokkaido University
ABE Noriyuki (阿部紀行)

