[tex-live] dvips removing characters from eps-file

David M. Jones dmj at ams.org
Mon May 5 19:11:10 CEST 2008


> Date: Fri, 2 May 2008 19:07:33 +0200
> From: "Zdenek Wagner" <zdenek.wagner at gmail.com>
> Cc: tex-live at tug.org
> 
> 2008/5/2 Struebing, Axel, le-tex <axel.struebing at le-tex.de>:
> > Dear all,
> >
> >  I have got a problem with dvips producing invalid postscript with a special
> >  kind of eps figures.
> >  In a minimal example just including one figure the problem becomes clear.
> >
> >  Lines from the figure are altered after inclusion in the final postscript.
> >  These lines consist of binary data - IMHO - related to subsetted fonts.
> >
> >  dvips discards 0x04 (CTRL-D) bytes unconditionally, as verified in the
> >  source of dvips (output.c), and mangles bytes containing 0x13.
> >
> 0x13 and 0x10 represent a newline, all PS rips must work with any line
> end, i.e. 0x13 is line end, not data. Similarly, 0x04 is job end mark
> for PS rip, i.e. it must not appear within a PS file.

Don't confuse the PostScript interpreter with the communication
protocol used to transmit data to a printer.  Section 3.8 of the 3rd
edition of the PostScript Language Reference describes the interaction
between the interpreter and the communications protocol.  See
especially the sections "End-of-line Conventions" and "Communication
Channel Behavior" on pages 74-76.

In brief,

*) The PostScript interpreter is required to accept CR, LF and CRLF as
   end-of-line markers, but only in contexts where it's supposed to
   care about ends of lines.  In contexts where it's reading binary
   data, it does not do any end-of-line conversion and CR and LF are
   treated like any other byte.

*) 0x04 (EOT) is an end-of-file marker for some (but not all!)
   communication protocols that can be used to send data to a printer,
   but the interpreter itself doesn't treat it any differently from
   any other binary character.
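
To make the two points above concrete, here is a minimal Python sketch
of what goes wrong when a text-oriented filter is applied to binary
data.  It is not the dvips source: naive_text_filter is a hypothetical
name, and the filter only imitates the behaviour described in this
thread (drop every EOT byte, rewrite CR and CRLF as LF, where CR is
0x0D and LF is 0x0A).

```python
def naive_text_filter(data: bytes) -> bytes:
    """Hypothetical filter that treats its whole input as text.

    Drops every EOT (0x04) byte and rewrites CR and CRLF as LF.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b == 0x04:                      # EOT: dropped unconditionally
            i += 1
            continue
        if b == 0x0D:                      # CR or CRLF becomes a single LF
            out.append(0x0A)
            if i + 1 < len(data) and data[i + 1] == 0x0A:
                i += 1                     # swallow the LF of a CRLF pair
            i += 1
            continue
        out.append(b)
        i += 1
    return bytes(out)


# Six bytes of made-up "binary" data that happen to contain EOT, CR and LF.
binary_payload = bytes([0x80, 0x04, 0x0D, 0x0A, 0x0D, 0x7F])
filtered = naive_text_filter(binary_payload)
print(len(binary_payload), len(filtered))
```

The payload comes out shorter than it went in, so any PostScript code
that reads a fixed number of bytes from it will run past the end of
the data.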

> Using binary
> data in EPS is a pain, you must be sure that they are correctly marked
> by structure comments, dvips may then treat them correctly.

The problem is that dvips only recognizes one of the DSC comments that
signal the presence of binary data, namely, "%%BeginBinary".  It
doesn't recognize "%%BeginData:" or "%%BeginFont".  So, if you try to
include an EPS file that has an embedded PFB file, dvips will happily
delete all EOTs from the font definition as well as normalizing any
"line end" it finds there.  This is unlikely to be helpful.
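
As a rough illustration of what recognizing the other comments could
look like, here is a Python sketch of a copier that filters only the
text parts of an EPS file.  It is not the local dvips patch mentioned
below, and copy_eps and drop_eot are hypothetical names.  It assumes
LF-terminated comment lines, assumes the count on "%%BeginBinary:" and
"%%BeginData:" covers everything between the comment and its matching
End comment, and handles "%%BeginFont" (which carries no count) by
copying lines verbatim until "%%EndFont", which is only a heuristic.

```python
import io


def drop_eot(chunk: bytes) -> bytes:
    """Hypothetical text filter: remove EOT (0x04) bytes."""
    return chunk.replace(b"\x04", b"")


def copy_eps(src, dst, filter_text):
    """Copy src to dst, applying filter_text only outside binary spans."""
    while True:
        line = src.readline()
        if not line:
            break
        dst.write(filter_text(line))
        if line.startswith((b"%%BeginBinary:", b"%%BeginData:")):
            fields = line.split(b":", 1)[1].split()
            if not fields or not fields[0].isdigit():
                continue                       # malformed count: treat as text
            count = int(fields[0])
            if len(fields) == 3 and fields[2] == b"Lines":
                for _ in range(count):         # count given in lines
                    dst.write(src.readline())
            else:
                dst.write(src.read(count))     # count given in bytes
        elif line.startswith(b"%%BeginFont"):
            while True:                        # no count: copy to %%EndFont
                raw = src.readline()
                dst.write(raw)
                if not raw or raw.startswith(b"%%EndFont"):
                    break


eps = b"a\x04b\n%%BeginData: 3 Binary Bytes\n\x04\r\x04\n%%EndData\n"
out = io.BytesIO()
copy_eps(io.BytesIO(eps), out, drop_eot)
print(out.getvalue())
```

In this example the stray EOT on the first line is removed, while the
three bytes announced by "%%BeginData:" pass through untouched.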

Incidentally, when reading PFB files directly, dvips knows how to
re-encode them as PFA files, thus avoiding binary characters.  It just
can't recognize a PFB file inside an EPS file.
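
For readers unfamiliar with the two formats, the following Python
sketch shows the general idea of such a re-encoding.  It is not the
dvips implementation, and pfb_to_pfa is a hypothetical name.  It relies
on the usual PFB layout: each segment starts with the byte 0x80, a type
byte (1 for ASCII, 2 for binary, 3 for end of file) and, for types 1
and 2, a four-byte little-endian length.  Binary segments are written
out as hexadecimal text, so the result contains no EOT, CR or other
troublesome bytes from the encrypted portion.

```python
import struct


def pfb_to_pfa(pfb: bytes) -> bytes:
    """Re-encode PFB segments as plain text (hex for binary segments)."""
    out = bytearray()
    pos = 0
    while pos < len(pfb):
        if pfb[pos] != 0x80:
            raise ValueError("bad PFB segment marker")
        kind = pfb[pos + 1]
        if kind == 3:                          # end-of-file segment
            break
        (length,) = struct.unpack_from("<I", pfb, pos + 2)
        body = pfb[pos + 6 : pos + 6 + length]
        pos += 6 + length
        if kind == 1:                          # ASCII segment: copy as-is
            out += body
        elif kind == 2:                        # binary segment: hex, 32 bytes per line
            for i in range(0, len(body), 32):
                out += body[i : i + 32].hex().encode("ascii") + b"\n"
        else:
            raise ValueError("unknown PFB segment type")
    return bytes(out)


# A tiny made-up PFB: one ASCII segment, one binary segment, then EOF.
sample = (
    b"\x80\x01" + struct.pack("<I", 3) + b"ab\n"
    + b"\x80\x02" + struct.pack("<I", 2) + b"\x04\x0d"
    + b"\x80\x03"
)
print(pfb_to_pfa(sample))
```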

We ran into this problem last year and I patched dvips locally to
handle "%%BeginData" and "%%BeginFont" correctly (well, more
correctly, at least).  I never submitted the patches, largely because
I wasn't sure how robust they are.  I'll submit them to the dvips
maintainers now.

> However, PS
> printers do not read the structure comments and printing may fail.

PostScript interpreters don't need the structure comments because they
already know from the code being executed when to expect binary data.

David.

