[texhax] Blank first page problem (how to remove?)

Johnny yggdrasil at gmx.co.uk
Tue Jun 7 19:47:04 CEST 2011


Pierre MacKay <pierre.mackay at comcast.net> writes:

> On 06/05/2011 01:30 PM, Reinhard Kotucha wrote:
>
>     On 2011-06-05 at 14:54:34 -0400, Thomas Schneider wrote:
>     
>      > > > There are three bogus bytes at the very beginning of the file: 
>      > ...
>      > 
>      > > So it seems Notepad in Windows has done some formatting of the file
>      > > which I didn't notice.
>      > 
>      > That's yet another reason to add to the pile to avoid Windows.  Under
>      > Unix with vim you would have seen those characters.
>     
>     Are you sure?  It's only *one* character and I doubt that there is a
>     font which has a glyph for it.
>     
>     Regards,
>       Reinhard
>
> I quote from page 105 of The Unicode 5.0 standard:
>
>     Because the UTF-8 encoding scheme already deals in ordered byte
> sequences, the UTF-8 encoding scheme is trivial.  The byte ordering is
> completely defined by the UTF-8 code unit itself.
>
>     While there is obviously no need for a byte order (= big-endian vs
> little-endian) signature when using UTF-8, there are occasions when
> processes convert UTF-16 or UTF-32 data containing a byte order mark into
> UTF-8.  When represented in UTF-8, the byte order mark turns into the byte
> sequence <EF BB BF>.  Its usage at the beginning of a UTF-8 data stream is
> neither required nor recommended by the Unicode standard.
>
> Notepad has, in typical Microsoft behavior, "made it better for you" by
> including a totally unnecessary byte sequence that properly designed software
> would have left out.
>
> No, there is certainly NOT a font character corresponding to this byte
> sequence.
>
> The specifications for UTF-8 are absolutely brilliant, and are followed by all
> Unix/Linux applications that I have encountered.  Perhaps someday Microsoft
> will enter the 21st century too.
>
> Pierre MacKay
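
For anyone who hits the same blank-first-page symptom, the three bytes
<EF BB BF> quoted above can be checked for and stripped before the file
reaches TeX.  What follows is only a minimal sketch in Python, not anything
posted in this thread: the function name strip_utf8_bom and the sample
bytes are made up for illustration, while codecs.BOM_UTF8 and the
"utf-8-sig" codec are part of the Python standard library.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present.

    Hypothetical helper for illustration; the name is not from any library.
    """
    if data.startswith(codecs.BOM_UTF8):  # codecs.BOM_UTF8 == b"\xef\xbb\xbf"
        return data[len(codecs.BOM_UTF8):]
    return data


# Made-up sample: a tiny TeX file saved with a leading BOM.
sample = b"\xef\xbb\xbf\\documentclass{article}\n"

# Byte-level removal of the mark.
cleaned = strip_utf8_bom(sample)

# Alternative: the "utf-8-sig" codec drops a leading BOM while decoding.
decoded = sample.decode("utf-8-sig")
```

The byte-level version leaves the rest of the file untouched, which seems
the safer choice when the only goal is to remove the mark and write the
file back out unchanged.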

I feel compelled to extend my thanks to all the more knowledgeable users
for expanding on the subject and sharing their knowledge of these workings!
I for one have learned a whole lot about the encoding standards, and, in a
less humble way, to interpret log files and post more appropriate
inquiries to the list.

My sincere thanks!
-- 
Johnny

