[completed 2009-10-21]
Joseph is a relatively new arrival to the world of TeX development, particularly working in the area of improving LaTeX support for chemistry. He is also a member of the LaTeX3 team.
Dave Walden, interviewer: Please tell me about yourself independent of TeX.
Joseph Wright, interviewee: Work-wise, I'm a research chemist in a university: I work in the lab day-to-day for my boss, Professor Chris Pickett (www.uea.ac.uk/che/people/faculty/pickettc). I've been at my current university (UEA in Norwich) for about four and a half years; before that, I did two years in Southampton, and before that my PhD in Cambridge. So I've moved around the UK a bit, but not further afield (yet, at least).
Outside of the lab, I play the clarinet and recorder a bit (never enough practice time). I'm also currently starting a basic Italian course with the Open University. So I keep busy!
DW: Please tell me more about what it means for you to be a research chemist and the extent to which there is computer involvement these days.
JW: Most of my work in the lab is very much “wet” chemistry: white coat, flasks of coloured solutions, the odd fire, etc. I also work with some physical methods, where the data recording is all done with a computer nowadays. That's a relief, as I've used chart-recorders and they are not much fun.
Also, there is a lot of computer involvement in chemistry, and we do some basic data analysis on the PC (spreadsheets and so on). Some of my colleagues do simulations, but that is a full-time job on its own, so I just hear about the results!
DW: How and when were you first introduced to TeX et al.?
JW: I'd heard about LaTeX while I was writing up my PhD thesis, but despite some idle talk in the lab never really looked into it properly. When I moved to Southampton, there was a guy in the next lab using LaTeX to write up with. So I decided to really “take an interest” and get set up. The first really “big” thing I did with LaTeX was a final report on my research in Southampton for my boss there. There was a bit of a learning curve, and I soon found that there were “gaps” in the support for chemists.
DW: Please tell me a bit about what you saw as the “learning curve” for using LaTeX for a significant report writing project.
JW: Well, to start with I had to get the idea that a TeX system and a TeX editor were different things! Once I got going, I soon realised that the basic classes need lots of add-ons to make things work smoothly. In need of some guidance, I bought the Guide to LaTeX and The LaTeX Companion early on, and did quite a bit of reading. Once I'd got a set of “basic” packages sorted out, things worked much more easily.
DW: How do chemists use LaTeX?
JW: Most of what chemists want to do is pretty basic, but there are a few style things that we do that are different from other people. There are some very clever packages on CTAN for chemists: probably the mhchem package is the highlight. It can turn \ce{H2 + O2 -> H2O} into a nicely formatted equation.
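As a minimal sketch of the mhchem usage mentioned above, the following document typesets a balanced version of that reaction. The `version=4` option is an assumption that a current release of the package is installed; releases from the time of the interview used a different version number.

```latex
\documentclass{article}
% version=4 assumes a recent mhchem release
\usepackage[version=4]{mhchem}
\begin{document}
% \ce subscripts the digits after element symbols and sets the arrow
Burning hydrogen in oxygen gives water:
\ce{2H2 + O2 -> 2H2O}.
\end{document}
```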
One of the most basic things to do is format a bibliography (or references section as I'd think of it) using BibTeX. The BibTeX styles for chemistry available when I started using LaTeX were a bit out of date and buggy, so one of the first “programming” things I did was to write my own improved versions.
Perhaps the most obvious thing about chemistry documents is that we like to draw the structures of molecules. There are some very clever packages on CTAN to do that from code, but like most chemists I think that the commercial ChemDraw package is the way to do things. That gives me EPS files for use with LaTeX, but there are then two important issues. First, we call these pictures “schemes”, so a float type for them is a good idea. It turns out that getting new floats to look exactly like kernel ones is not quite as simple as you might imagine. I ended up writing chemscheme partly to do exactly that.
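One way to get a "scheme" float, sketched here with the general-purpose float package rather than chemscheme itself, is shown below. This is not how chemscheme is implemented, and, as noted above, floats made this way may not match the kernel ones exactly. The label name and the boxed placeholder are illustrative only.

```latex
\documentclass{article}
\usepackage{float}
% Define a new float type "scheme" with its own list file (.los)
\newfloat{scheme}{htbp}{los}
\floatname{scheme}{Scheme}
\begin{document}
\begin{scheme}
  \centering
  % Placeholder for a structure drawing exported from a drawing program
  \fbox{(structure drawing goes here)}
  \caption{Preparation of the target compound.}
  \label{sch:prep}
\end{scheme}
See Scheme~\ref{sch:prep}.
\end{document}
```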
The second graphical thing we do is give everything a reference number, then include that in the text. Obviously, you can soon have lots of numbers, so some automation can be handy. The existing bpchem and chemcompounds packages provided a way to track numbers in the text (in the same way as the \label mechanism), but that didn't help inside graphics. So the other thing chemscheme does is wrap up some psfrag code in some sugar-coating to make that easy inside schemes: a temporary “marker” is replaced by the proper number during a LaTeX run. It's not complex, but when you write lots of schemes it saves a lot of time.
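The marker idea can be sketched with plain psfrag, without the chemscheme wrapper. Everything named here is hypothetical: the file `scheme1.eps` is assumed to contain the literal text `TMP1` and `TMP2`, and the `\newcompound` and `\cmpd` macros are illustrative helpers, not the interface of chemscheme, bpchem or chemcompounds. psfrag needs the latex and dvips route rather than pdfLaTeX, and cross-references need two runs.

```latex
\documentclass{article}
\usepackage{graphicx}
\usepackage{psfrag}
% Hypothetical helpers: number compounds with the \label mechanism
\newcounter{compound}
\newcommand{\newcompound}[1]{\refstepcounter{compound}\label{cmpd:#1}}
\newcommand{\cmpd}[1]{\textbf{\ref{cmpd:#1}}}
\begin{document}
\newcompound{acid}\newcompound{ester}
Acid \cmpd{acid} is converted to ester \cmpd{ester}.
\begin{figure}
  \centering
  % Replace the temporary markers in the EPS file by the real numbers
  \psfrag{TMP1}{\cmpd{acid}}
  \psfrag{TMP2}{\cmpd{ester}}
  \includegraphics{scheme1}% hypothetical EPS file
  \caption{Esterification.}
\end{figure}
\end{document}
```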
DW: It's quite a step from using LaTeX to developing packages. Most people who see a gap don't do that. What motivated you to get so deeply involved with LaTeX?
JW: Mainly people asking me questions, and agreeing to take things on! I started reading comp.text.tex soon after I got going with LaTeX, and some other chemists there have contacted me over the years. As I said, I started off programming in TeX by doing some BibTeX work to fix bugs in existing styles. I also put a few very basic chemistry-related shortcuts on CTAN as the first edition of my rsc package. That led to a question from someone about including schemes in LaTeX, and doing the auto-numbering trick. So I wrote up some code I'd hacked together as chemscheme, and sent it to CTAN.
The first “bigger” problem I took on developed into notes2bib. Again, this is chemistry-related: we tend to put references and notes together in a single numbered list. Doing that automatically wasn't possible, and someone asked on c.t.t. about it. I'd got some ideas, so (foolishly) said I'd have a go. My biggest package to date, siunitx, started in much the same way. There was a question on c.t.t. about a minor bug in SIunits; one thing led to another, and I ended up volunteering to do a complete re-write.
At each stage, I've learnt a lot from the programming to tackle the problem I've taken on.
DW: I'm immediately going to try using notes2bib. I have written for a technical journal that combines notes and references in a single number list, and I used this format for a book I self-published (using some clunky LaTeX machinations to make it happen).
I see from www.ctan.org/tex-archive/macros/latex/contrib/siunitx/siunitx.pdf that the siunitx package is a set of tools to help authors to typeset numbers and units in a consistent way, that is, to deal with the International System of Units. How did your re-write proceed? It also occurs to me to ask if you are familiar with the programming language Frink which I was reading about recently in connection with the MIThenge calculations.
JW: siunitx very much builds on the ideas from earlier units packages. As I said, I started the project when a bug came up in SIunits. It turned out that the original maintainer didn't have time to look after it any more, so I stepped in. I then asked on c.t.t. about “what might be useful”. I got a long list, and had already decided to integrate SIunits and SIstyle into a new package. I took a lot of stuff (for example, the font detection) from SIstyle: there is ironically less from SIunits. A lot of the number processing code is adapted from numprint, but as with most of the code I've added a lot of flexibility. What is unique in siunitx is the “unit process” code, which turns symbolic units (“\joule\per\mole\per\kelvin”) into typeset symbols (“J\,mol$^{-1}$\,K$^{-1}$”), with options on how it looks.
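A short sketch of the unit processing, using a symbolic unit for joules per mole per kelvin. It assumes siunitx version 2 or later, where the `per-mode` option controls how `\per` is printed; the version 1 option names discussed at the time of the interview were different. The comments describe the expected style of output rather than exact spacing.

```latex
\documentclass{article}
\usepackage{siunitx}
\begin{document}
% Default settings: \per gives reciprocal powers
\si{\joule\per\mole\per\kelvin}

% Same symbolic input, different appearance: \per printed with a slash
\sisetup{per-mode=symbol}
\si{\joule\per\mole\per\kelvin}

% A number and a unit together (the molar gas constant)
\SI{8.314}{\joule\per\mole\per\kelvin}
\end{document}
```

The point of the symbolic form is that the input stays the same while the appearance is changed in one place with `\sisetup`.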
At the moment, I'm working on siunitx version 2. That is basically all new code: I've learnt a lot of LaTeX programming since the first version, and there is lots to improve. I also need more flexibility in the low level code, as there are lots of new options I've been asked to add.
I have to admit that I've not heard of Frink before: I'll certainly take a look.
DW: How much of a computer programmer were you before you dug deeply into LaTeX packages, etc.?
JW: Not really much at all. I've dabbled with various things in the past, but never really beyond the stage of reading a “Learn to program …” book. On the other hand, I've always been good with computers, and willing to try stuff out, and have read the programming section of a PC magazine (PC Plus) for a long time. So I had a feeling for some concepts, I guess.
DW: You had two papers published in the most recent issue of TUGboat (30:1, 2009, pp. 107–122): one on LaTeX3 and one on implementing key-value input. I presume you became involved with key-value inputs as a way to make your packages easier to use; is this correct?
JW: Yes, very much so. I got started with key-value stuff quite early on, but understanding what is going on can be a challenge. I tend to think that the programmer should make things easy for the end user, and the number of options I needed for siunitx (even early on) meant that I really needed key-value input.
DW: You wrote this tutorial introduction to implementing key-value input with a co-author. What motivated you to write this tutorial, and what was Christian Feuersänger's involvement?
JW: siunitx currently uses xkeyval, which has lots of clever ideas but a very complex interface. It had taken me a long time to get to grips with it, and to understand what I needed to get everything working. So after I released v1.0 of siunitx, I wanted to put some guidance down on paper for other people. To make sure I covered most of the available solutions, I asked on c.t.t. if I'd missed anything (at that stage, I'd covered keyval, xkeyval, kvoptions and kvsetkeys). Someone (Ulrike Fischer, I think) pointed me to the pgfkeys package. As I wasn't familiar with it then, I contacted Till Tantau, and he pointed me to Christian.
The division in the paper is basically that I wrote most of the keyval-based stuff, and Christian wrote about pgfkeys. Of course, we both read the whole thing over and made suggestions. It was very good of Christian to get involved: after all, it was basically my idea to do the paper and he just got “volunteered”.
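For readers who have not met key-value input, here is a minimal sketch using the basic keyval package, the simplest of the packages named above. The `demo` family, its keys and the `\demosetup` and `\demoshow` commands are made up for illustration and are not taken from the TUGboat paper or from siunitx.

```latex
\documentclass{article}
\usepackage{keyval}
\makeatletter
% Storage for the current settings, with initial values
\newcommand*{\demo@colour}{black}
\newcommand*{\demo@size}{normal}
% \define@key{family}{key}[default]{code}; #1 is the value given
\define@key{demo}{colour}{\renewcommand*{\demo@colour}{#1}}
\define@key{demo}{size}[large]{\renewcommand*{\demo@size}{#1}}
% User-level commands wrapping \setkeys
\newcommand{\demosetup}[1]{\setkeys{demo}{#1}}
\newcommand{\demoshow}{colour=\demo@colour, size=\demo@size}
\makeatother
\begin{document}
% "size" is given without a value, so its default "large" is used
\demosetup{colour=red, size}
\demoshow
\end{document}
```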
DW: For the past year, you've also been writing a blog on TeX developments, texdev.net, which regularly contains lots of useful information. It seems you have an urge to document things. How do you come by this urge, what are you trying to do with the blog, and why a blog rather than just writing papers or answering queries on comp.text.tex?
JW: The background to the blog is that I'm on the committee of the UK TeX Users' Group (UK-TUG). Our website had some issues last year, and I ended up as webmaster. So that I had a “testbed” for things, I decided to set up my own site using the same setup as UK-TUG. So the original idea was actually more about supporting something else entirely! At the same time, giving users an insight into what is happening with software is something that developers both big and small are increasingly doing. So I thought it might be a good idea to explain what I'm doing with things like siunitx and achemso (a bundle for submissions to the American Chemical Society).
I do try to do a good job on documentation. Most new TeX users find that there is a lot to learn, and it's not always easy to find what you need. Usenet or forum posts tend to be quite focussed on the “matter in hand”, or wander so far that most people can't follow the thread. In my blog, I can take one idea and try to develop it systematically. Of course, that doesn't mean that things like TUGboat articles are off my “to do” list (I've got some ideas in mind at the moment). I'd not regard the different methods of writing as mutually exclusive.
One thing I can do on the blog is pick “highlights” from wherever I see them. So I'm on most of the TeX-related mailing lists, read c.t.t. and The LaTeX Community forums, plus I get quite a lot of TeX-related email. A lot of the posts are based on something I've seen elsewhere, and want to talk about or simply highlight.
DW: The LaTeX3 team appears to have noticed your capability for getting things written and had you join their team. Please say a bit more about your involvement with the LaTeX3 team and writing the LaTeX3 overview article in the aforementioned issue of TUGboat. Also, I'd appreciate your insight on the motivation of the LaTeX3 team to keep pushing in the direction of a replacement for LaTeX2e after so many years and whether a production system will in fact result at some point, or is the LaTeX3 activity mostly about experimenting with possibilities.
JW: I started looking at LaTeX3 seriously after Will Robertson dropped me a line to see what I thought of it (he'd joined the team not long before). It took a while to understand things (the syntax is very different from LaTeX2e or plain TeX). After a bit of work understanding the ideas, I wrote an experimental package using the “expl3” code (the base programming layer of LaTeX3) available at the time. I felt that there was a need for a “lead in” article to provide an entry route for new LaTeX3 programmers, so I sat down and wrote one.
That coincided with a refactor of the expl3 code. The idea of the refactor was to get to a point where expl3 could be regarded as stable, and so use it for real work rather than just testing out ideas.
After I'd submitted the article to TUGboat, Frank Mittelbach asked me to join the LaTeX3 team: I guess I had been asking questions about the right things. The refactor was still going on, so I made some edits to the article as the concepts themselves changed. However, the bulk of the article is still what I wrote before I joined the team. It's still an outsider's view, really, because almost all of the expl3 ideas are not things I've actually worked on. The things I have done for the team are almost all not covered by the introduction article. So I'm now working on some new articles which do cover things I've been involved in coding.
There is a real need to think beyond LaTeX2e. My focus tends to be on end users, but there are also serious issues for publishers with LaTeX2e. A simple example is that almost every document needs to include quite a number of support packages (fontenc, geometry, font packages, …) to get even the basics done. If LaTeX is to continue to attract new users, there is a need to make things easier. Once you start thinking about moving beyond LaTeX2e, then you can't ignore the more fundamental questions about the way the kernel currently works and how you separate design and content out in a more consistent way.
A lot of the “historical” LaTeX3 stuff has been about experimentation. That partly reflects the needs and availability of members of the team over the years. We're now looking to actually move code to a usable state. The expl3 base layer is now stable (things will be added, but not taken away or altered without very good reason). Current work on the next layer up (the xparse and template packages) is going well, and I'd hope we'll soon be able to move on to other issues.
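To give a flavour of the two layers, here is a small sketch with an internal function written in expl3 and a user command declared with xparse. It uses function names from current expl3 releases, which may differ from the code as it stood at the time of the interview, and the `demo` names and the `\greet` command are purely illustrative.

```latex
\documentclass{article}
% On recent LaTeX kernels both are built in; loading them is harmless
\usepackage{expl3,xparse}
\ExplSyntaxOn
% Inside \ExplSyntaxOn spaces are ignored and ~ gives a space
\tl_new:N \l_demo_name_tl
\cs_new_protected:Npn \demo_greet:n #1
  {
    \tl_set:Nn \l_demo_name_tl {#1}
    Hello, ~ \tl_use:N \l_demo_name_tl !
  }
% Document-level interface: one optional argument, default "world"
\NewDocumentCommand \greet { O{world} }
  { \demo_greet:n {#1} }
\ExplSyntaxOff
\begin{document}
\greet\par
\greet[TeX]
\end{document}
```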
Part of the reason we're making some progress is that there is a group of “active” workers on the LaTeX3 code (Frank Mittelbach, Will Robertson, Morten Høgholm and me). Code gets written when people have the time and energy (we all have work to do), and so you need an active group to keep the motivation going. Of course, other members of the team are providing useful input (there is a very wide range of experience and ideas).
I'm very focussed on things that are going to be useful. So I see the ongoing LaTeX3 work as progress toward a successor to LaTeX2e. Of course, quite what it will look like is not certain, but to get there we need to do the basics. Once that is done, we can start to talk about what LaTeX3 should deliver for the user. I'd hope that means something which looks like LaTeX2e (or rather can use that syntax amongst others), but deals with fonts, input, page layout and so on in a much clearer and more flexible way.
DW: You mention you are studying Italian. Any chance you'll dive into improving LaTeX support for Italian?
JW: You never know, but at the moment I'm more likely to leave it to people working on things like polyglossia. With my LaTeX team “hat” on, multi-lingual support is obviously on the list for user level LaTeX3. At the moment, polyglossia is showing the way forward, at least with XeTeX. babel is a complex beast, as many people know, and I'd be afraid to go anywhere near it without a lot of thought!
DW: It has really been fun talking to a relative newcomer to the TeX world who has brought fresh insight and energy to the community.
JW: Not a problem: thanks for the opportunity.