Spreadsheets as a data source for LaTeX

The problem

While organising ICANN 2016, I produced the PDF version of the conference programme, using LaTeX, of course 🙂

The programme lists all the ~170 talks and posters, with their authors, divided into the different conference sessions. All these data were prepared by the other organisers in a spreadsheet (an xlsx Excel file, for the record). In previous editions of the conference, the approach had been to copy and paste all these details into the LaTeX file, hoping not to make too many mistakes. One problem with this is that I hate copy-pasting 🙂 It is tedious, and every time I do it, there is a (small) chance that I will make a mistake. If I have to copy-paste more than 400 times, the chance of at least one mistake grows considerably. Also, the Excel table sometimes has to change along the way, and with the copy-paste approach we must always remember to propagate the changes to the LaTeX file. These thoughts led me to the question:

Can we use a spreadsheet as a data source for a LaTeX file, automatically inserting the data from the relevant cells at the appropriate places in the LaTeX code?

This would solve both the issues above and automate our work! Surprisingly, I found very little help online: just a Perl script on the LaTeX Stack Exchange. I wanted something more flexible, so I decided to write my own script (in Python) to take care of the task. It works with Excel files (both xls and xlsx).

The code

Repo: https://github.com/pmasulli/spreadsheet_to_LaTeX

I decided to use xlrd to read the spreadsheet. It seems quite solid, copes well with cells containing the results of formulae, and its recent versions read both xls and xlsx.

The code linked above is a very preliminary version, put together in an hour or so. It should definitely be improved (among other things, I would like to support LibreOffice spreadsheets as well!).
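
To give an idea of what reading a cell involves: xlrd addresses cells by zero-based row and column indices, whereas a spreadsheet user thinks in coordinates such as A2 or B3. The snippet below is a minimal sketch of that conversion, not the code from the repository; the function name cell_to_indices is made up for this illustration, and the xlrd calls are shown only as comments because they need the library and an actual file.

```python
import re


def cell_to_indices(ref):
    """Convert a coordinate such as 'B3' into zero-based (row, column) indices."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not a valid cell coordinate: {ref!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters form a base-26 number with digits A=1 ... Z=26.
        col = col * 26 + (ord(ch) - ord("A") + 1)
    # Spreadsheet rows and columns start at 1; xlrd indices start at 0.
    return int(digits) - 1, col - 1


# With xlrd, the indices would then be used roughly like this:
#   book = xlrd.open_workbook("data.xlsx")
#   sheet = book.sheet_by_name("SheetName")
#   value = sheet.cell_value(*cell_to_indices("A2"))
```

For example, cell_to_indices("A2") gives (1, 0), that is, the second row of the first column.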

Usage

Assume that your data are in the file data.xlsx and that you want to include them in document.tex. Then prepare a LaTeX template where, instead of the data, you include tags of the form:

where SheetName is the name of the sheet in the Excel file and A2 and B3 are the coordinates of cells. Note that you can specify a default sheet name in the Python script at line 8 and then just use tags of the form:

Example

data.xlsx
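
To make the substitution step concrete, here is a minimal sketch of the idea in Python. It is not the code from the repository: the tag syntax <<SheetName!A2>> (and <<A2>> for the default sheet) is invented for this example, the function name fill_template is hypothetical, and a plain dictionary with made-up values stands in for the spreadsheet so that the snippet runs without an Excel file.

```python
import re

# Hypothetical tag syntax, invented for this sketch: <<SheetName!A2>> or <<A2>>.
TAG = re.compile(r"<<(?:([^!<>]+)!)?([A-Za-z]+[0-9]+)>>")


def fill_template(template, cells, default_sheet):
    """Replace every tag in `template` with the matching value from `cells`.

    `cells` maps (sheet name, coordinate) pairs to values; it stands in
    for the spreadsheet in this illustration.
    """
    def lookup(match):
        # Fall back to the default sheet when the tag names no sheet.
        sheet = match.group(1) or default_sheet
        coord = match.group(2).upper()
        return str(cells[(sheet, coord)])

    # A function replacement is inserted literally, so backslashes in the
    # cell values are not interpreted by the regex engine.
    return TAG.sub(lookup, template)


# Made-up sample data, standing in for two cells of a sheet called "Talks".
cells = {("Talks", "A2"): "A. Author", ("Talks", "B3"): "A title"}
template = r"\textbf{<<Talks!A2>>}: \emph{<<B3>>}"
result = fill_template(template, cells, default_sheet="Talks")
print(result)
```

In this sketch, a tag that points to a cell missing from the dictionary raises a KeyError, which seems preferable to silently producing a programme with holes in it.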

Once you have the template, open a terminal window (how this is done depends on your operating system — google is your friend!) and then move to the directory where you saved the script:

Afterwards, just run the script to generate the LaTeX file including the data from the spreadsheet:

This will generate the LaTeX file document_template.txt.tex, which you can just feed to pdflatex:

The resulting pdf file