Spreadsheets as a data source for LaTeX

The problem

While organising ICANN 2016, I produced the PDF version of the conference programme, using LaTeX, of course 🙂

The programme lists all the ~170 talks and posters, with their authors, grouped by conference session. The other organisers had prepared all these data in a spreadsheet (an xlsx Excel file, for the record), and in previous editions of the conference they had simply copy-pasted every detail into the LaTeX file, hoping not to make too many mistakes. One problem with this is that I hate copy-pasting 🙂 It is tedious, and every time I do it there is a (small) chance that I will make a mistake. If I have to copy-paste more than 400 times, the chance of at least one mistake grows considerably. Also, the Excel table sometimes has to change along the way, and with the copy-paste approach we must always remember to propagate each change to the LaTeX file. These thoughts led me to the question:

Can we use a spreadsheet as a data source for a LaTeX file, automatically inserting the data from the relevant cells at the appropriate places in the LaTeX code?

This would solve both issues above and automate our work! Surprisingly, I found very little help online, only a Perl script on the LaTeX Stack Exchange. I wanted something more flexible, so I decided to write my own script (in Python) to take care of the task. It works with Excel files (both xls and xlsx).

The code

Repo: https://bitbucket.org/pmasulli/spreadsheet_to_latex

I decided to use xlrd to read the spreadsheet. It seems quite solid, copes well with cells containing the results of formulae, and its recent versions read both xls and xlsx.
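As a rough illustration of what reading a cell with xlrd involves (a sketch, not the code from the repository): xlrd addresses cells by zero-based row and column indices, so a spreadsheet coordinate such as B3 has to be converted first. The helper names cell_to_indices and read_cell are made up for this example. Note also that xlrd releases from 2.0 onwards read only xls files, so an xlsx workbook needs an older xlrd release.

```python
import re


def cell_to_indices(ref):
    """Convert an A1-style reference such as 'B3' to zero-based (row, col)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not a cell reference: {ref!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters form a base-26 number without a zero digit: A=1 ... Z=26, AA=27
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


def read_cell(workbook_path, sheet_name, ref):
    """Return the value of one cell (hypothetical helper, not from the repository)."""
    import xlrd  # third-party; imported here so the converter above works without it

    book = xlrd.open_workbook(workbook_path)
    sheet = book.sheet_by_name(sheet_name)
    row, col = cell_to_indices(ref)
    return sheet.cell_value(row, col)
```

With these helpers, read_cell("data.xlsx", "SheetName", "A2") would return the content of cell A2 of the sheet called SheetName.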

The code in the repository above is a very preliminary version, put together in an hour or so. It should definitely be improved (among other things, I would like to support LibreOffice spreadsheets as well!).

Usage

Assume that your data are in the file data.xlsx and that you want to include them in document.tex. Prepare a LaTeX template where, instead of the data, you put tags that name a sheet and a cell: the sheet name as it appears in the Excel file (SheetName, say) and the coordinates of the cell (A2 or B3, say).

Note that you can specify a default sheet name in the Python script at line 8, and then use tags that give only the cell coordinates.
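To illustrate the substitution step, here is a minimal sketch in Python. It assumes a made-up tag syntax, <<SheetName!A2>> for a cell in a named sheet and <<A2>> for a cell in the default sheet; this is not necessarily the syntax that the script in the repository expects. The cell values are placeholders held in a dictionary, standing in for what would be read from the spreadsheet.

```python
import re

# Hypothetical tag syntax, chosen only for this sketch:
#   <<SheetName!A2>>  -> cell A2 of the sheet "SheetName"
#   <<A2>>            -> cell A2 of the default sheet
TAG = re.compile(r"<<(?:([^!<>]+)!)?([A-Za-z]+[0-9]+)>>")


def fill_template(template, cells, default_sheet):
    """Replace every tag in the template with the matching cell value."""

    def lookup(match):
        sheet = match.group(1) or default_sheet
        ref = match.group(2).upper()
        return str(cells[(sheet, ref)])

    # With a function as replacement, the returned text is inserted literally,
    # so backslashes in cell values are not treated as regex escapes.
    return TAG.sub(lookup, template)


# Placeholder data standing in for the spreadsheet contents.
cells = {
    ("SheetName", "A2"): "Talk title",
    ("SheetName", "B3"): "Author name",
}
template = r"\textbf{<<SheetName!A2>>} by \emph{<<B3>>}"
result = fill_template(template, cells, default_sheet="SheetName")
print(result)
```

A tag that refers to a missing sheet or cell raises a KeyError here, which seems preferable to silently leaving a hole in the programme.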

Example

data.xlsx

Once you have the template, open a terminal window (how this is done depends on your operating system; google is your friend!) and use cd to move to the directory where you saved the script.

Afterwards, just run the script to generate the LaTeX file including the data from the spreadsheet:

$ python spreadsheet_to_latex.py data.xlsx document_template.txt

This will generate the LaTeX file document_template.txt.tex, which you can just feed to pdflatex:

$ pdflatex document_template.txt.tex

The resulting pdf file

6 thoughts on “Spreadsheets as a data source for LaTeX”

  1. Hi,
    I have a question: where do I have to write “$ python spreadsheet_to_latex.py data.xlsx document_template.txt”? In Python?

    Thanks

    1. Hi Pedro, I’m answering here as well so that the information can be useful for other people, too.
      You should write that command in a terminal, after you cd to the directory where the script is located.
      Best,
      Paolo

      1. Hi Paolo,
        could you explain better what a terminal is? Where can I put the commands “$ python spreadsheet_to_latex.py data.xlsx document_template.txt” and afterwards “$ pdflatex document_template.txt.tex”?
        Thank you very much.

        1. Hi Giuseppe,

          Depending on your operating system, the terminal can have different names:
          – On Windows, it’s often called “Command Prompt”. Here you can find a brief introduction to its use: http://www.cs.princeton.edu/courses/archive/spr05/cos126/cmd-prompt.html
          – On Mac OS X, just search for Terminal among the applications. Here you have more information about the terminal: http://blog.teamtreehouse.com/introduction-to-the-mac-os-x-command-line
          – If you use GNU/Linux or another UNIX system, you have probably encountered it before; otherwise just google it 🙂

          For the two commands you mention, you should NOT type the initial dollar sign: it just represents the command prompt on many UNIX platforms.

          In case of further questions, just let me know 🙂

  2. Hi Paolo,
    thank you for your reply!
    I don’t understand from that online page how to use these commands in the Windows Command Prompt.
    However, I tried typing these commands at the prompt without the $, and this is the result:
    “python: can’t open file ‘spreadsheet_to_latex.py’: [Errno 2] No such file or directory”
    Could you explain step by step what I have to do?
    Thank you very much!!

    1. Hi Giuseppe,

      That command is meant to be given after you have moved, in the terminal, to the directory where the script is located. The command “cd” is used to change directory.
      For instance, if you placed the Python script in “C:\script”, then you would first need to give the following command in the terminal:
      cd C:\script
      followed by Return, and afterwards
      python spreadsheet_to_latex.py data.xlsx document_template.txt

      Let me know if you still have problems. You are also welcome to send me an e-mail.
      Cheers!
