Writing LaTeX with Python

Let me jump right into it and give you a (working) template:

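import os
import subprocess

# Raw strings keep the LaTeX backslashes exactly as typed.
header = r'''\documentclass{article}
\begin{document}
'''

footer = r'''\end{document}'''

# Double quotes, so that the apostrophe does not end the string.
# '#' is a special character in LaTeX, hence the \# escape.
main = r"I'm writing \#LaTeX with Python!"

content = header + main + footer

with open('myfile.tex', 'w') as f:
    f.write(content)

# Run pdflatex and wait for it to finish.
commandLine = subprocess.Popen(['pdflatex', 'myfile.tex'])
commandLine.communicate()

# Remove the intermediate files; only myfile.pdf is kept.
os.unlink('myfile.aux')
os.unlink('myfile.log')
os.unlink('myfile.tex')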

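A few important points:

  • r''' opens a raw, triple-quoted string. The r prefix stops Python from interpreting backslashes as escape sequences, and the triple quotes let the string span several lines, so the line breaks necessary for $\LaTeX$ are preserved. Regular (non-raw) strings need more care, as we shall see.

  • Popen starts pdflatex, and the communicate() method of the resulting object waits for it to finish. Since the output is not redirected, pdflatex writes straight to the terminal, i.e. you can see what’s going wrong with your $\LaTeX$ formatting.

  • The intermediate files (everything except, of course, the PDF!) are removed with os.unlink().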
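The compile-and-clean-up steps above can be wrapped into two small helpers. This is only a sketch: the names compile_tex and clean_up are made up for illustration, and it assumes that pdflatex is installed and on your PATH. It uses subprocess.run with check=True, which raises an exception when pdflatex exits with an error, and the -interaction=nonstopmode option so that pdflatex does not stop and wait for input when it hits a problem.

```python
import os
import subprocess


def compile_tex(tex_file):
    """Run pdflatex on tex_file; raise CalledProcessError if it fails."""
    # Assumption: pdflatex is installed and available on the PATH.
    subprocess.run(
        ['pdflatex', '-interaction=nonstopmode', tex_file],
        check=True,
    )


def clean_up(basename, extensions=('.aux', '.log', '.tex')):
    """Delete basename + ext for each extension; return the removed paths."""
    removed = []
    for ext in extensions:
        path = basename + ext
        if os.path.exists(path):  # skip files that were never created
            os.unlink(path)
            removed.append(path)
    return removed
```

With these, the end of the template would become compile_tex('myfile.tex') followed by clean_up('myfile'). Checking that each file exists before deleting it means a failed compilation does not trigger a second, unrelated error.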

The question now is, why the hell would you go to all this trouble for a $\LaTeX$ file? Automation is the answer.

In fact, I came up with the following script when I had to produce many (really many) tables, nicely formatted in $\LaTeX$, from data files. I won’t go into the full details of the script, but it illustrates a few key aspects of generating a correct .tex file. Most importantly, you have to “escape the escape codes”: in a regular (non-raw) string, Python treats sequences such as \b, \t or \n as special characters (the Python documentation lists them all), so to type a \begin{ you need to double the backslash and write '\\begin{'.
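A quick way to convince yourself of this is to compare the different spellings in the interpreter. The snippet below is a small illustration (the variable names are mine); it relies only on Python's standard escape sequences, where \b is a backspace and \t is a tab.

```python
# A raw string and a regular string with doubled backslashes are identical.
raw = r'\begin{tabular}'
doubled = '\\begin{tabular}'
same = (raw == doubled)

# In a regular string, "\b" becomes a backspace and "\t" a tab,
# so these LaTeX commands are silently corrupted.
broken_begin = '\begin{tabular}'
broken_textbf = '\textbf{x}'

print(same)                             # True
print(repr(broken_begin))               # '\x08egin{tabular}'
print(len(r'\textbf'), len('\textbf'))  # 7 6
```

One caveat with raw strings: they cannot end in a single backslash (r'\' is a syntax error), which is one reason to mix raw and regular strings. With that in mind, here is the script: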

import glob
import os
import subprocess

# Map each selection name to {sample: [value, error]}.
dictOfSels = {}
for fname in glob.glob("*.txt"):
    with open(fname) as f:
        next(f)  # skip the header line
        sel = fname.split("_yield")[0]
        dictOfSels[sel] = {}
        for line in f:
            fields = line.rstrip().split(",")
            dictOfSels[sel][fields[0]] = [float(fields[1]), float(fields[2])]

header = r'''\documentclass{article}
\begin{document}
\begin{center}
'''
footer = r'''\end{center}
\end{document}
'''

main = ''

for sel in sorted(dictOfSels.keys()):
    # Regular string first, so every backslash is doubled; raw string after.
    main = main + '\\begin{tabular}{|c|c|}\\hline\\multicolumn{2}{|c|}{\\textbf{' + sel + r'''}}\\ \hline
'''
    # list() is needed in Python 3, where keys() returns a view, not a list.
    sortedSamples = list(dictOfSels[sel].keys())
    # Move the second sample towards the end of the list.
    if "data" in sortedSamples:
        sortedSamples.insert(len(sortedSamples)-2, sortedSamples.pop(1))
    else:
        sortedSamples.insert(len(sortedSamples)-1, sortedSamples.pop(1))
    for sample in sortedSamples:
        # Separate the MC and data rows from the rest with a rule.
        if sample == 'MC' or sample == 'data':
            main = main + '\\hline '
        main = main + sample + ' & $' + "{0:.2f}".format(dictOfSels[sel][sample][0]) + '\\pm ' + "{0:.2f}".format(dictOfSels[sel][sample][1]) + r'''$ \\
'''

    main = main + r'''\hline\end{tabular}
\newline
\vspace*{1cm}
\newline
'''

content = header + main + footer

with open('yields.tex', 'w') as f:
    f.write(content)

# Run pdflatex and wait for it to finish.
commandLine = subprocess.Popen(['pdflatex', 'yields.tex'])
commandLine.communicate()

# Remove the intermediate files; only yields.pdf is kept.
os.unlink('yields.tex')
os.unlink('yields.log')
os.unlink('yields.aux')

Running this script in a directory containing the appropriate data files produces a single PDF file. Each data file starts with a header line, followed by three comma-separated columns: the name of a particular Monte Carlo sample, its value, and the associated error. The PDF contains as many tables as there are data files. Each table is labeled with the name of its data file (everything before “_yield”) and lists (a reordered version of) the samples with their respective values, to two decimal places.
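To make the file format and the row formatting concrete, here is a stripped-down sketch of those two steps on their own. The helper names parse_yields and format_row, the header text and the sample values are all invented for the example; only the three-column layout and the two-decimal formatting come from the script above.

```python
def parse_yields(lines):
    """Turn the lines of one data file into {sample: (value, error)}."""
    yields = {}
    for line in lines[1:]:  # the first line is a header
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        sample, value, error = line.split(',')
        yields[sample] = (float(value), float(error))
    return yields


def format_row(sample, value, error):
    """Format one table row with two decimal places."""
    return r'{0} & ${1:.2f} \pm {2:.2f}$ \\'.format(sample, value, error)


# Made-up example data, not taken from a real analysis.
example = [
    'sample,yield,error',
    'ttbar,12.3456,1.2',
    'MC,20.5,2.345',
]
yields = parse_yields(example)
row = format_row('ttbar', *yields['ttbar'])
print(row)  # ttbar & $12.35 \pm 1.20$ \\
```

Building each row from a single raw format string keeps the backslashes readable, at the cost of moving the numbers out of the line where the $\LaTeX$ is written.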

Neat, isn’t it?

[Image: result]
