LaTeX is a vital component of my work, and usually it is my go-to tool whenever I need to typeset any non-trivial document. TeX is a de facto standard tool for scientific typesetting, and in technical fields most publications, books, and reports are usually composed with it, most often using the LaTeX typesetting system.

Collaborative tools such as Overleaf allow to easily share a LaTeX codebase offering multi-user editing capability, either through an interactive web interface or through git, allowing to build as PDF and to see a live preview of the document as-you-type.

Having a fast machine at hands makes however local builds faster than remote ones, not even considering the need for an internet connection in order to use the second approach. Moreover, any power user has probably dexterity with a favourite text editor, such as Vim, Emacs, or VSCode, that is unlikely to be pleased by the features offered within a web based editor. Thankfully for power users, Overleaf offers git access, allowing to pull and push the code of a project and edit it locally. However, I like to have more control over the build and I am used to keep my code on GitHub, so I decided to arrange a setup that allows to automatically perform a continuous integration pipeline, including the production of PDF output.

CI environment

Many continuous integration services offer good integration with GitHub, such as Travis CI, Travis, AppVeyor, or CicleCI, and they also offer free builds for open-source projects. Another option is the recently introduced in-house CI service provided by GitHub, named GitHub Actions. All of them are suitable candidates for the job, and the choice of tool is mostly a matter of taste. I already mentioned GitHub Actions in a previous meta-post on the migration of the website, and in this occasion I opted for it as well.

Having a decent amount of experience with Travis and AppVeyor, that I both used in past within different projects with good satisfaction, I recently decided to further diversify my competence and get familiar with the newer CI solution provided by GitHub. The concept of a Marketplace for Actions, i.e. canned and reusable build steps, is interesting and has potential to boost code reuse withing the CI while at the same time simplifying the job for the developers, and I expect this to become more and more evident as the Marketplace grows with more (and more versatile) actions. From a more philosophical point of view, all technical aspects aside, keeping the code development workflow within a single provider and not having to rely on an additional service has its pros and cons, but I overall like simplicity, therefore I am tempted by such approach.

Setting up a LaTeX environment, or Docker to the rescue

The goal of this setup is to have automated linting of the LaTeX code and build of a PDF for each commit, and automatic creation of a PDF artifact for each release. In order to achieve that, the first obvious step is to set-up a LaTeX toolchain. Since I use TeX Live on my machines, I decided to stick to it for consistency. GitHub Actions offers Linux Ubuntu workers, so it would be possible, in principle, to write a build step to prepare the environment by installing all the required TeX Live packages from the Ubuntu repositories.

I really like containers, however, so I decided to opt for a Docker image shipping the required environment instead. This allows to use exactly the same environment both on my local machine and the CI worker, saving a lot of headaches that can arise from even slightly different environments. On top of that, it has the bonus of making cloud builds slightly faster by not requiring to download and install a whole bunch of .deb packages on the worker every time a build is performed, but rather just downloading the Docker image.

Setting up a Docker image would be fairly straightforward, but it is not even necessary, since several well written and maintained images already exist on Docker Hub. One example is adnrv/texlive, that comes in different flavours, shipping a minimal image and some larger environments, up to a full TeX Live installation. While writing an image tailored to the project with only the TeX Live components strictly required for the build is a bit more efficient, since it allows to have a smaller image, I keep a full image around on my machine since it allows to build all my projects and avoids having multiple images around. This approach has pros and cons, but overall I find it to be a reasonably convenient way to go.

Setting up the build

Since I want to run the same Dockerised build both on my machine and on the CI workers, a simple solution is to implement the build commands using a build system that can be triggered both from a local shell and from the CI workflow configuration file. Being a reasonably simple task, an easy solution is to use GNU Make, but the choice is really a matter of taste and other build systems can work just fine.

Playing around with pattern rules it is possible to write a generic build step that works for projects with multiple targets, and that can be easily re-used as it is on several projects.

$(OUTPUTS): build/%.pdf: %.tex bibliography.bib media/*
	rm -rf build
	mkdir -p build
	docker run \
		--rm \
		-v "$$(pwd):/mnt" \
		-w /mnt \
		adnrv/texlive:full \
		bash -x -o pipefail -c '\
			pdflatex $(OPTIONS) $< && \
			bibtex "${BUILD_DIR}"/$* && \
			makeglossaries -d $(BUILD_DIR) "$*" && \
			pdflatex $(OPTIONS) $< && \
			pdflatex $(OPTIONS) $< \
			'

Storing the output in a build folder is an easy way to keep the working tree clean, and a simple substitution allows to easily create valid Make patterns out of a list of filenames, listed without extension for convenience since some of the tools require to omit the extension in the commands. For instance, given a project with two documents whose main files are document1.tex and document2.tex, the following couple of Make variables generate the relevant build patterns:

FILES = document1 document2
OUTPUTS = $(patsubst %,build/%.pdf,$(FILES))

An all rule makes it easy to build the whole project:

.PHONY: all
all: $(OUTPUTS)

Another simple rule allows to run chktex to lint our LaTeX code. In this case, find is used to run it on all source files. Moreover, the exit status of chktex is zero even when warnings are emitted, so we check if there is any output from it in order to decide whether the build should be marked as failed.

.PHONY: chktex
chktex:
	find . -maxdepth 1 -type f -name '*.tex' -exec \
		docker run \
			--rm \
			-v "$$(pwd):/mnt" \
			-w /mnt \
			adnrv/texlive:full \
			bash -c "chktex {} | tee build/chktex && [ $$(cat build/chktex | wc -l) -le 0 ]" \;

Last, we can write a rule to clean up the working tree.

.PHONY: clean
clean:
	rm -rf ${BUILD_DIR}

With such a Makefile in place, it is now possible to build the PDF or to lint the code by running in a shell the following couple of commands respectively:

make all
make chktex

Implementing the CI workflow

Once the build system is ready, implementing a CI workflow is really straightforward, since all the more involved bits are encapsulated in the Makefile. We are going to have two separate workflows, one to run build and checks on each commit push, and one to build and upload the artifacts on each release.

The build workflow has two jobs, one for linting and one for building. We can use a couple of Actions to check out the sources and upload the build artifacts. This will produce a zip archive with the PDF and the logs, that can be easily accessed from the web interface and whose retention time is 90 days by default.

name: Build

on: push

jobs:
  chktex:
    runs-on: ubuntu-latest
    steps:
    - name: ChkTeX
      uses: actions/checkout@v1
    - name: Build
      run: |
        make chktex

  pdflatex:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout
      uses: actions/checkout@v1
    - name: Build
      run: |
        make all
    - name: Upload artifacts
      uses: actions/upload-artifact@v1
      with:
        name: build-artifacts
        path: build

The release workflow is equally easy. The upload-to-release Action allows to add the PDF to the release artifacts. Note that in the checkout action it not necessary to explicitly set a ref for the checkout: the release event will set the $GITHUB_REF variable, used by the action to determine what to check out, to the tag of the release triggering the workflow.

on:
  release:
    types: [published]

name: Upload PDF on release

jobs:
  build:
    name: Build and upload PDF
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@master
      - name: Build
        run: |
          make all
      - name: Upload Release Asset
        uses: JasonEtco/upload-to-release@d648f1babf77
        with:
          args: ./build/report.pdf application/pdf
        env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Source code

An example of this workflow is publicly available on GitHub. Enjoy!