Running Jekyll with Docker and OpenShift

This post appeared originally in our sysadvent series and has been moved here following the discontinuation of the sysadvent microsite.

OpenShift is currently en vogue in the company. The ease of use and scalability found in a container-based system allow us to automate the build and deployment steps of containers through software like Kubernetes/OpenShift.

Jekyll

We have visited Jekyll in several previous blog posts. Our techblog and the SysAdvent calendar (decommissioned as of 2023) use Jekyll to produce a static site from markdown content. The content and templates are stored in Git repositories on an internal GitLab server.

Again, it should be noted that GitHub Pages not only supports Jekyll, but also lets you configure a custom domain. So why not just use GitHub Pages? The short answer is SEO, and because we can. So there it is.

GitLab CI and S3

In previous incarnations, we used the GitLab CI integration tools to build Jekyll and subsequently populate a website bucket in our Ceph storage cluster. The website bucket then acted as a distributed backend for the Varnish frontend cache layer. This is a simple, fault-tolerant setup that scales well.

In this post we’ll show how to build a Docker container image that includes a small web server and the document tree structure. This allows us to tune the web server with better cache-headers and run server-side processes like mail-submission scripts etc. Additionally, the statelessness of a Jekyll-generated site plays to the strengths of Docker containers. Need to scale? Add more containers. Wrote a new blog post? Throw away the old container(s). Finally, the container can run in a Kubernetes/OpenShift setting for high availability and scaling purposes.

Docker Container

The specifications for building a Docker image are given in a text document called Dockerfile. Giving a complete Docker introduction is out of scope for this blog entry, but in our case the build steps would be to:

define the container starting point using the “FROM” instruction. Usually a RHEL or Debian derivative. An interesting alternative here is Alpine Linux. Take a look at the Everyday Docker blog entry for an example of using Alpine Linux.
list the install and subsequent build steps needed to get a Jekyll installation up and running through the “RUN” instructions
include “ADD” instructions for copying files or directories into the docker image.
add generic instructions such as “LABEL”, “EXPOSE” and “CMD”. Remember that the “containerized” process should run in the foreground.

A rudimentary Dockerfile for building a Jekyll based sysadvent docker image would look like this:

FROM ubuntu:latest
ENV NAME=sysadvent

# Add multiverse repo. Install nginx, jekyll and some ruby-packages
RUN \
  sed -i 's/# \(.*multiverse$\)/\1/g' /etc/apt/sources.list && \
  apt-get update && \
  apt-get -y upgrade && \
  apt-get install -y nginx jekyll ruby-bundler ruby-jekyll-feed ruby-jekyll-paginate && \
  rm -rf /var/lib/apt/lists/* && \
  rm -rf /var/www/html/* && \
  sed -i -e 's/listen 80 default/listen 8080 default/' \
         -e 's/listen \[::\]:80 default/listen \[::\]:8080 default/' /etc/nginx/sites-available/default

WORKDIR /build
ADD . /build

# Run Jekyll. Put result in nginx default document root
RUN \
  bundle install --path=vendor && \
  bundle exec jekyll build --destination /var/www/html/sysadvent && \
  apt-get -y remove ruby-dev build-essential && \
  apt -y autoremove && \
  apt clean

# Start nginx in the foreground
EXPOSE 8080
CMD ["nginx", "-g", "daemon off;"]

The above Dockerfile will start with the latest Ubuntu LTS image, install the required tools, build the site and finally kick off a web server. This Dockerfile presumes a preexisting Jekyll site complete with a Gemfile.
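The two `sed` expressions that move nginx from port 80 to 8080 can be tried on a scratch file outside Docker. This is only a minimal sketch: the two sample `listen` lines are an assumption about what the default nginx site file on Ubuntu contains, and the result is printed instead of edited in place.

```shell
# Scratch file standing in for /etc/nginx/sites-available/default (assumed content).
conf=$(mktemp)
cat > "$conf" <<'EOF'
listen 80 default_server;
listen [::]:80 default_server;
EOF

# Same expressions as in the Dockerfile, writing to stdout instead of using -i.
rewritten=$(sed -e 's/listen 80 default/listen 8080 default/' \
                -e 's/listen \[::\]:80 default/listen \[::\]:8080 default/' "$conf")
printf '%s\n' "$rewritten"
rm -f "$conf"
```

Both the IPv4 and the IPv6 `listen` directives should now point at port 8080, which an unprivileged user is allowed to bind to.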

The RUN commands are chained to minimize the number of layers in the resulting docker image. The newer overlayFS takes a far smaller performance hit from many layers than the older AUFS, thanks to a smarter internal representation of lower layers and caching of file lookups. I still think it is good practice to minimize the number of layers in an image, as long as it does not seriously hurt the readability of the Dockerfile.
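If you are curious how many layers an image actually ends up with, docker can tell you. The helper names below are made up for this post; they only wrap `docker image inspect` and `docker history`.

```shell
# Print how many filesystem layers an image consists of.
# Usage: count_layers IMAGE
count_layers() {
  docker image inspect --format '{{len .RootFS.Layers}}' "$1"
}

# List the size of each layer next to the instruction that created it.
# Usage: show_layers IMAGE
show_layers() {
  docker history --format '{{.Size}} {{.CreatedBy}}' "$1"
}
```

Once an image has been built and tagged, for example as sysadvent, `count_layers sysadvent` and `show_layers sysadvent` give a quick view of what each RUN and ADD instruction contributed.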

You can build and run a docker image based on the above Dockerfile with ‘docker build’ and ‘docker run’. The following commands build an image called sysadvent and start a container from it, mapping the container's port 8080 to port 4000 on the host:

docker build -t sysadvent .
docker run -d -p 4000:8080 sysadvent
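Since the container starts in the background, it can take a moment before nginx answers. A small polling helper is one way to check that the site is up. This is a sketch: the function name is invented here, and it assumes the site lives under /sysadvent/ because that is where the Dockerfile tells Jekyll to put it.

```shell
# Poll a URL until it answers with a successful HTTP status, or give up.
# Usage: wait_for_url URL [TRIES]
wait_for_url() {
  url=$1
  tries=${2:-10}
  while [ "$tries" -gt 0 ]; do
    # --fail makes curl return non-zero on HTTP errors such as 404.
    if curl --silent --fail --output /dev/null "$url"; then
      echo "ok: $url"
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  echo "no answer from $url" >&2
  return 1
}
```

After `docker run`, calling `wait_for_url http://localhost:4000/sysadvent/` should report ok within a few seconds if the build and the port mapping worked.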

As an aside, the resulting docker image can probably be slimmed down, e.g. by removing redundant apt packages. A smaller image launches faster and consumes fewer resources. A good starting point would be a smaller base image, such as the Alpine Linux image mentioned above.

OpenShift

OpenShift is a container platform that builds on Docker for container technology and on Kubernetes for orchestration of those containers. OpenShift solves some of the network annoyances in Kubernetes, adds security through liberal use of SELinux and adds features like source-to-image (S2I) and authentication.

We run several OpenShift installations so in this case, we can utilize existing infrastructure for our Jekyll container.

Configuration

An OpenShift pod can be configured using the OpenShift CLI, oc. This tool connects to the OpenShift API and gives you, at the time of writing, a superset of the OpenShift web GUI functionality. If the OpenShift CLI is not in your distro repository, you can download it from GitHub.

The initial steps are logging in and choosing a project. At this point, I already have a login, and the project already exists:

oc login https://openshift.redpill-linpro.com
oc project sysadvent

Give OpenShift access to the repository

In order to build the site, OpenShift needs to clone the repository from our non-public GitLab instance. For this to work over SSH, OpenShift needs a private SSH key that GitLab accepts. Since you never want to part with your personal private SSH key, we create a dedicated key pair for this operation:

ssh-keygen -f gitlab-sysadvent -N ''

Then, add the generated public key to the GitLab repository in Settings->Repository->Deploy Keys. Do not check the “Write access allowed” checkbox.
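To make it clear which half of the key pair goes where, here is the same `ssh-keygen` call run in a scratch directory, followed by a look at the result. The scratch directory and the `-C` comment label are assumptions added for this sketch; the label only makes the key easier to recognise later.

```shell
# Work in a scratch directory so that no existing key is overwritten.
keydir=$(mktemp -d)
ssh-keygen -q -f "$keydir/gitlab-sysadvent" -N '' -C 'openshift-builder'

# Two files: gitlab-sysadvent (private, for OpenShift) and
# gitlab-sysadvent.pub (public, for the GitLab deploy key form).
ls "$keydir"

# Fingerprint of the key, useful for telling deploy keys apart.
ssh-keygen -l -f "$keydir/gitlab-sysadvent.pub"

# The single line to paste into GitLab.
cat "$keydir/gitlab-sysadvent.pub"
```

Only the contents of the .pub file ever leave your machine in the GitLab direction; the private file is what gets loaded into the OpenShift secret.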

Now we add the private key to OpenShift, link the key to the builder account and finally annotate the secret with our repository:

$ oc secrets new-sshauth gitlab-access --ssh-privatekey=gitlab-sysadvent
$ oc secrets link builder gitlab-access
$ oc annotate secret/gitlab-access 'build.openshift.io/source-secret-match-uri-1=ssh://gitlab.redpill-linpro.com/rl/sysadvent.git'
secret "gitlab-access" annotated
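Depending on the version of your `oc` client, the `new-sshauth` subcommand may be deprecated or missing. As far as I know, the same kind of secret can be created with `oc create secret generic` and the `kubernetes.io/ssh-auth` type. The wrapper function below is a sketch with an invented name; check `oc create secret generic --help` on your own client before relying on it.

```shell
# Create an SSH auth secret from a private key file and let the builder
# service account use it.
# Usage: create_git_secret SECRET_NAME PRIVATE_KEY_FILE
create_git_secret() {
  oc create secret generic "$1" \
    --from-file=ssh-privatekey="$2" \
    --type=kubernetes.io/ssh-auth &&
  oc secrets link builder "$1"
}
```

With the names used in this post, the call would be `create_git_secret gitlab-access gitlab-sysadvent`, followed by the same `oc annotate` command as above.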

Modify the Dockerfile

By default, OpenShift does not run docker containers with root permissions. In fact, OpenShift Origin runs containers with an arbitrarily assigned user ID. Given that Nginx not only wants to run as root, but also to write to /var/cache/nginx and /var/lib/nginx, we have to change file and directory permissions to reflect these constraints.

The previously defined Dockerfile should now also include the following:

# Configure nginx to log to stdout and handle it running as non-root user-id
RUN \
  ln -sf /dev/stdout /var/log/nginx/access.log && \
  ln -sf /dev/stderr /var/log/nginx/error.log && \
  mkdir -p /var/cache/nginx /var/lib/nginx /var/log/nginx && \
  chgrp -R 0 /var/cache/nginx /var/lib/nginx /var/log/nginx && \
  chmod -R g=u /var/cache/nginx /var/lib/nginx /var/log/nginx && \
  sed -i 's,/run/nginx.pid,/var/lib/nginx/nginx.pid,g' /etc/nginx/nginx.conf && \
  sed -i -e '/^user/d' /etc/nginx/nginx.conf
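These changes can be checked locally before pushing anything to OpenShift by starting the image with a non-root user ID and group 0, which is roughly what OpenShift does and the reason for the `chgrp -R 0` and `chmod -R g=u` lines. The function name and the default user ID are made up for this sketch.

```shell
# Start an image in the foreground as an arbitrary non-root user in group 0.
# Usage: run_as_arbitrary_uid IMAGE [USER_ID]
run_as_arbitrary_uid() {
  docker run --rm --user "${2:-1000123}:0" -p 4000:8080 "$1"
}
```

If `run_as_arbitrary_uid sysadvent` starts nginx without permission errors on the log, cache or pid paths, the image stands a good chance of running under OpenShift as well.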

Building an image

Creating a new “application” is done through the new-app command:

$ oc new-app git@gitlab.redpill-linpro.com:rl/sysadvent.git --name sysadvent
--> Found Docker image 20c44cd (10 days old) from Docker Hub for "ubuntu:16.04"

    * An image stream will be created as "ubuntu:16.04" that will track the source image
    * A Docker build using source code from ssh://git@gitlab.redpill-linpro.com/rl/sysadvent.git will be created
      * The resulting image will be pushed to image stream "sysadvent:latest"
      * Every time "ubuntu:16.04" changes a new build will be triggered
      * WARNING: this source repository may require credentials.
                 Create a secret with your Git credentials and use 'set build-secret' to assign it to the build config.
    * This image will be deployed in deployment config "sysadvent"
    * Port 80 will be load balanced by service "sysadvent"
      * Other containers can access this service through the hostname "sysadvent"
    * WARNING: Image "ubuntu:16.04" runs as the 'root' user which may not be permitted by your cluster administrator

--> Creating resources ...
    imagestream "ubuntu" created
    imagestream "sysadvent" created
    buildconfig "sysadvent" created
    deploymentconfig "sysadvent" created
    service "sysadvent" created
--> Success
    Build scheduled, use 'oc logs -f bc/sysadvent' to track its progress.
    Run 'oc status' to view your app.

Here, OpenShift starts off by creating a build configuration. This consists of cloning the given Git project in order to determine the correct build strategy. If, as in our case, a Dockerfile is present, OpenShift will choose the docker strategy. This entails building a docker image from the Dockerfile.

The details of how a build strategy is chosen are documented in the build documentation and described in the Getting started with OpenShift blog entry.

The resulting docker image will be stored in the internal OpenShift Container Registry. You can follow the progress by running:

oc get pods --watch

Finally, you can expose the load-balanced service:

oc expose service sysadvent
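Exposing the service creates a route, and the route carries the public host name. One way to look it up is a `jsonpath` query; the helper name below is invented for this sketch.

```shell
# Print the public host name assigned to a route.
# Usage: route_host ROUTE_NAME
route_host() {
  oc get route "$1" -o jsonpath='{.spec.host}'
}
```

Something like `curl -I "http://$(route_host sysadvent)/sysadvent/"` should then show whether the site answers from outside the cluster.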

Health checks

Liveness and readiness probes are used for restarting pods and for checking that new containers function properly before directing traffic to them:

A liveness probe checks if a container is actually running. A failed liveness check will usually result in the container being killed.
A readiness probe checks if the container functions properly and can receive requests. A failed readiness probe results in OpenShift halting traffic to the container.

In our case, we choose to probe the same endpoint:

oc set probe dc/sysadvent --liveness --get-url=http://:8080/sysadvent/index.html
oc set probe dc/sysadvent --readiness --get-url=http://:8080/sysadvent/index.html
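The probes can also be given explicit timing, for instance to let the liveness probe wait longer than the readiness probe before its first check. The numbers below are illustrative guesses, not tuned values from our setup, and the function name is made up.

```shell
# Apply the same HTTP check as readiness and liveness probe, with explicit timing.
# Usage: set_http_probes DEPLOYMENT_CONFIG URL
set_http_probes() {
  oc set probe "dc/$1" --readiness --get-url="$2" \
    --initial-delay-seconds=5 --timeout-seconds=2 &&
  oc set probe "dc/$1" --liveness --get-url="$2" \
    --initial-delay-seconds=30 --timeout-seconds=2
}
```

For this site the call would be `set_http_probes sysadvent http://:8080/sysadvent/index.html`, and `oc describe dc/sysadvent` shows the resulting probe settings.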

Triggering a build from GitLab

The OpenShift API also includes a URL for triggering a build for a specific build configuration. This URL is given by running:

oc describe bc sysadvent

This URL can typically be called from GitHub/GitLab. In GitLab, you can use the ‘Settings’->‘Integrations’->‘Webhooks’ settings.
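Before wiring the URL into GitLab, it can be tested by hand. As far as I know, the generic webhook variant accepts a plain POST request. The function below is a sketch with an invented name; the URL has to be copied from the `oc describe` output, with the secret placeholder in it replaced by the real webhook secret from the build configuration.

```shell
# Trigger a build by sending an empty POST request to a generic webhook URL.
# Usage: trigger_build WEBHOOK_URL
trigger_build() {
  curl --silent --show-error --fail -X POST "$1"
}
```

If the call succeeds, a new build should show up in `oc get builds` shortly afterwards.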

Results

The result is, as usual, in production: this sysadvent blog is now served from OpenShift.

Next steps

At this point we have a working site served from OpenShift. A natural next step would be to use S2I for building the docker image. This will be discussed in an upcoming blog entry.

Update

Updated links to docker documentation.