Running traefik as a systemd service
Intro
According to its purveyors, "Traefik is a leading modern open source reverse proxy and ingress controller that makes deploying services and APIs easy." At a previous job, we were migrating our HashiCorp Nomad environment from Fabio to Traefik when I left. I didn't get to spend much time with it, but I remembered it was fast, simple, and had great Nomad and Consul support (always a plus for this HashiCorp fanboy).
Fast forward a couple of years. As I attempt to build up my feeble SRE skills, I'm learning more about the Four Golden Signals of monitoring. But if I want to create fancy Grafana dashboards from fancy Prometheus metrics, nginx isn't going to cut it (the open-source version doesn't provide site-specific metrics). By contrast, Traefik provides the Prometheus metrics that plants crave.
But we'll get to the monitoring specifics in a future post. For now, let's talk installation!
Installing on baremetal
- Download the binary from Traefik's GitHub page and install it (I typically use /usr/local/bin/ as my installation directory).
- OS-level configuration:
  - Create a system group and user.
  - Set up a systemd tmpfile. Place the following into /etc/tmpfiles.d/traefik.conf: d /run/traefik 0770, then run systemd-tmpfiles --create
  - Create the systemd unit file and place it in /etc/systemd/system/traefik.service. The below is what I use; it may need slight tweaking for your use case.
[Unit]
Description=Traefik (pronounced traffic) is a modern HTTP reverse proxy and load balancer that makes deploying microservices easy.
After=network.target
[Service]
User=traefik
Group=traefik
ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/usr/local/bin/traefik --config-file=/etc/traefik/traefik.yaml
PIDFile=/run/traefik/traefik.pid
ProtectHome=true
ProtectSystem=full
PrivateTmp=true
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE
NoNewPrivileges=True
WorkingDirectory=/var/lib/traefik
ReadWriteDirectories=/var/lib/traefik/plugins-storage
[Install]
WantedBy=default.target
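Before starting the service, it can help to confirm that everything the unit file points at actually exists. Here is a minimal sketch of such a pre-flight check in Python. The function names (unit_paths, missing_paths) are my own, and it assumes a simple unit like the one above: no repeated keys, no special prefixes on ExecStart, and flag values written as --flag=/path.

```python
import configparser
import os
import shlex


def unit_paths(unit_text):
    """Collect the filesystem paths that the [Service] section refers to."""
    parser = configparser.ConfigParser(
        delimiters=("=",), interpolation=None, strict=False
    )
    parser.optionxform = str  # keep systemd's CamelCase keys as written
    parser.read_string(unit_text)
    service = parser["Service"]

    paths = []
    exec_start = shlex.split(service.get("ExecStart", ""))
    if exec_start:
        paths.append(exec_start[0])  # the binary itself
        for arg in exec_start[1:]:
            # Assumption: values in "--flag=/some/path" form are paths too.
            _, _, value = arg.partition("=")
            if value.startswith("/"):
                paths.append(value)
    for key in ("WorkingDirectory", "ReadWriteDirectories"):
        paths.extend(service.get(key, "").split())
    return paths


def missing_paths(unit_text):
    """Return the referenced paths that do not exist on this machine."""
    return [p for p in unit_paths(unit_text) if not os.path.exists(p)]
```

Feeding it the contents of /etc/systemd/system/traefik.service and printing missing_paths(...) should surface a forgotten /var/lib/traefik or a misplaced binary before systemd does. It is not a syntax checker; systemd-analyze verify is the better tool for that.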
Traefik File Provider (Configuration Discovery)
Traefik has a number of Configuration Discovery Providers. I use HashiCorp Consul in my lab, but I'm going to go over the file provider, since it doesn't require deploying another app and/or running on a public cloud.
Enabling
Put this into your static config:
providers:
  file:
    # This is my personal config, feel free to
    # use whichever directory you want
    directory: /etc/traefik/dynamic
    watch: true
Once that configuration is active, you can place Traefik config files into that directory and Traefik will apply them immediately, no restart needed. Sounds convenient, right? Of course, but there are a few footguns as well.
Perilous Permissions
Traefik needs write permissions on your dynamic config folder and on every file in it. It wants to create inotify watchers on 'em (makes sense for dynamic stuff, right?).
If your permissions are bad, Traefik will log a permissions error, but it also logs a ton of tls: bad certificate errors immediately afterwards.
If your traefik-fronted sites all start serving up the default Traefik certificate, you might have this problem. Make sure you scroll up all the way and check for permissions errors.
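If you'd rather catch this before Traefik does, a small script can walk the dynamic config directory and flag anything the service account can't access. This is a rough sketch, not an official tool: the helper names are mine, it only looks at the classic owner/group/other mode bits (no ACLs, no directory execute bit), and it checks for read and write because that's what bit me. Pass READ alone if that turns out to be enough in your setup.

```python
import os
import pwd
from pathlib import Path

READ, WRITE = 0o4, 0o2


def mode_allows(st_mode, st_uid, st_gid, uid, gids, want=READ | WRITE):
    """True if the owner/group/other bits grant every permission in `want`."""
    if uid == st_uid:
        bits = (st_mode >> 6) & 0o7  # owner bits
    elif st_gid in gids:
        bits = (st_mode >> 3) & 0o7  # group bits
    else:
        bits = st_mode & 0o7  # other bits
    return (bits & want) == want


def find_problems(directory, uid, gids, want=READ | WRITE):
    """Yield the directory and every entry below it that `uid` cannot access."""
    root = Path(directory)
    for path in [root, *sorted(root.rglob("*"))]:
        st = path.stat()
        if not mode_allows(st.st_mode, st.st_uid, st.st_gid, uid, gids, want):
            yield path


def ids_for(username):
    """Look up the uid and group ids of a local account (Unix only)."""
    entry = pwd.getpwnam(username)
    return entry.pw_uid, set(os.getgrouplist(username, entry.pw_gid))
```

With the user and directory from this post, that would be uid, gids = ids_for("traefik") followed by a loop over find_problems("/etc/traefik/dynamic", uid, gids), printing each path it yields.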
Validation? Nah.
Traefik does not validate its file-based discovery config. If you put invalid config in your directory, Traefik will silently ignore it (at least, that was my experience). It seems like a pretty bad user experience, but there is a silver lining...
JSON Schema-based validation
As the Traefik docs mention, the Traefik file provider has a JSON schema. This means you can validate your config and even get helpful error messages about what might be wrong. All you need to do is convert the Traefik config to JSON and validate it against the JSON schema.
Traefik's YAML config is just maps, lists, and scalars, so it converts cleanly to JSON. I use yq for this:
yq -p yaml -o json grafana.yaml > grafana.json
Sample Traefik dynamic configfile. Can you spot the mistake?
http:
  routers:
    grafana:
      entryPoints:
        - https
      rule: Host(`grafana.example.com`)
      service: grafana@internal
      tls: true
  services:
    grafana:
      loadBalancer:
        servers:
          - url: http://127.0.0.1:3000
tls:
  bad: example
  certificates:
    - certFile: /etc/ssl/certs/grafana.example.com.crt
      keyFile: /etc/ssl/private/grafana.example.com.key
Validation with python script
Here's a simple validation script courtesy of ChatGPT:
#!/usr/bin/env python3
import json
import sys

from jsonschema import validate, ValidationError


def main():
    schema_file, json_file = sys.argv[1], sys.argv[2]
    with open(schema_file, 'r') as f:
        schema = json.load(f)
    with open(json_file, 'r') as f:
        data = json.load(f)
    try:
        validate(instance=data, schema=schema)
        print("JSON data is valid against the schema")
    except ValidationError as e:
        print("JSON data is invalid:", e)
        sys.exit(1)


if __name__ == "__main__":
    main()
After that, you can run the script to validate the data:
./validate-json.py traefik-v2-file-provider.json grafana.json
If your syntax is wrong, you will probably get an error such as (edited for length):
JSON data is invalid: Additional properties are not allowed ('bad' was unexpected)
[...]
Failed validating 'additionalProperties' in schema['properties']['tls']:
On instance['tls']:
    {'bad': 'example',
     'certificates': [{'certFile': '/etc/ssl/certs/grafana.example.com.crt',
                       'keyFile': '/etc/ssl/private/grafana.example.com.key'}]}
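One limitation of validate() is that it raises on a single error, so a file with several problems needs several runs. As a variation (my own sketch, not something from the Traefik docs), the jsonschema library can also iterate over every violation. The function name collect_errors and the toy schema at the bottom are made up for illustration; in practice you would load the real file provider schema and your converted config with json.load.

```python
from jsonschema.validators import validator_for


def collect_errors(schema, data):
    """Return one 'location: message' string per schema violation."""
    validator_cls = validator_for(schema)  # honours "$schema" if present
    validator = validator_cls(schema)
    errors = sorted(
        validator.iter_errors(data),
        key=lambda err: [str(part) for part in err.absolute_path],
    )
    lines = []
    for err in errors:
        location = "/".join(str(part) for part in err.absolute_path) or "<root>"
        lines.append(f"{location}: {err.message}")
    return lines


# Toy stand-in for the real schema, only to show the shape of the output.
toy_schema = {
    "type": "object",
    "properties": {
        "tls": {
            "type": "object",
            "properties": {"certificates": {"type": "array"}},
            "additionalProperties": False,
        }
    },
}
for line in collect_errors(toy_schema, {"tls": {"bad": "example"}}):
    print(line)
```

Each line names where in the document the problem sits, which is usually enough to find the offending key without wading through the full schema dump.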
You can also validate the config against the JSON schema with Ansible's ansible.utils.validate module:
- name: validate grafana config v2
  tags: ['validate']
  delegate_to: localhost
  become: no
  run_once: yes
  block:
    - name: validate grafana json
      ansible.utils.validate:
        data: "{{ lookup('ansible.builtin.file', '../files/grafana.json') | from_json }}"
        criteria: "{{ lookup('ansible.builtin.file', '../files/traefik-v2-file-provider.json') | from_json }}"
        engine: ansible.utils.jsonschema
      register: result
Unfortunately, I haven't figured out how to get the helpful error messages that the Python script provides.
In Conclusion
Now that Traefik is running, it's time to take a look at the docs and see what cool stuff is available. Stay tuned for my next post about using Traefik's Prometheus metrics to create a "Four Golden Signals" Grafana dashboard!