8.8 KiB
Prometheus+Grafana in docker
guide by example
Purpose
Monitoring of the host and the running cointaners.
Everything here is based on the magnificent stefanprodan/dockprom
Files and directory structure
/home/
└── ~/
└── docker/
└── prometheus/
│
├── grafana/
│ └── provisioning/
│ ├── dashboards/
│ │ ├── dashboard.yml
│ │ ├── docker_host.json
│ │ ├── docker_containers.json
│ │ └── monitor_services.json
│ │
│ └── datasources/
│ └── datasource.yml
│
├── grafana-data/
├── prometheus-data/
│
├── .env
├── docker-compose.yml
└── prometheus.yml
docker-compose
Four containers to spin up.
While the illustrious stefanprodan/dockprom
also got alertmanager and pushgateway, this is a simpler setup for now.
- Prometheus - monitoring system that pulls and stores data from exporters and then exposes them for visualization. Can also alert if a metric fails preset rule.
- Grafana - web based visualization of the collected metrics in nice graphs, gauges, tables,...
- NodeExporter - exporter for linux machines, in this case gathering docker host metrics, like uptime, cpu load, memory use, network bandwidth use, disk space,...
- cAdvisor - exporter for gathering docker containers metrics, showing cpu, memory, network use of each container
docker-compose.yml
version: '3'
services:
# MONITORING SYSTEM AND THE METRICS DATABASE
prometheus:
image: prom/prometheus
container_name: prometheus
hostname: prometheus
restart: unless-stopped
user: root
depends_on:
- cadvisor
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--storage.tsdb.path=/prometheus'
- '--storage.tsdb.retention.time=200h'
- '--web.console.libraries=/etc/prometheus/console_libraries'
- '--web.console.templates=/etc/prometheus/consoles'
- '--web.enable-lifecycle'
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- ./prometheus_data:/prometheus
labels:
org.label-schema.group: "monitoring"
# WEB BASED UI VISUALISATION OF THE METRICS
grafana:
image: grafana/grafana
container_name: grafana
hostname: grafana
restart: unless-stopped
user: root
environment:
- GF_SECURITY_ADMIN_USER
- GF_SECURITY_ADMIN_PASSWORD
- GF_USERS_ALLOW_SIGN_UP
volumes:
- ./grafana_data:/var/lib/grafana
- ./grafana/provisioning:/etc/grafana/provisioning
labels:
org.label-schema.group: "monitoring"
# HOSTS METRICS COLLECTOR
nodeexporter:
image: prom/node-exporter
container_name: nodeexporter
hostname: nodeexporter
restart: unless-stopped
command:
- '--path.procfs=/host/proc'
- '--path.rootfs=/rootfs'
- '--path.sysfs=/host/sys'
- '--collector.filesystem.ignored-mount-points=^/(sys|proc|dev|host|etc)($$|/)'
volumes:
- /proc:/host/proc:ro
- /sys:/host/sys:ro
- /:/rootfs:ro
labels:
org.label-schema.group: "monitoring"
# DOCKER CONTAINERS METRICS COLLECTOR
cadvisor:
image: google/cadvisor
container_name: cadvisor
hostname: cadvisor
restart: unless-stopped
volumes:
- /:/rootfs:ro
- /var/run:/var/run:rw
- /sys:/sys:ro
- /var/lib/docker:/var/lib/docker:ro
- /cgroup:/cgroup:ro
labels:
org.label-schema.group: "monitoring"
networks:
default:
external:
name: $DEFAULT_NETWORK
.env
# GENERAL
MY_DOMAIN=blabla.org
DEFAULT_NETWORK=caddy_net
TZ=Europe/Bratislava
# GRAFANA
GF_SECURITY_ADMIN_USER=admin
GF_SECURITY_ADMIN_PASSWORD=admin
GF_USERS_ALLOW_SIGN_UP=false
All containers must be on the same network.
If one does not exist yet: docker network create caddy_net
Configuration files
Setup is mostly configured through config files. Some of the grafana config files could be ommited and info passed on the first run, but setting it through GUI wont generate these files which hinders backup.
prometheus.yml
A config file for prometheus, bind mounted in to prometheus container.
This one contains the bare minimum setup of endpoints to be scraped for data.
prometheus.yml
global:
scrape_interval: 15s
evaluation_interval: 15s
# A scrape configuration containing exactly one endpoint to scrape.
scrape_configs:
- job_name: 'nodeexporter'
scrape_interval: 5s
static_configs:
- targets: ['nodeexporter:9100']
- job_name: 'cadvisor'
scrape_interval: 5s
static_configs:
- targets: ['cadvisor:8080']
- job_name: 'prometheus'
scrape_interval: 10s
static_configs:
- targets: ['localhost:9090']
datasource.yml
- /grafana/provisioning/datasources/datasource.yml
- official documentation
Grafana's datasources config file, from where it suppose to get metrics.
Here it ust points at prometheus container.
datasource.yml
apiVersion: 1
datasources:
- name: Prometheus
type: prometheus
access: proxy
orgId: 1
url: http://prometheus:9090
basicAuth: false
isDefault: true
editable: false
dashboard.yml
- grafana/provisioning/dashboards/dashboard.yml
- official documentation
Config file telling grafana from where to load dashboards.
dashboard.yml
apiVersion: 1
providers:
- name: 'Prometheus'
orgId: 1
folder: ''
type: file
disableDeletion: false
editable: false
allowUiUpdates: false
options:
path: /etc/grafana/provisioning/dashboards
<dashboards>.json
- grafana/provisioning/dashboards/<dashboards.json>
- official documentation
In the dashboards directory of this repository.
Preconfigured dashboards from the prodigious
stefanprodan/dockprom.
Mostly unchanged, except for the default time range shown changed from 15min to 1 hour,
and a fix
for host network monitoring not showing traffick.
- docker_host.json - dashboard showing linux host metrics
- docker_containers.json - dashboard showing docker containers metrics,
except the ones labeled as
monitoring
in the compose file - monitoring_services.json - dashboar showing docker containers metrics
of containers that are labeled
monitoring
Reverse proxy
Caddy v2 is used, details
here.
The setup is accessed through grafana. But occasionally there might be need to check with prometheus, which will be available on <docker-host-ip>:9090.
Caddy will also need port 9090 published.
Caddyfile
grafana.{$MY_DOMAIN} {
reverse_proxy grafana:3000
}
:9090 {
reverse_proxy prometheus:9090
}
Extra info: :9000
is short notation for localhost:9000
Update
-
watchtower updates the image automaticly
-
manual image update
docker-compose pull
docker-compose up -d
docker image prune
Backup and restore
-
backup using borgbackup setup that makes daily snapshot of the entire directory
-
restore
down the containersdocker-compose down
delete the entire prometheus directory
from the backup copy back the prometheus directortory
start the containerdocker-compose up -d