selfhosted-apps-docker/prometheus_grafana
DoTheEvolution 361d971966 update
2020-05-02 22:48:23 +02:00

Prometheus+Grafana in docker

guide by example

Purpose

Monitoring of the host and the running containers.

Everything here is based on the magnificent stefanprodan/dockprom

Files and directory structure

/home/
└── ~/
    └── docker/
        └── prometheus/
            │ 
            ├── grafana/
            │   └── provisioning/
            │       ├── dashboards/
            │       │   ├── dashboard.yml            
            │       │   ├── docker_host.json
            │       │   ├── docker_containers.json
            │       │   └── monitor_services.json
            │       │
            │       └── datasources/ 
            │           └── datasource.yml
            │
            ├── grafana_data/
            ├── prometheus_data/
            │
            ├── .env
            ├── docker-compose.yml
            └── prometheus.yml

docker-compose

Four containers to spin up.
While the illustrious stefanprodan/dockprom also includes alertmanager and pushgateway, this is a simpler setup for now.

  • Prometheus - monitoring system that pulls and stores data from exporters and then exposes it for visualization. Can also alert if a metric fails a preset rule.
  • Grafana - web-based visualization of the collected metrics in nice graphs, gauges, tables,...
  • NodeExporter - exporter for linux machines, in this case gathering docker host metrics, like uptime, cpu load, memory use, network bandwidth use, disk space,...
  • cAdvisor - exporter for gathering docker containers metrics, showing cpu, memory, network use of each container

docker-compose.yml

version: '3'
services:

  # MONITORING SYSTEM AND THE METRICS DATABASE
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    hostname: prometheus
    restart: unless-stopped
    user: root
    depends_on:
      - cadvisor
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=200h'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--web.enable-lifecycle'
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./prometheus_data:/prometheus
    labels:
      org.label-schema.group: "monitoring"

  # WEB BASED UI VISUALISATION OF THE METRICS
  grafana:
    image: grafana/grafana
    container_name: grafana
    hostname: grafana
    restart: unless-stopped
    user: root
    environment:
      - GF_SECURITY_ADMIN_USER
      - GF_SECURITY_ADMIN_PASSWORD
      - GF_USERS_ALLOW_SIGN_UP
    volumes:
      - ./grafana_data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
    labels:
      org.label-schema.group: "monitoring"

  # HOSTS METRICS COLLECTOR
  nodeexporter:
    image: prom/node-exporter
    container_name: nodeexporter
    hostname: nodeexporter
    restart: unless-stopped
    command:
      - '--path.procfs=/host/proc'
      - '--path.rootfs=/rootfs'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.ignored-mount-points=^/(sys|proc|dev|host|etc)($$|/)'
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    labels:
      org.label-schema.group: "monitoring"

  # DOCKER CONTAINERS METRICS COLLECTOR
  cadvisor:
    image: google/cadvisor
    container_name: cadvisor
    hostname: cadvisor
    restart: unless-stopped
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker:/var/lib/docker:ro
      - /cgroup:/cgroup:ro
    labels:
      org.label-schema.group: "monitoring"

networks:
  default:
    external:
      name: $DEFAULT_NETWORK

.env

# GENERAL
MY_DOMAIN=blabla.org
DEFAULT_NETWORK=caddy_net
TZ=Europe/Bratislava

# GRAFANA
GF_SECURITY_ADMIN_USER=admin
GF_SECURITY_ADMIN_PASSWORD=admin
GF_USERS_ALLOW_SIGN_UP=false

All containers must be on the same network.
If one does not exist yet: docker network create caddy_net

Configuration files

The setup is mostly configured through config files. Some of the grafana config files could be omitted and the info entered on the first run, but setting it through the GUI won't generate these files, which hinders backup.

prometheus.yml

A config file for prometheus, bind mounted into the prometheus container.
This one contains the bare minimum setup of endpoints to be scraped for data.

prometheus.yml

global:
  scrape_interval:     15s
  evaluation_interval: 15s

# Scrape configurations, one job per endpoint to scrape.
scrape_configs:
  - job_name: 'nodeexporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['nodeexporter:9100']

  - job_name: 'cadvisor'
    scrape_interval: 5s
    static_configs:
      - targets: ['cadvisor:8080']

  - job_name: 'prometheus'
    scrape_interval: 10s
    static_configs:
      - targets: ['localhost:9090']
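
The container overview mentions that prometheus can alert when a metric fails a preset rule. A minimal sketch of such a rule file is below. The file name `alert.rules.yml`, the group name and the alert name are assumptions and not part of this setup. The file would also need to be bind mounted into the prometheus container and listed under `rule_files:` in prometheus.yml. Without alertmanager, a firing alert is only visible on the Alerts page of the prometheus web UI.

```yaml
# alert.rules.yml (hypothetical file name, not part of this repository)
groups:
  - name: host                      # arbitrary group name
    rules:
      # Fires when the nodeexporter job defined in prometheus.yml
      # could not be scraped for one minute straight.
      - alert: NodeExporterDown     # arbitrary alert name
        expr: up{job="nodeexporter"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "nodeexporter target is unreachable"
```

The `up` metric is generated by prometheus itself for every scrape target, so the rule needs no extra exporter.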

datasource.yml

Grafana's datasources config file, telling it where to get metrics from.
Here it just points at the prometheus container.

datasource.yml

apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    orgId: 1
    url: http://prometheus:9090
    basicAuth: false
    isDefault: true
    editable: false

dashboard.yml

Config file telling grafana from where to load dashboards.

dashboard.yml

apiVersion: 1

providers:
  - name: 'Prometheus'
    orgId: 1
    folder: ''
    type: file
    disableDeletion: false
    editable: false
    allowUiUpdates: false
    options:
      path: /etc/grafana/provisioning/dashboards

<dashboards>.json

In the dashboards directory of this repository.

Preconfigured dashboards from the prodigious stefanprodan/dockprom.
Mostly unchanged, except that the default time range shown is changed from 15 minutes to 1 hour, and host network monitoring is fixed so that it shows traffic.

  • docker_host.json - dashboard showing linux host metrics
  • docker_containers.json - dashboard showing docker containers metrics, except the ones labeled as monitoring in the compose file
  • monitor_services.json - dashboard showing docker containers metrics of containers that are labeled monitoring
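
The split between the two container dashboards relies on the `org.label-schema.group` label set in the compose file. As a rough sketch, another container could be moved to the monitoring services dashboard by giving it the same label. The service name `my-exporter` and its image are hypothetical placeholders.

```yaml
# fragment of a docker-compose.yml
# "my-exporter" and "example/my-exporter" are hypothetical placeholders
services:
  my-exporter:
    image: example/my-exporter
    labels:
      # the same label the four containers of this setup carry
      org.label-schema.group: "monitoring"
```

Containers without this label keep showing up in the docker containers dashboard.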

Reverse proxy

Caddy v2 is used, details here.

The setup is accessed through grafana. But occasionally there might be a need to check prometheus directly, which will be available at <docker-host-ip>:9090.

Caddy will also need port 9090 published.
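
A minimal sketch of what publishing the port could look like in the compose file of the caddy container. The service name `caddy` and the 80/443 mappings are assumptions about that setup; only the 9090 mapping is what this guide needs.

```yaml
# fragment of the caddy docker-compose.yml (service name is an assumption)
services:
  caddy:
    ports:
      - "80:80"
      - "443:443"
      - "9090:9090"   # makes <docker-host-ip>:9090 reach caddy, which proxies it to prometheus
```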

Caddyfile

grafana.{$MY_DOMAIN} {
    reverse_proxy grafana:3000
}

:9090 {
    reverse_proxy prometheus:9090
}

Extra info: :9090 is a site address without a hostname, so Caddy answers on port 9090 for any hostname or IP.


Update

  • watchtower updates the image automatically

  • manual image update
    docker-compose pull
    docker-compose up -d
    docker image prune

Backup and restore

  • backup using a borgbackup setup that makes a daily snapshot of the entire directory

  • restore
    stop the containers: docker-compose down
    delete the entire prometheus directory
    copy the prometheus directory back from the backup
    start the containers: docker-compose up -d