pull/35/head
DoTheEvo 1 year ago
parent 3659b54693
commit 094e238643

@ -2,7 +2,7 @@
###### guide-by-example
![logo](https://i.imgur.com/e03aF8d.png)
![logo](https://i.imgur.com/q41QfyI.png)
WORK IN PROGRESS<br>
Loki and caddy monitoring parts are not finished yet
@ -11,20 +11,26 @@ Loki and caddy monitoring parts are not finished yet
Monitoring of the host and the running containers.
* [Official Prometheus site](https://prometheus.io/)
* [Official Grafana site](https://grafana.com/)
* [Official Prometheus](https://prometheus.io/)
* [Official Grafana](https://grafana.com/)
* [Official Loki](https://grafana.com/oss/loki/)
Most of the stuff here is based off the magnificent
[stefanprodan/dockprom.](https://github.com/stefanprodan/dockprom)</br>
So maybe just go play with that.
Monitoring in this case means gathering and showing information on how services,
machines, or containers are running. It can be cpu, io, ram, disk use...
or the number of http requests, or results of backups.<br>
Prometheus deals with metrics. Loki deals with logs. Grafana is there to show
the data on a dashboard.
A lot of the prometheus stuff here is based on the magnificent
[stefanprodan/dockprom.](https://github.com/stefanprodan/dockprom)
# Chapters
* **[Core prometheus+grafana](#Overview)** - to get nice dashboards with metrics from docker host and containers
* **[Pushgateway](#Pushgateway)** - how to use it to allow pushing metrics in to prometheus from anywhere
* **[Alertmanager](#Alertmanager)** - how to use it for notifications
* **[Loki](#Loki)** - how to do the above things but for logs, not just metrics
* **[Caddy monitoring](#Caddy_monitoring)** - adding dashboard for reverse proxy info
* **[Core prometheus+grafana](#Overview)** - nice dashboards with metrics of docker host and containers
* **[Pushgateway](#Pushgateway)** - push data to prometheus from anywhere
* **[Alertmanager](#Alertmanager)** - setting alerts and getting notifications
* **[Loki](#Loki)** - all of the above but for log files
* **[Caddy monitoring](#Caddy_monitoring)** - monitoring a reverse proxy
# Overview
@ -32,7 +38,7 @@ So maybe just go play with that.
Prometheus is an open source system for monitoring and alerting,
written in golang.<br>
It periodicly collects metrics from configured targets,
It periodically collects metrics from configured targets,
makes these metrics available for visualization, and can trigger alerts.<br>
Prometheus is a relatively young project and it is a **pull type** monitoring system.
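To make "pull type" a bit more concrete: a target exposes plain-text metrics over http and prometheus fetches them on a schedule. Below is a minimal sketch of reading one value out of such text by hand. The function name `metric_value` is made up for this illustration, and reaching node exporter on `localhost:9100` is an assumption, since in this setup the port is only exposed inside the docker network.

```bash
#!/usr/bin/env bash
# Print the value of an unlabeled metric from Prometheus text-format input on stdin.
# Comment lines (# HELP, # TYPE) are skipped because their first field is "#".
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}

# Usage against a live exporter (assumed to be reachable on localhost:9100):
#   curl -s http://localhost:9100/metrics | metric_value node_load1
```

This only handles metrics without labels, which is enough to see what prometheus pulls on each scrape.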
@ -49,12 +55,11 @@ Prometheus is relatively young project, it is a **pull type** monitoring.
and exposes them at an endpoint so they can be pulled
* **Alertmanager** - responsible for handling alerts from Prometheus Server,
and sending notifications through email, slack, pushover,..
In this setup [ntfy](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/gotify-ntfy-signal) webhook will be used.<br>
Grafana comes with own alerts, but grafana kinda feels... b-tier
* **pushgateway** - allows push type of monitoring.
Should not be overused as it goes against the pull philosophy of prometheus.
Most commonly it is used to collect data from batch jobs, or from services
that have short execution time. Like a backup script.<br>
**In this setup [ntfy](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/gotify-ntfy-signal)
webhook will be used.**
* **pushgateway** - allows push type of monitoring. Meaning a machine anywhere
in the world can push data in to your prometheus. Should not be overused
as it goes against the pull philosophy of prometheus.
* **Grafana** - for web UI visualization of the collected metrics
@ -67,11 +72,11 @@ Prometheus is relatively young project, it is a **pull type** monitoring.
└── ~/
└── docker/
└── prometheus/
├──── grafana_data/
├──── prometheus_data/
├── docker-compose.yml
├── .env
└── prometheus.yml
├── 🗁 grafana_data/
├── 🗁 prometheus_data/
├── 🗋 docker-compose.yml
├── 🗋 .env
└── 🗋 prometheus.yml
```
* `grafana_data/` - a directory where grafana stores its data
@ -85,16 +90,16 @@ The directories are created by docker compose on the first run.
# docker-compose
* **Prometheus** - Container with some extra commands run at the start up.
Setting stuff like storage, data rentetion (500hours - 20 days)...
Bind mounted prometheus_data for persistent storage
and `prometheus.yml` for some basic configuration.
* **Grafana** - Cotainer, bind mounted directory for persistent data storage
* **NodeExporter** - an exporter for linux machines,
* **Prometheus** - The official image is used, with a few extra commands passing configuration.
  Of note is the 500 hours (\~20 days) retention policy.
* **Grafana** - The official image is used. Bind mounted directory
  for persistent data storage. User is set to root, as it solves issues I am
  too lazy to investigate.
* **NodeExporter** - An exporter for linux machines,
  in this case gathering the metrics of the linux machine running docker,
like uptime, cpu load, memory use, network bandwidth use, disk space,...<br>
Also bind mount of some system directories to have access to required info.
* **cAdvisor** - an exporter for gathering docker **containers** metrics,
* **cAdvisor** - An exporter for gathering docker **containers** metrics,
showing cpu, memory, network use of each container<br>
Runs in `privileged` mode and has some bind mounts of system directories
to have access to required info.
@ -126,7 +131,7 @@ services:
- ./prometheus_data:/prometheus
- ./prometheus.yml:/etc/prometheus/prometheus.yml
expose:
- 9090:9090
- 9090
labels:
org.label-schema.group: "monitoring"
@ -141,7 +146,7 @@ services:
volumes:
- ./grafana_data:/var/lib/grafana
expose:
- 3000
- "3000"
labels:
org.label-schema.group: "monitoring"
@ -161,7 +166,7 @@ services:
- /sys:/host/sys:ro
- /:/rootfs:ro
expose:
- 9100
- "9100"
labels:
org.label-schema.group: "monitoring"
@ -181,7 +186,7 @@ services:
- /var/lib/docker:/var/lib/docker:ro
- /cgroup:/cgroup:ro #doesn't work on MacOS only for Linux
expose:
- 3000
- "3000"
labels:
org.label-schema.group: "monitoring"
@ -195,7 +200,6 @@ networks:
```bash
# GENERAL
MY_DOMAIN=example.com
DOCKER_MY_NETWORK=caddy_net
TZ=Europe/Bratislava
@ -215,13 +219,13 @@ GF_SMTP_PASSWORD=xzu0dfFhn3eqa
Which is named in the `.env` file.<br>
If one does not exist yet: `docker network create caddy_net`
# prometheus.yml
## prometheus.yml
[Official documentation.](https://prometheus.io/docs/prometheus/latest/configuration/configuration/)
Contains the bare minimum setup of targets from where metrics are to be pulled.<br>
Stefanprodan [gives](https://github.com/stefanprodan/dockprom/blob/master/prometheus/prometheus.yml)
a custom shorter scrape intervals, but I feel thats not really
custom shorter scrape intervals, but I feel that's not really
[necessary](https://www.robustperception.io/keep-it-simple-scrape_interval-id/).
`prometheus.yml`
@ -244,13 +248,13 @@ scrape_configs:
- targets: ['localhost:9090']
```
# Reverse proxy
## Reverse proxy
Caddy v2 is used, details
[here](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/caddy_v2).<br>
`Caddyfile`
```
```php
graf.{$MY_DOMAIN} {
reverse_proxy grafana:3000
}
@ -260,7 +264,7 @@ prom.{$MY_DOMAIN} {
}
```
# First run and Grafana configuration
## First run and Grafana configuration
* login admin/admin to `graf.example.com`, change the password
* add Prometheus as a Data source in configuration<br>
@ -270,23 +274,29 @@ prom.{$MY_DOMAIN} {
These dashboards are the preconfigured ones from
[stefanprodan/dockprom](https://github.com/stefanprodan/dockprom)
with few changes.<br>
`docker_host.json` did not show free disk space, it needed `fstype` changed from
`aufs` to `ext4`. Also [a fix](https://github.com/stefanprodan/dockprom/issues/18#issuecomment-487023049)
`docker_host.json` did not show free disk space for me, had to change `fstype`
from `aufs` to `ext4`.
Also [a fix](https://github.com/stefanprodan/dockprom/issues/18#issuecomment-487023049)
for host network monitoring not showing traffic. And in all of them
the time interval is set to show last 1h instead of last 15m
the default time interval is set to show last 1h instead of last 15m
* **docker_host.json** - dashboard showing linux host machine metrics
* **docker_containers.json** - dashboard showing docker containers metrics,
except the ones labeled as `monitoring` in the compose file
* **monitoring_services.json** - dashboard showing docker containers metrics
of containers that are labeled `monitoring`
* **minecraft_logs.json** - comes to play with Loki later.
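The `fstype` change mentioned above can be done in an editor, or roughly like in this sketch. It assumes the dashboard json stores the query with escaped quotes, like `fstype=\"aufs\"`. The function name `swap_fstype` and the output file name are made up for this example.

```bash
#!/usr/bin/env bash
# Rewrite the filesystem type used in dashboard queries; reads JSON on stdin.
# Arguments: old fstype, new fstype.
# Matches fstype="old" as well as the escaped form fstype=\"old\" found in JSON.
swap_fstype() {
  sed -E "s/(fstype=\\\\?\")$1/\\1$2/g"
}

# Check what filesystem the host root actually uses:
#   findmnt -n -o FSTYPE /
# Then write a fixed copy of the dashboard (hypothetical output name):
#   swap_fstype aufs ext4 < docker_host.json > docker_host.fixed.json
```

Writing to a new file instead of editing in place keeps the original dashboard around in case the pattern matched something unexpected.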
![interface-pic](https://i.imgur.com/wzwgBkp.png)
---
---
<details>
<summary><h1>Pushgateway</h1></summary>
# Pushgateway
It gives freedom to push information in to prometheus from anywhere. No need
to be on the same network or to use a VPN. Just an http POST request
to the pushgateway url.
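A rough idea of what such a request looks like, as a minimal sketch. The url `https://push.example.com`, the job name `backup` and the metric `backup_exit_code` are placeholders picked for this example, not values from this setup. The `/metrics/job/<job>/instance/<instance>` path is how pushgateway groups pushed metrics.

```bash
#!/usr/bin/env bash
# Build the pushgateway url for a job and instance grouping.
push_url() {
  local base="$1" job="$2" instance="$3"
  printf '%s/metrics/job/%s/instance/%s' "${base%/}" "$job" "$instance"
}

# Build a one-metric payload in the prometheus text format.
push_payload() {
  local name="$1" value="$2"
  printf '# TYPE %s gauge\n%s %s\n' "$name" "$name" "$value"
}

# Actual push (needs a reachable pushgateway; url and metric name are hypothetical):
#   push_payload backup_exit_code 0 |
#     curl --fail --silent --data-binary @- \
#       "$(push_url https://push.example.com backup "$(hostname)")"
```

The curl call is left as a comment so that sourcing the snippet does not send anything by accident.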
The setup and real world use of pushgateway, along with small steps
when learning it are in the repo -
@ -295,22 +305,19 @@ Including pushing information from windows powershell.
![veeam-dash](https://i.imgur.com/dUyzuyl.png)
</details>
---
---
<details>
<summary><h1>Alertmanager</h1></summary>
# Alertmanager
Several changes are needed
Several changes are needed:
- New container - `alertmanager` added to the compose file.
- New file - `alertmanager.yml` bind mounted in the alertmanager container.<br>
This file contains configuration about where and how to deliver alerts.<br>
A selfhosted
[ntfy](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/gotify-ntfy-signal)
webhook is used that gets alerts to a phone app.
webhook is used that sends alerts to a phone app.
- New file - `alert.rules` mounted in to prometheus container<br>
  This file defines when the value of some metric becomes an alert event.
- Changed file - `prometheus.yml` added `alerting` section
@ -343,7 +350,7 @@ Including pushing information from windows powershell.
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- ./alert.rules:/etc/prometheus/rules/alert.rules
expose:
- 9090:9090
- "9090"
labels:
org.label-schema.group: "monitoring"
@ -358,7 +365,7 @@ Including pushing information from windows powershell.
volumes:
- ./grafana_data:/var/lib/grafana
expose:
- 3000
- "3000"
labels:
org.label-schema.group: "monitoring"
@ -378,7 +385,7 @@ Including pushing information from windows powershell.
- /sys:/host/sys:ro
- /:/rootfs:ro
expose:
- 9100
- "9100"
labels:
org.label-schema.group: "monitoring"
@ -398,7 +405,7 @@ Including pushing information from windows powershell.
- /var/lib/docker:/var/lib/docker:ro
- /cgroup:/cgroup:ro #doesn't work on MacOS only for Linux
expose:
- 3000
- "3000"
labels:
org.label-schema.group: "monitoring"
@ -415,7 +422,7 @@ Including pushing information from windows powershell.
- '--config.file=/etc/alertmanager.yml'
- '--storage.path=/alertmanager'
expose:
- 9093
- "9093"
labels:
org.label-schema.group: "monitoring"
@ -500,59 +507,56 @@ Including pushing information from windows powershell.
```
</details>
test:<br>
`curl -H 'Content-Type: application/json' -d '[{"labels":{"alertname":"blabla"}}]' https://alert.example.com/api/v1/alerts`
reload rules
`curl -X POST http://admin:admin@<host-ip>:9090/-/reload`
[stefanprodan/dockprom](https://github.com/stefanprodan/dockprom#define-alerts)
has more detailed section on alerting worth checking out.
</details>
*Useful commands*
---
---
* test alert:<br>
`curl -H 'Content-Type: application/json' -d '[{"labels":{"alertname":"blabla"}}]' https://alert.example.com/api/v1/alerts`
* reload rules:<br>
`curl -X POST http://admin:admin@<host-ip>:9090/-/reload`
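As far as I know, newer alertmanager releases (0.27 and up) dropped the v1 api, so the test alert might need the `/api/v2/alerts` path instead. A minimal sketch under that assumption, with the helper name `test_alert_payload` made up for this example:

```bash
#!/usr/bin/env bash
# Build a minimal alert payload; the alert name is a free-form test value.
test_alert_payload() {
  printf '[{"labels":{"alertname":"%s"}}]' "$1"
}

# Sending it to the v2 endpoint (same host as in the commands above):
#   curl -H 'Content-Type: application/json' \
#     -d "$(test_alert_payload blabla)" \
#     https://alert.example.com/api/v2/alerts
```

Also worth knowing: the `/-/reload` endpoint only responds if prometheus was started with the `--web.enable-lifecycle` flag.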
<details>
<summary><h1>Loki</h1></summary>
# Loki
Loki is made by the grafana team, and they call it Prometheus for logs.<br>
It is a **push** type monitoring, where an agent - **promtail**
pushes logs on to a Loki instance, or **loki-docker-driver** is installed
Loki is made by the grafana team. It's called a Prometheus for logs.<br>
It is a **push** type monitoring, where most of the time an agent - **promtail**
pushes logs on to a Loki instance.
For docker containers there's also an option to install **loki-docker-driver**
on a docker host and log pushing is set either globally in /etc/docker/daemon.json
or per container in compose files.
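For the global variant, a minimal sketch of what the relevant part of `/etc/docker/daemon.json` could look like. It assumes the plugin was installed under the alias `loki` and that Loki listens on `localhost:3100`. The helper name `loki_daemon_json` is made up, and it only prints the fragment, it does not touch any file.

```bash
#!/usr/bin/env bash
# Print a daemon.json fragment that makes loki the default log driver.
# Argument: the Loki push url.
loki_daemon_json() {
  cat <<EOF
{
  "log-driver": "loki",
  "log-opts": {
    "loki-url": "$1"
  }
}
EOF
}

# Review the output before merging it into /etc/docker/daemon.json by hand:
#   loki_daemon_json "http://localhost:3100/loki/api/v1/push"
```

The docker daemon needs a restart after the change, and the new default only applies to containers created afterwards.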
What this example will set out to do is monitor logs of a
[minecraft server.](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/minecraft)
Have some dashboards about logs and have an alert when a player joins.<br>
Its just proof of concept, as there are prometheus exporters for minecraft,
so digging through logs to get alert is less optimal.
There will be **two examples** of Loki usage.
A **minecraft server** and a **caddy reverse proxy**, both docker containers.
What needs to be done:
But first, Loki needs to be added to the current grafana and prometheus stack:
- New container - `loki` added to the compose file. Loki stores logs and makes
* New container - `loki` added to the compose file. Loki stores logs and makes
them available for grafana to visualize.
- New file - `loki-docker-config.yml` bind mounted in the loki container.<br>
The file is [all default](https://github.com/grafana/loki/tree/main/cmd/loki),
except for alertmanager url.<br>
- install [loki-docker-driver](https://grafana.com/docs/loki/latest/clients/docker-driver/)<br>
* New file - `loki-docker-config.yml` bind mounted in the loki container.<br>
The file is all default from the
[official example](https://github.com/grafana/loki/tree/main/cmd/loki),
except for alertmanager url.
* install [loki-docker-driver](https://grafana.com/docs/loki/latest/clients/docker-driver/)<br>
`docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions`<br>
- adding logging section to compose files of a containers
that should be monitored, like so:<br>
```
* adding a logging section to compose files of containers
that should be monitored.<br>
<details>
<summary>example compose with logging enabled</summary>
```yml
services:
whoami:
image: "containous/whoami"
container_name: "whoami"
hostname: "whoami"
ports:
- 88:80
logging:
driver: "loki"
options:
loki-url: "http://localhost:3100/loki/api/v1/push"
```
Promtail can be used to push logs to loki, older version of this readme
has details - [here.](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/70830ac8679d81e73265c8522e06312db4905f3b/prometheus_grafana)
</details>
<details>
<summary>docker-compose.yml</summary>
@ -581,7 +585,7 @@ Including pushing information from windows powershell.
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- ./alert.rules:/etc/prometheus/rules/alert.rules
expose:
- 9090:9090
- "9090"
labels:
org.label-schema.group: "monitoring"
@ -596,7 +600,7 @@ Including pushing information from windows powershell.
volumes:
- ./grafana_data:/var/lib/grafana
expose:
- 3000
- "3000"
labels:
org.label-schema.group: "monitoring"
@ -616,7 +620,7 @@ Including pushing information from windows powershell.
- /sys:/host/sys:ro
- /:/rootfs:ro
expose:
- 9100
- "9100"
labels:
org.label-schema.group: "monitoring"
@ -636,7 +640,7 @@ Including pushing information from windows powershell.
- /var/lib/docker:/var/lib/docker:ro
- /cgroup:/cgroup:ro #doesn't work on MacOS only for Linux
expose:
- 3000
- "3000"
labels:
org.label-schema.group: "monitoring"
@ -653,7 +657,7 @@ Including pushing information from windows powershell.
- '--config.file=/etc/alertmanager.yml'
- '--storage.path=/alertmanager'
expose:
- 9093
- "9093"
labels:
org.label-schema.group: "monitoring"
@ -719,10 +723,16 @@ Including pushing information from windows powershell.
```
</details>
![logo](https://i.imgur.com/MHYmxPi.png)
### Logs in grafana
### Minecraft example
The minecraft example sets out to monitor logs of a
[minecraft server.](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/minecraft)
The goal is a dashboard showing log volume and a notification sent when a player joins.<br>
It's just a proof of concept, as there are prometheus exporters for minecraft,
so digging through logs is usually not necessary.
![logo](https://i.imgur.com/MHYmxPi.png)
Now with the driver installed, config files in place, compose edited,..
@ -748,13 +758,14 @@ Now with the driver installed, config files in place, compose edited,..
- stack series - normal
- color scheme - single color
[performance tips](https://www.youtube.com/watch?v=YED8XIm0YPs)
for grafana loki queries
### Alerts for Loki
* Alerts section in grafana
* Alert rules, new alert
- 1. Set a query and alert condition
- **A**
- now-5min to now
* Alerting > Alert rules > New alert rule
- **1 Set a query and alert condition**
- **A**; Loki; now-5min to now
- container_name=minecraft
- line contains=joined the game
- **B**
@ -766,15 +777,15 @@ Now with the driver installed, config files in place, compose edited,..
- Input=B
- is above 0
- click - Make this the alert condition
- 2. Alert evaluation behavior
- **2 Alert evaluation behavior**
- evaluate every 5m for 0s
- Configure no data and error handling
- Alert state if no data or all values are null=OK
- 3. Add details for your alert
- **3 Add details for your alert**
- Rule name=Minecraft-player-joined
- Folder, add new, "Alerts"
- Group, add new, "Docker"
- 4. Notifications
- **4 Notifications**
- nothing
- Save and exit
* **Contact points**
@ -790,28 +801,10 @@ Now with the driver installed, config files in place, compose edited,..
# Caddy monitoring
</details>
---
---
<details>
<summary><h1>Caddy monitoring</h1></summary>
</details>
---
---
Described in
[the caddy guide](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/caddy_v2)
# Update
