update

1 year ago · 094e238643
parent 3659b54693
commit 094e238643
1 changed files with 503 additions and 510 deletions
--- a/prometheus_grafana/readme.md
+++ b/prometheus_grafana/readme.md
@ -2,7 +2,7 @@

 ###### guide-by-example

-![logo](https://i.imgur.com/e03aF8d.png)
+![logo](https://i.imgur.com/q41QfyI.png)

 WORK IN PROGRESS<br>
 Loki and caddy monitoring parts are not finished yet
@ -11,20 +11,26 @@ Loki and caddy monitoring parts are not finished yet

 Monitoring of the host and the running containers.

-* [Official Prometheus site](https://prometheus.io/)
-* [Official Grafana site](https://grafana.com/)
+* [Official Prometheus](https://prometheus.io/)
+* [Official Grafana](https://grafana.com/)
+* [Official Loki](https://grafana.com/oss/loki/)

-Most of the stuff here is based off the magnificent
-[stefanprodan/dockprom.](https://github.com/stefanprodan/dockprom)</br>
-So maybe just go play with that.
+Monitoring in this case means gathering and showing information on how services,
+machines, or containers are running. It can be cpu, io, ram, or disk use...
+It can also be the number of http requests, or the results of backups.<br>
+Prometheus deals with metrics. Loki deals with logs. Grafana is there to show
+the data on a dashboard.
+
+A lot of the prometheus stuff here is based on the magnificent
+[stefanprodan/dockprom.](https://github.com/stefanprodan/dockprom)

 # Chapters

-* **[Core prometheus+grafana](#Overview)** - to get nice dashboards with metrics from docker host and containers
-* **[Pushgateway](#Pushgateway)** - how to use it to allow pushing metrics in to prometheus from anywhere
-* **[Alertmanager](#Alertmanager)** - how to use it for notifications
-* **[Loki](#Loki)** - how to do the above things but for logs, not just metrics
-* **[Caddy monitoring](#Caddy_monitoring)** - adding dashboard for reverse proxy info
+* **[Core prometheus+grafana](#Overview)** - nice dashboards with metrics of docker host and containers
+* **[Pushgateway](#Pushgateway)** - push data to prometheus from anywhere
+* **[Alertmanager](#Alertmanager)** - setting alerts and getting notifications
+* **[Loki](#Loki)** - all of the above but for log files
+* **[Caddy monitoring](#Caddy_monitoring)** - monitoring a reverse proxy

 # Overview

@ -32,7 +38,7 @@ So maybe just go play with that.

 Prometheus is an open source system for monitoring and alerting,
 written in golang.<br>
-It periodicly collects metrics from configured targets,
+It periodically collects metrics from configured targets,
 makes these metrics available for visualization, and can trigger alerts.<br>
 Prometheus is a relatively young project; it is a **pull type** monitoring system.
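
To make the pull model concrete: a target only has to serve its current values as plain text over HTTP, and Prometheus fetches them on every scrape. Below is a minimal sketch using only the Python standard library. The metric name `demo_temperature_celsius`, the port `8000`, and the helpers `render_metrics` and `serve` are made up for illustration and are not part of this setup.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(metrics):
    """Render {name: value} as gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers GET /metrics, which is the path Prometheus scrapes by default."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        # Hypothetical hard-coded value; a real exporter would measure something.
        body = render_metrics({"demo_temperature_celsius": 21.5}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, waiting to be scraped (port is an arbitrary choice)."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` and adding `<host>:8000` as a target in `prometheus.yml` would be enough for Prometheus to start pulling the value. The exporters used in this guide (node exporter, cAdvisor) do the same thing at a larger scale.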

@ -49,12 +55,11 @@ Prometheus is relatively young project, it is a **pull type** monitoring.
     and exposes them at an endpoint so they can be pulled
 * **Alertmanager** - responsible for handling alerts from Prometheus Server,
  and sending notifications through email, slack, pushover,..
-  In this setup [ntfy](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/gotify-ntfy-signal) webhook will be used.<br>
-  Grafana comes with own alerts, but grafana kinda feels... b-tier
-* **pushgateway** - allows push type of monitoring.
-  Should not be overused as it goes against the pull philosophy of prometheus.
-  Most commonly it is used to collect data from batch jobs, or from services
-  that have short execution time. Like a backup script.<br>
+  **In this setup [ntfy](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/gotify-ntfy-signal)
+  webhook will be used.**
+* **pushgateway** - allows push type of monitoring, meaning a machine anywhere
+  in the world can push data into your prometheus. Should not be overused
+  as it goes against the pull philosophy of prometheus.
 * **Grafana** - for web UI visualization of the collected metrics


@ -67,11 +72,11 @@ Prometheus is relatively young project, it is a **pull type** monitoring.
 └── ~/
     └── docker/
         └── prometheus/
-             ├──── grafana_data/
-             ├──── prometheus_data/
-             ├── docker-compose.yml
-             ├── .env
-             └── prometheus.yml
+             ├── 🗁 grafana_data/
+             ├── 🗁 prometheus_data/
+             ├── 🗋 docker-compose.yml
+             ├── 🗋 .env
+             └── 🗋 prometheus.yml
 ```

 * `grafana_data/` - a directory where grafana stores its data
@ -85,16 +90,16 @@ The directories are created by docker compose on the first run.

 # docker-compose

-* **Prometheus** - Container with some extra commands run at the start up.
-  Setting stuff like storage, data rentetion (500hours - 20 days)...
-  Bind mounted prometheus_data for persistent storage
-  and `prometheus.yml` for some basic configuration.
-* **Grafana** - Cotainer, bind mounted directory for persistent data storage
-* **NodeExporter** - an exporter for linux machines,
+* **Prometheus** - The official image is used, with a few extra commands passing configuration.
+  Of note is the 500 hours (\~20 days) retention policy.
+* **Grafana** - The official image is used, with a bind mounted directory
+  for persistent data storage. The user is set to root, as it solves issues I am
+  too lazy to investigate.
+* **NodeExporter** - An exporter for linux machines,
  in this case gathering the metrics of the linux machine running docker,
  like uptime, cpu load, memory use, network bandwidth use, disk space,...<br>
  Also bind mount of some system directories to have access to required info.
-* **cAdvisor** - an exporter for gathering docker **containers** metrics,
+* **cAdvisor** - An exporter for gathering docker **containers** metrics,
  showing cpu, memory, network use of each container<br>
  Runs in `privileged` mode and has some bind mounts of system directories
  to have access to required info.
@ -126,7 +131,7 @@ services:
      - ./prometheus_data:/prometheus
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    expose:
-      - 9090:9090
+      - 9090
    labels:
      org.label-schema.group: "monitoring"

@ -141,7 +146,7 @@ services:
    volumes:
      - ./grafana_data:/var/lib/grafana
    expose:
-      - 3000
+      - "3000"
    labels:
      org.label-schema.group: "monitoring"

@ -161,7 +166,7 @@ services:
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    expose:
-      - 9100
+      - "9100"
    labels:
      org.label-schema.group: "monitoring"

@ -181,7 +186,7 @@ services:
      - /var/lib/docker:/var/lib/docker:ro
      - /cgroup:/cgroup:ro #doesn't work on MacOS only for Linux
    expose:
-      - 3000
+      - "3000"
    labels:
      org.label-schema.group: "monitoring"

@ -195,7 +200,6 @@ networks:

 ```bash
 # GENERAL
-MY_DOMAIN=example.com
 DOCKER_MY_NETWORK=caddy_net
 TZ=Europe/Bratislava

@ -215,13 +219,13 @@ GF_SMTP_PASSWORD=xzu0dfFhn3eqa
 Which is named in the `.env` file.<br>
 If one does not exist yet: `docker network create caddy_net`

-# prometheus.yml
+## prometheus.yml

 [Official documentation.](https://prometheus.io/docs/prometheus/latest/configuration/configuration/)

 Contains the bare minimum setup of targets from where metrics are to be pulled.<br>
 Stefanprodan [gives](https://github.com/stefanprodan/dockprom/blob/master/prometheus/prometheus.yml)
-a custom shorter scrape intervals, but I feel thats not really
+custom, shorter scrape intervals, but I feel that's not really
 [necessary](https://www.robustperception.io/keep-it-simple-scrape_interval-id/).

 `prometheus.yml`
@ -244,13 +248,13 @@ scrape_configs:
      - targets: ['localhost:9090']
 ```

-# Reverse proxy
+## Reverse proxy

 Caddy v2 is used, details
 [here](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/caddy_v2).<br>

 `Caddyfile`
-```
+```php
 graf.{$MY_DOMAIN} {
  reverse_proxy grafana:3000
 }
@ -260,7 +264,7 @@ prom.{$MY_DOMAIN} {
 }
 ```

-# First run and Grafana configuration
+## First run and Grafana configuration

 * login admin/admin to `graf.example.com`, change the password
 * add Prometheus as a Data source in configuration<br>
@ -270,23 +274,29 @@ prom.{$MY_DOMAIN} {
 These dashboards are the preconfigured ones from
 [stefanprodan/dockprom](https://github.com/stefanprodan/dockprom)
 with few changes.<br>
-`docker_host.json` did not show free disk space, it needed `fstype` changed from
-`aufs` to `ext4`. Also [a fix](https://github.com/stefanprodan/dockprom/issues/18#issuecomment-487023049)
+`docker_host.json` did not show free disk space for me, had to change `fstype`
+from `aufs` to `ext4`.
+Also [a fix](https://github.com/stefanprodan/dockprom/issues/18#issuecomment-487023049)
 for host network monitoring not showing traffic. And in all of them
-the time interval is set to show last 1h instead of last 15m
+the default time interval is set to show last 1h instead of last 15m

 * **docker_host.json** - dashboard showing linux host machine metrics
 * **docker_containers.json** - dashboard showing docker containers metrics,
  except the ones labeled as `monitoring` in the compose file
 * **monitoring_services.json** - dashboard showing docker containers metrics
  of containers that are labeled `monitoring`
+* **minecraft_logs.json** - comes into play with Loki later.

 ![interface-pic](https://i.imgur.com/wzwgBkp.png)

+---
 ---

-<details>
-<summary><h1>Pushgateway</h1></summary>
+# Pushgateway
+
+It gives the freedom to push information into prometheus from anywhere. No need
+to be on the same network or to set up a VPN. Just an HTTP POST request
+to the pushgateway url.
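
As a rough illustration of such a request, here is a sketch in Python using only the standard library. The Pushgateway takes the grouping key (job and optional labels) in the URL path and the metrics in the body, in the same text format Prometheus scrapes. The URL `https://push.example.com`, the job name `backup`, the metric name, and the helper names are assumptions made up for this example.

```python
import urllib.request


def build_push_request(base_url, job, metrics, instance=None):
    """Build (without sending) a POST request for the Pushgateway.

    Assumes job and instance are simple names without slashes
    or other characters that would need escaping in a URL path.
    """
    url = f"{base_url.rstrip('/')}/metrics/job/{job}"
    if instance is not None:
        url += f"/instance/{instance}"
    # One "name value" pair per line; the trailing newline is required.
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    return urllib.request.Request(url, data=body.encode(), method="POST")


def push(request, timeout=5):
    """Send the request and return the HTTP status; urlopen raises on errors."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Usage would look like `push(build_push_request("https://push.example.com", "backup", {"backup_duration_seconds": 42}, instance="nas"))`, for instance at the end of a backup script.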

 The setup and real world use of pushgateway, along with small steps 
 when learning it are in the repo - 
@ -295,22 +305,19 @@ Including pushing information from windows powershell.

 ![veeam-dash](https://i.imgur.com/dUyzuyl.png)

-</details>
-
 ---
 ---

-<details>
-  <summary><h1>Alertmanager</h1></summary>
+# Alertmanager

-  Several changes are needed
+Several changes are needed:

 - New container - `alertmanager` added to the compose file.
 - New file - `alertmanager.yml` bind mounted in the alertmanager container.<br>
  This file contains configuration about where and how to deliver alerts.<br>
  A selfhosted
  [ntfy](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/gotify-ntfy-signal)
-    webhook is used that gets alerts to a phone app.
+  webhook is used that sends alerts to a phone app.
 - New file - `alert.rules` mounted in to prometheus container<br>
  This file defines when the value of some metric becomes an alert event.
 - Changed file - `prometheus.yml` added `alerting` section
@ -343,7 +350,7 @@ Including pushing information from windows powershell.
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alert.rules:/etc/prometheus/rules/alert.rules
    expose:
-        - 9090:9090
+      - "9090"
    labels:
      org.label-schema.group: "monitoring"

@ -358,7 +365,7 @@ Including pushing information from windows powershell.
    volumes:
      - ./grafana_data:/var/lib/grafana
    expose:
-        - 3000
+      - "3000"
    labels:
      org.label-schema.group: "monitoring"

@ -378,7 +385,7 @@ Including pushing information from windows powershell.
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    expose:
-        - 9100
+      - "9100"
    labels:
      org.label-schema.group: "monitoring"

@ -398,7 +405,7 @@ Including pushing information from windows powershell.
      - /var/lib/docker:/var/lib/docker:ro
      - /cgroup:/cgroup:ro #doesn't work on MacOS only for Linux
    expose:
-        - 3000
+      - "3000"
    labels:
      org.label-schema.group: "monitoring"

@ -415,7 +422,7 @@ Including pushing information from windows powershell.
      - '--config.file=/etc/alertmanager.yml'
      - '--storage.path=/alertmanager'
    expose:
-        - 9093
+      - "9093"
    labels:
      org.label-schema.group: "monitoring"

@ -500,59 +507,56 @@ Including pushing information from windows powershell.
 ```
 </details>

-  test:<br>
-  `curl -H 'Content-Type: application/json' -d '[{"labels":{"alertname":"blabla"}}]' https://alert.example.com/api/v1/alerts`

-  reload rules
-  `curl -X POST http://admin:admin@<host-ip>:9090/-/reload`
+[stefanprodan/dockprom](https://github.com/stefanprodan/dockprom#define-alerts)
+has a more detailed section on alerting worth checking out.

-</details>
+*Useful commands*

---
---
+* test alert:<br>
+  `curl -H 'Content-Type: application/json' -d '[{"labels":{"alertname":"blabla"}}]' https://alert.example.com/api/v1/alerts`
+* reload rules:<br>
+  `curl -X POST http://admin:admin@<host-ip>:9090/-/reload`
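
For scripting, the test alert command above can be reproduced in Python with the standard library. This is only a sketch: the helper name `build_test_alert` is made up, and the `/api/v1/alerts` path is copied from the curl command. As far as I know, newer Alertmanager releases expect `/api/v2/alerts` instead, so the path may need adjusting.

```python
import json
import urllib.request


def build_test_alert(alertmanager_url, alertname):
    """Build the same POST request as the curl test alert command."""
    payload = [{"labels": {"alertname": alertname}}]
    return urllib.request.Request(
        # Path taken from the curl example; may differ per Alertmanager version.
        f"{alertmanager_url.rstrip('/')}/api/v1/alerts",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it is then `urllib.request.urlopen(build_test_alert("https://alert.example.com", "blabla"))`.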

-<details>
-  <summary><h1>Loki</h1></summary>
+# Loki

-  Loki is made by the grafana team, and they call it Prometheus for logs.<br>
-  It is a **push** type monitoring, where an agent - **promtail**
-  pushes logs on to a Loki instance, or **loki-docker-driver** is installed
+Loki is made by the grafana team. It's called a Prometheus for logs.<br>
+It is a **push** type monitoring, where most of the time an agent - **promtail**
+pushes logs on to a Loki instance.
+For docker containers there's also an option to install **loki-docker-driver**
 on a docker host and log pushing is set either globally in /etc/docker/daemon.json
 or per container in compose files.

-  What this example will set out to do is monitor logs of a
-  [minecraft server.](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/minecraft)
-  Have some dashboards about logs and have an alert when a player joins.<br>
-  Its just proof of concept, as there are prometheus exporters for minecraft,
-  so digging through logs to get alert is less optimal.
+There will be **two examples** of Loki usage.
+A **minecraft server** and a **caddy reverse proxy**, both docker containers.

-  What needs to be done:
+But first to add Loki to the current grafana, prometheus stack:

-  - New container - `loki` added to the compose file. Loki stores logs and makes
+* New container - `loki` added to the compose file. Loki stores logs and makes
  them available for grafana to visualize.
-  - New file - `loki-docker-config.yml` bind mounted in the loki container.<br>
-    The file is [all default](https://github.com/grafana/loki/tree/main/cmd/loki),
-    except for alertmanager url.<br>
-  - install [loki-docker-driver](https://grafana.com/docs/loki/latest/clients/docker-driver/)<br>
+* New file - `loki-docker-config.yml` bind mounted in the loki container.<br>
+  The file is all default from the 
+  [official example](https://github.com/grafana/loki/tree/main/cmd/loki),
+  except for alertmanager url.
+* install [loki-docker-driver](https://grafana.com/docs/loki/latest/clients/docker-driver/)<br>
  `docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions`<br>
-  - adding logging section to compose files of a containers 
-    that should be monitored, like so:<br>
-    ```
+* adding a logging section to the compose files of containers
+  that should be monitored.<br>
+  
+<details>
+<summary>example compose with logging enabled</summary>
+```yml
 services:
  whoami:
    image: "containous/whoami"
    container_name: "whoami"
    hostname: "whoami"
-        ports:
-         - 88:80
    logging:
      driver: "loki"
      options:
        loki-url: "http://localhost:3100/loki/api/v1/push"
 ```
-
-  Promtail can be used to push logs to loki, older version of this readme
-  has details - [here.](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/70830ac8679d81e73265c8522e06312db4905f3b/prometheus_grafana)
+</details>
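
The `loki-url` in the compose example points at Loki's push endpoint, and it can help to know what arrives there. The sketch below builds a JSON body of the shape that endpoint accepts: a list of streams, each with a label set and a list of `[timestamp, line]` pairs, where the timestamp is unix time in nanoseconds written as a string. The helper name and the label `container_name=whoami` are assumptions for illustration; the docker driver builds its own labels.

```python
import json
import time


def build_loki_payload(labels, lines, timestamp_ns=None):
    """Build the JSON body for Loki's /loki/api/v1/push endpoint."""
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    # Offset each line by 1 ns so the entries keep their order.
    values = [[str(timestamp_ns + i), line] for i, line in enumerate(lines)]
    return json.dumps({"streams": [{"stream": labels, "values": values}]})
```

POSTing that body with `Content-Type: application/json` to `http://localhost:3100/loki/api/v1/push` should make the lines show up in grafana under the given labels, which is a quick way to test a Loki instance without any container logging.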

 <details>
  <summary>docker-compose.yml</summary>
@ -581,7 +585,7 @@ Including pushing information from windows powershell.
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alert.rules:/etc/prometheus/rules/alert.rules
    expose:
-        - 9090:9090
+      - "9090"
    labels:
      org.label-schema.group: "monitoring"

@ -596,7 +600,7 @@ Including pushing information from windows powershell.
    volumes:
      - ./grafana_data:/var/lib/grafana
    expose:
-        - 3000
+      - "3000"
    labels:
      org.label-schema.group: "monitoring"

@ -616,7 +620,7 @@ Including pushing information from windows powershell.
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    expose:
-        - 9100
+      - "9100"
    labels:
      org.label-schema.group: "monitoring"

@ -636,7 +640,7 @@ Including pushing information from windows powershell.
      - /var/lib/docker:/var/lib/docker:ro
      - /cgroup:/cgroup:ro #doesn't work on MacOS only for Linux
    expose:
-        - 3000
+      - "3000"
    labels:
      org.label-schema.group: "monitoring"

@ -653,7 +657,7 @@ Including pushing information from windows powershell.
      - '--config.file=/etc/alertmanager.yml'
      - '--storage.path=/alertmanager'
    expose:
-        - 9093
+      - "9093"
    labels:
      org.label-schema.group: "monitoring"

@ -719,10 +723,16 @@ Including pushing information from windows powershell.
 ```
 </details>

+![logo](https://i.imgur.com/MHYmxPi.png)

-### Logs in grafana
+### Minecraft example
+
+The Minecraft example sets out to monitor the logs of a
+[minecraft server.](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/minecraft)
+The goal is a dashboard showing log volume and a notification sent when a player joins.<br>
+It's just a proof of concept, as there are prometheus exporters for minecraft,
+so digging through logs is usually not necessary.

-  ![logo](https://i.imgur.com/MHYmxPi.png)
  
 Now with the driver installed, config files in place, compose edited,..

@ -748,13 +758,14 @@ Now with the driver installed, config files in place, compose edited,..
  - stack series - normal
  - color scheme - single color

+[performance tips](https://www.youtube.com/watch?v=YED8XIm0YPs)
+for grafana loki queries 
+
 ### Alerts for Loki

-* Alerts section in grafana
-* Alert rules, new alert
-  - 1. Set a query and alert condition
-    - **A**
-    - now-5min to now
+* Alerting > Alert rules > New alert rule
+- **1 Set a query and alert condition**
+  - **A**; Loki; now-5min to now
  - container_name=minecraft
  - line contains=joined the game
  - **B**
@ -766,15 +777,15 @@ Now with the driver installed, config files in place, compose edited,..
  - Input=B
  - is above 0
  - click - Make this the alert condition
-  - 2. Alert evaluation behavior
+- **2 Alert evaluation behavior**
  - evaluate every 5m for 0s
  - Configure no data and error handling
  - Alert state if no data or all values are null=OK
-  - 3. Add details for your alert
+- **3 Add details for your alert**
  - Rule name=Minecraft-player-joined
  - Folder, add new, "Alerts"
  - Group, add new, "Docker"
-  - 4. Notifications
+- **4 Notifications**
  - nothing
 - Save and exit
 * **Contact points**
@ -790,28 +801,10 @@ Now with the driver installed, config files in place, compose edited,..



+# Caddy monitoring

-
-
-
-
-
-
-
-
-
-</details>
-
---
---
-
-<details>
-  <summary><h1>Caddy monitoring</h1></summary>
-
-</details>
-
---
---
+Described in
+[the caddy guide](https://github.com/DoTheEvo/selfhosted-apps-docker/tree/master/caddy_v2)

 # Update