Observability Operations
Modern infrastructure observability with a CLI-first approach, covering metrics, logs, and alerting.
The Three Pillars
| Pillar | Tool | Purpose |
|---|---|---|
| Metrics | Prometheus | Time-series data, resource usage, counters |
| Logs | Loki + Promtail | Centralized log aggregation and search |
| Alerting | Alertmanager | Notifications, routing, silencing |
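To make the link between the metrics and alerting pillars concrete, the sketch below writes a minimal Prometheus configuration with one scrape job and one Alertmanager target. The file name `prometheus.example.yml`, the job name `node`, and the ports 9100 (node_exporter's upstream default) and 9093 (Alertmanager's upstream default) are assumptions for illustration, not values taken from any particular deployment.

```shell
#!/bin/sh
# Sketch: write a minimal Prometheus config.
# File name, job name, and targets are illustrative assumptions.
set -eu

config_file="${CONFIG_FILE:-prometheus.example.yml}"

cat > "$config_file" <<'EOF'
global:
  scrape_interval: 15s

# Metrics pillar: scrape a node_exporter on its default port.
scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['localhost:9100']

# Alerting pillar: send fired alerts to Alertmanager.
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']
EOF

echo "wrote $config_file"
```

If `promtool` is installed, `promtool check config prometheus.example.yml` should accept the generated file.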
Quick Reference
# Prometheus
promtool check config prometheus.yml
curl localhost:{prometheus-port}/api/v1/targets
# Grafana CLI
grafana-cli plugins list
grafana-cli admin reset-admin-password newpassword
# Loki
logcli query '{job="varlogs"}'
logcli labels
# Alertmanager
amtool check-config alertmanager.yml
amtool alert query
amtool silence add alertname=HighMemory --duration=1h --comment="planned maintenance"
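The two validation commands above lend themselves to a pre-deploy check. The following is a minimal sketch, assuming the config files sit in the current directory under the names used above. The `validate` helper is hypothetical; it skips a check when the tool or the file is absent.

```shell
#!/bin/sh
# Sketch: run a config validator only if both the tool and the file exist.
# `validate` is a hypothetical helper, not part of promtool or amtool.
validate() {
    tool=$1
    file=$2
    shift 2
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "skip: $tool not installed"
        return 0
    fi
    if [ ! -f "$file" ]; then
        echo "skip: $file not found"
        return 0
    fi
    # Remaining arguments are the subcommand, e.g. "check config".
    "$tool" "$@" "$file"
}

validate promtool prometheus.yml check config
validate amtool alertmanager.yml check-config
```

Because a missing tool is reported as a skip and not as a failure, the same script can run on workstations that have only part of the toolchain installed.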
Stack Architecture
┌─────────────────────────────────────────────────────────────────┐
│                          Visualization                          │
│                     ┌──────────────────┐                        │
│                     │     Grafana      │                        │
│                     │   (Dashboards)   │                        │
│                     └────────┬─────────┘                        │
├──────────────────────────────┼──────────────────────────────────┤
│                              │                                  │
│           ┌──────────────────┼──────────────────┐               │
│           │                  │                  │               │
│    ┌──────▼──────┐    ┌──────▼──────┐    ┌──────▼──────┐        │
│    │ Prometheus  │    │    Loki     │    │Alertmanager │        │
│    │  (Metrics)  │    │   (Logs)    │    │  (Alerts)   │        │
│    └──────▲──────┘    └──────▲──────┘    └─────────────┘        │
│           │                  │                                  │
├───────────┼──────────────────┼──────────────────────────────────┤
│           │                  │           Collection Layer       │
│    ┌──────┴──────┐    ┌──────┴──────┐                           │
│    │  Exporters  │    │  Promtail   │                           │
│    │ node_export │    │ (Log ship)  │                           │
│    └──────▲──────┘    └──────▲──────┘                           │
│           │                  │                                  │
├───────────┴──────────────────┴──────────────────────────────────┤
│                         Infrastructure                          │
│    ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐     │
│    │ Servers │    │ Network │    │   K8s   │    │  Apps   │     │
│    └─────────┘    └─────────┘    └─────────┘    └─────────┘     │
└─────────────────────────────────────────────────────────────────┘
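Each component in the diagram exposes an HTTP health endpoint, so the whole stack can be probed from the shell. The sketch below assumes every service runs on localhost at its upstream default port (Prometheus 9090, Loki 3100, Alertmanager 9093, Grafana 3000); substitute the hosts and ports of your own deployment. The `probe` and `check_stack` names are hypothetical.

```shell
#!/bin/sh
# Sketch: probe the health endpoint of each stack component.
# Hosts and ports are assumed upstream defaults; adjust for your deployment.

probe() {
    # Fails on connection errors, timeouts, or HTTP status >= 400.
    curl -fsS --max-time 2 -o /dev/null "$1" 2>/dev/null
}

check_stack() {
    # One "name url" pair per line; prints "name UP" or "name DOWN".
    while read -r name url; do
        if probe "$url"; then
            echo "$name UP"
        else
            echo "$name DOWN"
        fi
    done <<EOF
prometheus http://localhost:9090/-/healthy
loki http://localhost:3100/ready
alertmanager http://localhost:9093/-/healthy
grafana http://localhost:3000/api/health
EOF
}

check_stack
```

Keeping the probe in its own function makes it easy to swap `curl` for another client or to stub it out when testing the loop.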
Documentation Index
Metrics (Prometheus)
| Page | Description |
|---|---|
| | Architecture, concepts, when to use |
| | Systemd, Podman, configuration |
| | Query language, selectors, functions |
| | node_exporter, blackbox, custom metrics |
Logs (Loki)
| Page | Description |
|---|---|
| | Architecture, comparison to ELK |
| | Log query language, filters, aggregations |
| | Log collection agent configuration |
Visualization (Grafana)
| Page | Description |
|---|---|
| | Capabilities, CLI administration |
| | Building, exporting, JSON models |
| | Dashboards and datasources as code |