Observability Operations

A CLI-first approach to observability for modern infrastructure, covering metrics, logs, and alerting.

The Three Pillars

Pillar    Tool             Purpose
--------  ---------------  ------------------------------------------
Metrics   Prometheus       Time-series data, resource usage, counters
Logs      Loki + Promtail  Centralized log aggregation and search
Alerting  Alertmanager     Notifications, routing, silencing

Quick Reference

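# Prometheus: validate the config, then list scrape targets
# ({prometheus-port} is a placeholder for your Prometheus port)
promtool check config prometheus.yml
curl localhost:{prometheus-port}/api/v1/targets

# Grafana CLI: list installed plugins, reset the admin password
grafana-cli plugins list
grafana-cli admin reset-admin-password newpassword

# Loki: query a log stream, list known labels
logcli query '{job="varlogs"}'
logcli labels

# Alertmanager: validate the config, list firing alerts, silence one for an hour
# (amtool requires a comment on new silences by default)
amtool check-config alertmanager.yml
amtool alert query
amtool silence add alertname=HighMemory --duration=1h --comment="planned maintenance"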
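
The targets endpoint above returns a JSON document, which is awkward to read in a terminal. The following is a minimal sketch of one way to summarize it, assuming the response is compact JSON with a "health" field per target (values such as up, down, unknown). The function name count_targets and the PROM_URL variable are hypothetical and not part of any tool listed here.

```shell
#!/bin/sh
# Sketch: summarize scrape-target health from the Prometheus targets API.
# Assumption: the response is compact JSON, i.e. "health":"up" with no spaces.

# count_targets STATE
# Reads a /api/v1/targets response body on stdin and prints how many
# targets report the given health state.
count_targets() {
    # grep -o prints one line per match; "|| true" keeps a zero-match
    # result from being treated as a failure.
    { grep -o "\"health\":\"$1\"" || true; } | wc -l | tr -d ' '
}

# Usage (needs a reachable Prometheus; PROM_URL is a hypothetical variable
# holding its base URL):
#   curl -s "$PROM_URL/api/v1/targets" | count_targets down
```

Plain grep is used here only to keep the sketch free of extra dependencies; a JSON-aware tool such as jq would be more robust if it is available.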

Stack Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Visualization                             │
│                     ┌──────────────────┐                        │
│                     │     Grafana      │                        │
│                     │   (Dashboards)   │                        │
│                     └────────┬─────────┘                        │
├──────────────────────────────┼──────────────────────────────────┤
│                              │                                   │
│           ┌──────────────────┼──────────────────┐               │
│           │                  │                  │               │
│    ┌──────▼──────┐    ┌──────▼──────┐    ┌──────▼──────┐       │
│    │ Prometheus  │    │    Loki     │    │Alertmanager │       │
│    │  (Metrics)  │    │   (Logs)    │    │  (Alerts)   │       │
│    └──────▲──────┘    └──────▲──────┘    └─────────────┘       │
│           │                  │                                   │
├───────────┼──────────────────┼──────────────────────────────────┤
│           │                  │           Collection Layer        │
│    ┌──────┴──────┐    ┌──────┴──────┐                           │
│    │  Exporters  │    │  Promtail   │                           │
│    │ node_export │    │  (Log ship) │                           │
│    └──────▲──────┘    └──────▲──────┘                           │
│           │                  │                                   │
├───────────┴──────────────────┴──────────────────────────────────┤
│                        Infrastructure                            │
│    ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐          │
│    │ Servers │  │ Network │  │   K8s   │  │   Apps  │          │
│    └─────────┘  └─────────┘  └─────────┘  └─────────┘          │
└─────────────────────────────────────────────────────────────────┘
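
Each service in the diagram exposes an HTTP readiness or health endpoint, so the stack can be smoke-tested from the command line, one layer at a time. The sketch below is one possible way to do that. The function names ready_path and probe, and the base URLs passed to them, are hypothetical; substitute the addresses of your own instances.

```shell
#!/bin/sh
# Sketch: probe one component of the stack through its readiness endpoint.

# ready_path COMPONENT
# Prints the readiness/health path for a component, or fails if unknown.
ready_path() {
    case "$1" in
        prometheus|alertmanager) echo "/-/ready" ;;
        loki)                    echo "/ready" ;;
        grafana)                 echo "/api/health" ;;
        *)                       return 1 ;;
    esac
}

# probe COMPONENT BASE_URL
# Prints "<component>: ok" or "<component>: FAIL" and returns non-zero
# on failure.
probe() {
    path=$(ready_path "$1") || { echo "$1: unknown component"; return 1; }
    # -f: treat HTTP errors as failures, -sS: quiet but still show errors
    if curl -fsS -o /dev/null --max-time 5 "$2$path"; then
        echo "$1: ok"
    else
        echo "$1: FAIL"
        return 1
    fi
}

# Usage (PROMETHEUS_URL is a hypothetical variable, e.g. http://host:port):
#   probe prometheus "$PROMETHEUS_URL"
```

Probing from the bottom of the diagram upward (collection layer, then Prometheus and Loki, then Grafana) tends to localize a failure faster than starting at the dashboards.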

Documentation Index

Metrics (Prometheus)

Page          Description
------------  ---------------------------------------
Overview      Architecture, concepts, when to use
Installation  Systemd, Podman, configuration
PromQL        Query language, selectors, functions
Exporters     node_exporter, blackbox, custom metrics

Logs (Loki)

Page      Description
--------  -----------------------------------------
Overview  Architecture, comparison to ELK
LogQL     Log query language, filters, aggregations
Promtail  Log collection agent configuration

Visualization (Grafana)

Page          Description
------------  ----------------------------------
Overview      Capabilities, CLI administration
Dashboards    Building, exporting, JSON models
Provisioning  Dashboards and datasources as code

Alerting

Page          Description
------------  ---------------------------------
Alertmanager  Configuration, routing, receivers
Alert Rules   Prometheus alerting rules syntax