Observability Operations

Project Summary

Field             Value
PRJ ID            PRJ-SPOKE-010
Owner             Evan Rosado
Priority          P2 (Medium)
Status            Planned
Repository        ~/atelier/_bibliotheca/domus-o11y-ops
Antora Component  o11y
Antora Title      Observability Operations
Category          Observability
2026 Commits      8
Site URL          docs.domusdigitalis.dev/o11y/

Purpose

The Observability Operations component documents the metrics, logging, and alerting stack: Prometheus for metrics collection, Grafana for visualization, Loki for log aggregation, and Alertmanager for notification routing.

It covers PromQL query patterns, Grafana dashboard design, exporter configuration (node_exporter, blackbox_exporter), and alert rule development for the Domus infrastructure.
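As a rough illustration of the PromQL patterns this component is meant to document, the sketch below builds instant-query URLs for the Prometheus HTTP API (`/api/v1/query`). The base URL, the helper name `instant_query_url`, and the two example expressions are assumptions for illustration and are not taken from the Domus repository. The metric names `node_cpu_seconds_total` and `probe_success` are the standard ones exposed by node_exporter and blackbox_exporter.

```python
from urllib.parse import urlencode

# Hypothetical base URL; the real Prometheus address is not given on this page.
PROMETHEUS_URL = "http://prometheus.example.internal:9090"


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a URL for the Prometheus HTTP API instant-query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Per-instance CPU utilisation in percent, derived from node_exporter counters.
CPU_BUSY_PERCENT = (
    "100 * (1 - avg by (instance) "
    '(rate(node_cpu_seconds_total{mode="idle"}[5m])))'
)

# Targets whose blackbox_exporter probe is currently failing.
PROBE_FAILED = "probe_success == 0"

if __name__ == "__main__":
    examples = [
        ("cpu_busy_percent", CPU_BUSY_PERCENT),
        ("probe_failed", PROBE_FAILED),
    ]
    for name, expr in examples:
        print(name, instant_query_url(PROMETHEUS_URL, expr))
```

The same expressions could plausibly be reused as the basis for recording rules or alert rules. Keeping them as named constants is only one way to keep queries reviewable alongside the documentation.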

Scope

In Scope

  • Prometheus server configuration and scrape targets

  • PromQL query patterns and recording rules

  • Grafana dashboard design and provisioning

  • Loki log aggregation and LogQL queries

  • Alertmanager notification routing

  • Node exporter and blackbox exporter deployment

  • k3s ServiceMonitor and PodMonitor CRDs

  • Observability-as-code patterns

Out of Scope

  • SIEM detection engineering (covered by siem-ops)

  • Zabbix agent configuration (covered by infra-ops)

  • Application-level instrumentation (covered by respective repos)

Status

Indicator          Detail
Activity Level     Planned: 8 commits, early scaffolding
Maturity           Early: component structure and attribute system defined
Last Activity      2026
Key Milestone      Prometheus/Grafana/Loki attribute definitions established
Deployment Status  Monitoring stack running on k3s, documentation in early stages

Metadata

Field         Value
PRJ ID        PRJ-SPOKE-010
Author        Evan Rosado
Date Created  2026-03-30
Last Updated  2026-03-30
Status        Planned
Next Review   2026-04-15