Association Engine

Project Summary

A bidirectional association engine that makes implicit relationships between documents, projects, certifications, skills, and any other entity explicit and queryable. Built on the concept Brian Kernighan called one of the most important in computing: the associative array.

The domus ecosystem has 3,486 files with thousands of implicit connections: a project links to a certification, a certification links to an education track, an incident links to a change request, a skill compounds another skill. Today these connections exist as scattered xref: links and human memory. The association engine makes them a first-class data structure: a directed labeled multigraph in which looking up a key's forward or reverse edges is an average-case O(1) dict access.

The Concept

An associative array maps a name to a value — not a position, not an index, a name. That’s the shift Kernighan described: computers think in numbered slots, humans think in named relationships. The associative array bridges that gap.

In Python, this is a dict:

# A regular array — lookup by position (index 0, 1, 2)
tools = ["awk", "sed", "jq"]
tools[0]  # → "awk" — you need to know the position

# An associative array (dict) — lookup by name
tools = {
    "awk": "field extraction and text processing",
    "sed": "stream editing and substitution",
    "jq":  "JSON processing and transformation",
}
tools["awk"]  # → "field extraction..." — you ask by name, not position

The association engine extends this to bidirectional named relationships:

# Forward: what does CISSP connect to?
graph.forward["CISSP"]
# → {"blocks": ["remote-work", "consulting"], "requires": ["security-patterns"]}

# Reverse: what connects to remote-work?
graph.reverse["remote-work"]
# → {"blocked-by": ["CISSP"]}

Two dicts. One maps forward (source → targets), one maps backward (target → sources). Every associate() call writes to both. That’s the entire engine.
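A minimal sketch of what those two dicts could look like in code. The class name AssociationGraph and the associate() method come from this page; the method signature, the explicit inverse argument (for labels such as "blocked-by"), and the duplicate check are assumptions made for illustration, not the final design.

```python
class AssociationGraph:
    """Two dicts: forward (source -> relation -> targets) and reverse."""

    def __init__(self) -> None:
        self.forward: dict[str, dict[str, list[str]]] = {}
        self.reverse: dict[str, dict[str, list[str]]] = {}

    def associate(self, source: str, relation: str, target: str, inverse: str) -> None:
        """Record source -relation-> target and its mirror in one call.

        `inverse` is the label used on the reverse side (assumed parameter).
        """
        targets = self.forward.setdefault(source, {}).setdefault(relation, [])
        if target not in targets:  # keep repeated calls idempotent
            targets.append(target)
        sources = self.reverse.setdefault(target, {}).setdefault(inverse, [])
        if source not in sources:
            sources.append(source)


graph = AssociationGraph()
graph.associate("CISSP", "blocks", "remote-work", inverse="blocked-by")
graph.associate("CISSP", "blocks", "consulting", inverse="blocked-by")
graph.associate("CISSP", "requires", "security-patterns", inverse="required-by")

print(graph.forward["CISSP"])
# → {'blocks': ['remote-work', 'consulting'], 'requires': ['security-patterns']}
print(graph.reverse["remote-work"])
# → {'blocked-by': ['CISSP']}
```

Plain dicts with setdefault are used here rather than defaultdict so that a lookup of an unknown key raises KeyError instead of silently creating an empty entry.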

Why This Matters

The problem it solves:

When you complete a VyOS deployment, you manually update: the deployment page, the HA phases tracker, the personal infrastructure tracker, the active initiatives, and the carryover backlog. Five places. Same fact. That’s why things drift — the connections between them are in your head, not in a data structure.

With the association engine:

# One command captures the relationship
assoc new DEPLOY-2026-03-07-vyos completes ha-phase-3
assoc new ha-phase-3 updates personal-infra
assoc new ha-phase-3 updates active-initiatives

# Query: what did this deployment affect?
assoc query DEPLOY-2026-03-07-vyos
# → completes: [ha-phase-3]

# Reverse: what completed ha-phase-3?
assoc reverse ha-phase-3
# → completed-by: [DEPLOY-2026-03-07-vyos]
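Once the relationships are recorded, "which trackers does this deployment touch?" becomes a graph walk instead of a memory exercise. The sketch below follows forward edges breadth-first over a plain dict that mirrors the three assoc new commands above. The function name downstream is hypothetical and is not one of the planned CLI commands.

```python
from collections import deque

# Forward edges from the example above: source -> relation -> targets.
forward = {
    "DEPLOY-2026-03-07-vyos": {"completes": ["ha-phase-3"]},
    "ha-phase-3": {"updates": ["personal-infra", "active-initiatives"]},
}


def downstream(forward: dict, start: str) -> list[str]:
    """Return every entity reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Follow every relation type; the label does not matter for reachability.
        for targets in forward.get(node, {}).values():
            for target in targets:
                if target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
    return order


print(downstream(forward, "DEPLOY-2026-03-07-vyos"))
# → ['ha-phase-3', 'personal-infra', 'active-initiatives']
```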

The learning vehicle:

This is where Python gets learned — not from a textbook, but by building a real tool that solves a real problem. Each phase teaches specific Python concepts:

| Phase | Python Concepts Learned |
|---|---|
| Phase 0: Setup | uv package manager, pyproject.toml, project structure, virtual environments |
| Phase 1: Core | Classes (class), methods (def), dicts ({}), type hints (str), self, __init__, return |
| Phase 2: YAML | File I/O (open, read, write), import, modules, pathlib.Path, error handling (try/except) |
| Phase 3: CLI | Third-party libraries (typer), decorators (@app.command()), argument parsing, __main__ |
| Phase 4: API | FastAPI integration, HTTP methods, request/response models, dependency injection |
| Phase 5: Seed | Loops (for), list comprehensions, data modeling, real-world data wrangling |
| Phase 6: Go | Porting concepts to a new language; the concepts transfer, only syntax changes |

What It Enables

# CLI — from the terminal
assoc query CISSP                    # What does CISSP connect to?
assoc reverse remote-work            # What blocks remote work?
assoc path CISSP consulting          # How does CISSP lead to consulting?
assoc list --relations               # What types of relationships exist?

# API — over HTTP (domus-api integration)
curl -s localhost:8080/associations/CISSP | jq
curl -s localhost:8080/associations/remote-work/reverse | jq

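# Pipeline: compose with other tools (assumes assoc query emits JSON)
# The trailing ? skips entities that have no "blocks" relation instead of erroring
assoc list | xargs -I{} assoc query {} | jq -r '.blocks[]?' | sort | uniq -c | sort -rn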
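The assoc path command above implies a shortest-path search over the forward dict. One way it might work is a breadth-first search that remembers which hop reached each node. This is a sketch under assumptions: the function name find_path is hypothetical, and the "enables" edge is invented sample data so that the path has more than one hop.

```python
from collections import deque

# Hypothetical sample data: source -> relation -> targets.
forward = {
    "CISSP": {"blocks": ["remote-work"], "requires": ["security-patterns"]},
    "remote-work": {"enables": ["consulting"]},
}


def find_path(forward: dict, start: str, goal: str):
    """Return the shortest chain of (source, relation, target) hops, or None."""
    if start == goal:
        return []
    parents = {start: None}  # node -> (previous node, relation used to reach it)
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for relation, targets in forward.get(node, {}).items():
            for target in targets:
                if target in parents:
                    continue  # already reached by an equal or shorter route
                parents[target] = (node, relation)
                if target == goal:
                    # Walk the parent links back to the start, then reverse.
                    hops = []
                    while parents[target] is not None:
                        previous, label = parents[target]
                        hops.append((previous, label, target))
                        target = previous
                    return hops[::-1]
                queue.append(target)
    return None


for source, relation, target in find_path(forward, "CISSP", "consulting"):
    print(f"{source} -{relation}-> {target}")
# → CISSP -blocks-> remote-work
# → remote-work -enables-> consulting
```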

Status

| Phase | Description | Status | Notes |
|---|---|---|---|
| Phase 0: Repository Setup | uv init, project structure, dev tooling | ❌ Not started | 7 commands. Creates the workspace. |
| Phase 1: Core Data Structure | AssociationGraph class: two dicts, one method, everything follows | ❌ Not started | Pure Python. No dependencies. This is where dicts, classes, and methods get learned. |
| Phase 2: YAML Persistence | Load/save association graph from YAML files. File-backed, git-versioned. | ❌ Not started | PyYAML. File I/O, pathlib, error handling. |
| Phase 3: CLI (Typer) | assoc new, assoc query, assoc reverse, assoc path, assoc list | ❌ Not started | First standalone CLI tool. Decorators, argument parsing. |
| Phase 4: domus-api Integration | GET /associations/{key}, POST /associations, GET /associations/{key}/reverse | ❌ Not started | FastAPI endpoints. Same DI pattern as domus-api. |
| Phase 5: Seed Data | Populate with actual domus-captures relationships: projects, certs, skills, patterns | ❌ Not started | The real test: does the graph reveal connections you didn’t see? |
| Phase 6: Go Port | Rewrite core + CLI in Go for single-binary distribution | ❌ Future | After the Python version is proven. Kernighan co-authored The Go Programming Language. |

Assessment

The Problem

The domus ecosystem has 3,486 files with thousands of implicit connections. A project links to a certification. A certification links to an education track. An incident links to a change request. A skill compounds another skill. These connections exist as scattered xref: links, human memory, and manual tracker updates that drift within days.

When a VyOS deployment completes, five files need updating. When a certification status changes, three trackers need syncing. The connections are real but the system doesn’t know about them — the human is the integration layer.

The Solution

A bidirectional association engine based on Kernighan’s associative array concept. Two Python dicts — one forward, one reverse. Every associate() call writes to both. The graph becomes queryable from the terminal (assoc query CISSP) and over HTTP (curl localhost:8080/associations/CISSP).
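Because every write goes to both dicts, the reverse dict should always be derivable from the forward one. The sketch below rebuilds the reverse side from forward edges and compares it with a stored copy, which is one way to detect drift after a hand edit. The INVERSE table and both function names are assumptions for illustration.

```python
# Hypothetical mapping from each relation to the label used on the reverse side.
INVERSE = {"blocks": "blocked-by", "requires": "required-by"}


def build_reverse(forward: dict, inverse_names: dict) -> dict:
    """Derive target -> inverse relation -> sources from the forward dict."""
    reverse: dict = {}
    for source, relations in forward.items():
        for relation, targets in relations.items():
            label = inverse_names[relation]
            for target in targets:
                reverse.setdefault(target, {}).setdefault(label, []).append(source)
    return reverse


def is_consistent(forward: dict, reverse: dict, inverse_names: dict) -> bool:
    """True when the stored reverse dict matches the one derived from forward."""
    return build_reverse(forward, inverse_names) == reverse


forward = {
    "CISSP": {"blocks": ["remote-work", "consulting"], "requires": ["security-patterns"]},
}
reverse = build_reverse(forward, INVERSE)

print(reverse["remote-work"])
# → {'blocked-by': ['CISSP']}
print(is_consistent(forward, reverse, INVERSE))
# → True
```

One consequence worth weighing: if the reverse side can always be rebuilt, only the forward edges would need to be persisted.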

Prerequisites

| Requirement | Description | Status |
|---|---|---|
| Python 3.13 | Already installed (domus-api uses it) | ✅ Ready |
| uv package manager | Already installed (domus-api uses it) | ✅ Ready |
| Terminal comfort | cd, ls, cat, pipes, git (daily tools) | ✅ Ready |
| domus-api understanding | You built it, so you know FastAPI, Pydantic, pytest | ✅ Ready |
| Python language knowledge | This is what the project teaches | ❌ Learning in progress |

Design Decisions

| Decision | Rationale |
|---|---|
| Python first, Go later | Learn one language properly. domus-api is Python. Integration is immediate. Go comes after the concept is proven. |
| YAML for persistence | Git-versioned, human-readable, editable in vim. Same pattern as antora.yml. |
| Bidirectional by default | Every associate() writes forward and reverse. No orphan references. This extends Kernighan’s associative array so the lookup works in both directions. |
| File-backed, not database | Same philosophy as domus-api. The filesystem is the database. YAML files are the records. Path.rglob("*.yml") is the query engine. |
| CLI + API, not GUI | Terminal-first. The API is for domus-api integration and mobile access over Tailscale. No frontend needed. |
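The "filesystem is the database" decision rests on pathlib: rglob is a method on a Path instance and walks the tree recursively. A small sketch, using a temporary directory and invented file names, of how association files might be discovered:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Hypothetical layout: one YAML file per entity, grouped by kind.
    (root / "certs").mkdir()
    (root / "certs" / "cissp.yml").write_text("blocks: [remote-work]\n")
    (root / "projects").mkdir()
    (root / "projects" / "vyos.yml").write_text("completes: [ha-phase-3]\n")
    (root / "README.txt").write_text("not an association file\n")

    # rglob matches the pattern in root and in every subdirectory.
    found = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.yml"))

print(found)
# → ['certs/cissp.yml', 'projects/vyos.yml']
```

The results are sorted because rglob makes no ordering guarantee, and a stable order keeps git diffs and test output predictable.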

| Field | Value |
|---|---|
| PRJ ID | PRJ-2026-04-association-engine |
| Author | Evan Rosado |
| Created | 2026-04-07 |
| Updated | 2026-04-07 |
| Status | Draft |
| Category | Software / Data Structures / Knowledge Engineering |
| Priority | P1 |
| Stack | Python, YAML, Typer (CLI), FastAPI (API integration) |
| Repository | ~/atelier/_projects/personal/association-engine/ |
| Inspiration | Brian Kernighan: associative arrays as a foundational concept (AWK, Go) |