Phase 8: Seed Data
Objective
Populate the association graph with real relationships from your domus-captures knowledge base. This is where the engine becomes useful — not as a programming exercise, but as a tool that reveals connections across your projects, certifications, skills, and infrastructure.
Steps
1. Organize by domain
Create one YAML file per domain in the data/ directory:
data/
├── certifications.yml # CISSP, CCNP, and what they cover
├── projects.yml # domus-* projects and their dependencies
├── skills.yml # Tools, languages, and what uses them
└── infrastructure.yml # Network components and their relations
Each file is independent.
The load_directory() method merges them at runtime.
This means you can add, remove, or reorganize files without changing code.
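The merge step can be pictured roughly as follows. This is a minimal sketch, not the engine's actual `load_directory()`: the function name `merge_documents` and the `{source: {relation: [targets]}}` shape are assumptions made for illustration, and the two dictionaries stand in for what `yaml.safe_load()` would return for each file.

```python
from collections import defaultdict


def merge_documents(documents):
    """Merge parsed YAML documents into one forward index.

    Each document is assumed to look like
    {"associations": [{"source": ..., "relation": ..., "target": ...}, ...]}.
    Returns {source: {relation: [targets, ...]}}.
    """
    graph = defaultdict(lambda: defaultdict(list))
    for doc in documents:
        # An empty file parses to None; treat it as having no associations.
        for record in (doc or {}).get("associations") or []:
            targets = graph[record["source"]][record["relation"]]
            # Skip exact duplicates so overlapping files do not double-count.
            if record["target"] not in targets:
                targets.append(record["target"])
    # Convert to plain dicts so the result prints and serializes cleanly.
    return {source: dict(relations) for source, relations in graph.items()}


# Hypothetical stand-ins for two parsed files.
projects = {"associations": [
    {"source": "domus-infra-ops", "relation": "uses", "target": "Vault"},
    {"source": "domus-infra-ops", "relation": "covers", "target": "PKI"},
]}
skills = {"associations": [
    {"source": "Vault", "relation": "covers", "target": "PKI"},
]}

merged = merge_documents([projects, skills])
print(merged["domus-infra-ops"])
```

Because every file contributes to the same index, the order in which files are read should not matter for queries, only for the order of targets within a relation.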
2. Certifications
Create data/certifications.yml:
associations:
  # --- CISSP domains ---
  - source: CISSP
    relation: covers
    target: security-risk-management
  - source: CISSP
    relation: covers
    target: asset-security
  - source: CISSP
    relation: covers
    target: security-architecture
  - source: CISSP
    relation: covers
    target: communication-network-security
  - source: CISSP
    relation: covers
    target: identity-access-management
  - source: CISSP
    relation: covers
    target: security-assessment-testing
  - source: CISSP
    relation: covers
    target: security-operations
  - source: CISSP
    relation: covers
    target: software-development-security
  # --- CCNP ---
  - source: CCNP
    relation: covers
    target: routing
  - source: CCNP
    relation: covers
    target: switching
  - source: CCNP
    relation: covers
    target: BGP
  - source: CCNP
    relation: covers
    target: OSPF
  - source: CCNP
    relation: covers
    target: network-automation
  # --- Cross-certification links ---
  - source: CISSP
    relation: relates-to
    target: CCNP
  - source: communication-network-security
    relation: relates-to
    target: routing
  - source: communication-network-security
    relation: relates-to
    target: switching
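Hand-written seed files are easy to get subtly wrong (a misspelled relation, a dropped key), so a small check before loading can help. The sketch below is an illustration only: `validate_records`, `KNOWN_RELATIONS`, and `REQUIRED_KEYS` are hypothetical names, and the relation set is simply the five relation names used in this phase's files, not a list the engine is known to enforce.

```python
# Relation names that appear in this phase's seed files (an assumption,
# not a constraint confirmed by the engine itself).
KNOWN_RELATIONS = {"covers", "relates-to", "uses", "teaches", "requires"}
REQUIRED_KEYS = ("source", "relation", "target")


def validate_records(records):
    """Return a list of human-readable problems; an empty list means clean."""
    problems = []
    for index, record in enumerate(records):
        # A key counts as missing if it is absent or empty.
        missing = [key for key in REQUIRED_KEYS if not record.get(key)]
        if missing:
            problems.append(f"record {index}: missing {', '.join(missing)}")
            continue
        if record["relation"] not in KNOWN_RELATIONS:
            problems.append(
                f"record {index}: unknown relation {record['relation']!r}"
            )
        if record["source"] == record["target"]:
            problems.append(
                f"record {index}: self-reference on {record['source']!r}"
            )
    return problems


records = [
    {"source": "CISSP", "relation": "covers", "target": "asset-security"},
    {"source": "CCNP", "relation": "cover", "target": "routing"},  # typo
    {"source": "CCNP", "target": "BGP"},  # relation key dropped
]
problems = validate_records(records)
for line in problems:
    print(line)
```

Returning a list instead of raising on the first problem lets you fix a whole file in one pass.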
3. Projects
Create data/projects.yml:
associations:
  # --- association-engine ---
  - source: association-engine
    relation: uses
    target: Python
  - source: association-engine
    relation: uses
    target: FastAPI
  - source: association-engine
    relation: uses
    target: typer
  - source: association-engine
    relation: uses
    target: PyYAML
  - source: association-engine
    relation: uses
    target: pytest
  - source: association-engine
    relation: teaches
    target: classes
  - source: association-engine
    relation: teaches
    target: dicts
  - source: association-engine
    relation: teaches
    target: testing
  # --- domus-api ---
  - source: domus-api
    relation: uses
    target: Python
  - source: domus-api
    relation: uses
    target: FastAPI
  - source: domus-api
    relation: relates-to
    target: association-engine
  # --- domus-captures ---
  - source: domus-captures
    relation: uses
    target: AsciiDoc
  - source: domus-captures
    relation: uses
    target: Antora
  - source: domus-captures
    relation: relates-to
    target: association-engine
  # --- domus-infra-ops ---
  - source: domus-infra-ops
    relation: uses
    target: Ansible
  - source: domus-infra-ops
    relation: uses
    target: Vault
  - source: domus-infra-ops
    relation: covers
    target: 802.1X
  - source: domus-infra-ops
    relation: covers
    target: DNS
  - source: domus-infra-ops
    relation: covers
    target: PKI
4. Skills and tools
Create data/skills.yml:
associations:
  # --- Languages ---
  - source: Python
    relation: uses
    target: pip
  - source: Python
    relation: uses
    target: uv
  - source: Python
    relation: uses
    target: pytest
  - source: Python
    relation: uses
    target: ruff
  # --- CLI tools ---
  - source: awk
    relation: relates-to
    target: sed
  - source: awk
    relation: relates-to
    target: grep
  - source: jq
    relation: relates-to
    target: awk
  - source: jq
    relation: covers
    target: JSON
  # --- Frameworks ---
  - source: FastAPI
    relation: uses
    target: Pydantic
  - source: FastAPI
    relation: uses
    target: uvicorn
  - source: Antora
    relation: uses
    target: AsciiDoc
  - source: Antora
    relation: uses
    target: Node.js
  # --- Security tools ---
  - source: Vault
    relation: covers
    target: PKI
  - source: Vault
    relation: covers
    target: secrets-management
  - source: ISE
    relation: covers
    target: 802.1X
  - source: ISE
    relation: covers
    target: RADIUS
  - source: ISE
    relation: relates-to
    target: Active-Directory
5. Infrastructure
Create data/infrastructure.yml:
associations:
  # --- Network layers ---
  - source: 802.1X
    relation: requires
    target: RADIUS
  - source: 802.1X
    relation: requires
    target: PKI
  - source: 802.1X
    relation: requires
    target: Active-Directory
  - source: RADIUS
    relation: uses
    target: ISE
  - source: PKI
    relation: uses
    target: Vault
  # --- DNS ---
  - source: DNS
    relation: uses
    target: BIND
  - source: DNS
    relation: requires
    target: Active-Directory
  # --- Services ---
  - source: Active-Directory
    relation: covers
    target: Kerberos
  - source: Active-Directory
    relation: covers
    target: LDAP
  - source: Active-Directory
    relation: covers
    target: DNS
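With all four files written, one way to see which entities span more than one domain is to record the files each entity appears in. The following is a rough sketch under stated assumptions: `entities_by_file` and `shared_entities` are hypothetical helpers, and the `files` mapping is a hand-copied subset of the records above standing in for the parsed YAML.

```python
from collections import defaultdict


def entities_by_file(files):
    """Map each entity to the sorted list of files it appears in.

    files: {filename: [{"source": ..., "relation": ..., "target": ...}, ...]}
    """
    seen = defaultdict(set)
    for filename, records in files.items():
        for record in records:
            # An entity counts whether it is the source or the target.
            seen[record["source"]].add(filename)
            seen[record["target"]].add(filename)
    return {entity: sorted(names) for entity, names in seen.items()}


def shared_entities(files, minimum=2):
    """Keep only entities that appear in at least `minimum` files."""
    return {
        entity: names
        for entity, names in entities_by_file(files).items()
        if len(names) >= minimum
    }


# A small subset of the seed data, copied by hand for illustration.
files = {
    "projects.yml": [
        {"source": "domus-infra-ops", "relation": "covers", "target": "PKI"},
    ],
    "skills.yml": [
        {"source": "Vault", "relation": "covers", "target": "PKI"},
    ],
    "infrastructure.yml": [
        {"source": "802.1X", "relation": "requires", "target": "PKI"},
        {"source": "PKI", "relation": "uses", "target": "Vault"},
    ],
}

shared = shared_entities(files)
for entity, names in sorted(shared.items()):
    print(entity, names)
```

Entities that show up in several files are good candidates for the reverse queries in the next step.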
6. Query for hidden connections
Now run queries and look for relationships you did not explicitly think about:
# What does 802.1X require?
uv run assoc query 802.1X | jq
# What is PKI used by? (reverse query)
uv run assoc reverse PKI | jq
# What connects to Active-Directory?
uv run assoc reverse Active-Directory | jq
# How many entities are in the graph?
uv run assoc list | wc -l
# What relations exist?
uv run assoc relations
# The interesting query: what does CISSP share with your infrastructure?
# CISSP covers communication-network-security, which relates-to routing.
# routing is covered by CCNP, which also covers network-automation.
# The chain: CISSP → communication-network-security → routing → CCNP → network-automation
The graph does not traverse chains automatically yet. That would require a graph traversal algorithm (BFS or DFS), which is a future enhancement. Even so, single-hop queries already reveal connections: PKI appears in projects (domus-infra-ops covers it), tools (Vault covers it), and infrastructure (802.1X requires it). That convergence is the point.
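For a sense of what that future traversal might look like, here is a minimal breadth-first search sketch. It is not part of the engine: `build_neighbors` and `shortest_path` are hypothetical names, associations are represented as plain `(source, relation, target)` tuples, and edges are treated as undirected because the chain above follows "covers" backwards from routing to CCNP.

```python
from collections import defaultdict, deque


def build_neighbors(triples):
    """Undirected adjacency: each (source, relation, target) links both ways."""
    neighbors = defaultdict(set)
    for source, _relation, target in triples:
        neighbors[source].add(target)
        neighbors[target].add(source)
    return neighbors


def shortest_path(triples, start, goal):
    """Breadth-first search; returns the entities on a shortest path, or None."""
    neighbors = build_neighbors(triples)
    if start not in neighbors or goal not in neighbors:
        return None
    previous = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            # Walk the predecessor links back to the start.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in sorted(neighbors[current]):  # sorted for stable output
            if neighbor not in previous:
                previous[neighbor] = current
                queue.append(neighbor)
    return None


# Subset of certifications.yml. The direct CISSP relates-to CCNP link is
# left out on purpose so the longer chain is the only route.
triples = [
    ("CISSP", "covers", "communication-network-security"),
    ("communication-network-security", "relates-to", "routing"),
    ("CCNP", "covers", "routing"),
    ("CCNP", "covers", "network-automation"),
]

path = shortest_path(triples, "CISSP", "network-automation")
print(" -> ".join(path))
```

With the full seed data the direct CISSP to CCNP link would give a shorter route, which is exactly the kind of result a traversal makes visible and single-hop queries do not.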
7. Validate the data
# Count total associations across all files
awk '/source:/' data/*.yml | wc -l
# Find entities with no forward and no reverse associations (orphans)
uv run assoc list | while read -r entity; do
  fwd=$(uv run assoc query "$entity" 2>/dev/null | jq 'length')
  rev=$(uv run assoc reverse "$entity" 2>/dev/null | jq 'length')
  # Treat empty output (for example, a failed query) as zero results
  [ "${fwd:-0}" = "0" ] && [ "${rev:-0}" = "0" ] && echo "ORPHAN: $entity"
done
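The shell loop starts two processes per entity, and if `assoc list` derives its entities from the associations themselves (an assumption about the engine), it may never report anything. A complementary check is to count how often each entity appears and flag the ones that occur only once. The sketch below uses hypothetical helpers (`appearance_counts`, `weakly_connected`) over `(source, relation, target)` tuples copied from infrastructure.yml.

```python
from collections import Counter


def appearance_counts(triples):
    """Count how many associations mention each entity, on either side."""
    counts = Counter()
    for source, _relation, target in triples:
        counts[source] += 1
        counts[target] += 1
    return counts


def weakly_connected(triples, threshold=1):
    """Entities mentioned in at most `threshold` associations, sorted."""
    counts = appearance_counts(triples)
    return sorted(entity for entity, n in counts.items() if n <= threshold)


# Subset of infrastructure.yml, copied by hand for illustration.
triples = [
    ("DNS", "uses", "BIND"),
    ("DNS", "requires", "Active-Directory"),
    ("Active-Directory", "covers", "DNS"),
    ("Active-Directory", "covers", "Kerberos"),
]

leaves = weakly_connected(triples)
print(leaves)
```

An entity that appears once is not an error, but it is a hint that a relationship you know about has not been written down yet.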
Checklist
- data/certifications.yml created with CISSP and CCNP domains
- data/projects.yml created with domus-* project relationships
- data/skills.yml created with tools and language relationships
- data/infrastructure.yml created with network component relationships
- uv run assoc list shows all entities
- uv run assoc query CISSP returns 8 domains
- uv run assoc reverse PKI reveals multiple sources
- No orphan entities (every entity has at least one connection)
- All tests still pass
Verification
# Data loads without errors
uv run assoc list | head -5
# Specific queries work
uv run assoc query 802.1X | jq -e '.requires | length > 0'
# Tests unaffected
uv run pytest tests/ -v --tb=short
The graph now holds real knowledge. Every project, certification, tool, and infrastructure component is connected. Phase 9 outlines the future Go port — after the Python version has proven its value.