Phase 1: Your First Class and Dicts

Objective

Build the empty skeleton of AssociationGraph — a class that holds two dictionaries: one for forward associations (A relates-to B) and one for reverse associations (B is-related-by A). No logic yet. Just the container and the proof it works.

Python Concepts

Each concept below is followed by its plain-English explanation.

class

A blueprint for creating objects. Think of it as an AsciiDoc template: the template defines structure, and each page you generate from it is an instance. AssociationGraph is the template; each time you write AssociationGraph() you stamp out a new, independent graph.

self

A reference to this particular instance of the class. When you have two graphs, graph_a and graph_b, self inside a method tells Python which one you are operating on. Analogy: $0 in awk refers to the current record. self refers to the current object.

__init__

The initializer method, commonly called the constructor. Python calls it automatically when you create an instance with AssociationGraph(). It sets up the initial state: here, two empty dicts. Most classes you write will define one.

dict

Python’s associative array, the same concept Aho, Kernighan, and Weinberger described in The AWK Programming Language. A dict maps keys to values: {"CISSP": ["security", "certification"]}. In jq terms, it is a JSON object. In awk terms, it is an associative array.

Type hints

Annotations that document what a variable holds. dict[str, dict[str, list[str]]] says: "a dict keyed by strings whose values are dicts, themselves keyed by strings, whose values are lists of strings." Python does not enforce these at runtime; they are documentation that a type checker such as mypy can verify (ruff can flag missing annotations, but it does not check types).

list

An ordered, mutable sequence. ["a", "b", "c"] is a list. Unlike a set, a list preserves insertion order and allows duplicates.

Nested dicts

A dict inside a dict. self._forward["CISSP"]["covers"] navigates two levels: first to the key "CISSP", then to the relation "covers", which yields a list of targets. This is exactly jq '.CISSP.covers' on a JSON object.
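
As a quick illustration of the dict, list, and nested-dict entries above, the following sketch builds the same shape by hand from plain Python literals. The variable names (forward, targets, targets_with_dup) are chosen only for this illustration and are not part of AssociationGraph.

```python
# A nested dict with the same shape as the forward structure described above.
forward: dict[str, dict[str, list[str]]] = {
    "CISSP": {
        "covers": ["access-control", "cryptography"],
        "requires": ["5-years-experience"],
    }
}

# Two-level navigation: entity first, then relation (like jq '.CISSP.covers').
targets = forward["CISSP"]["covers"]
print(targets)  # ['access-control', 'cryptography']

# A list keeps duplicates; converting to a set collapses them.
targets_with_dup = targets + ["cryptography"]
print(len(targets_with_dup))       # 3
print(len(set(targets_with_dup)))  # 2

# .get() returns a default instead of raising KeyError for a missing key.
print(forward.get("CCNA", {}))  # {}
```

Indexing with square brackets raises KeyError when a key is absent, which is why .get() with a default is often the safer choice when reading.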

Steps

1. Understand the data shape

Before writing code, visualize the structure. The forward dict maps an entity to its relations and targets:

{
  "CISSP": {
    "covers": ["access-control", "cryptography"],
    "requires": ["5-years-experience"]
  }
}

The reverse dict holds the inverse view, keyed by target (only the access-control entry is shown):

{
  "access-control": {
    "covered-by": ["CISSP"]
  }
}

Two dicts. Same data. Different directions of traversal.
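
To make "same data, different directions" concrete, here is a minimal sketch that derives the reverse view from the forward one. The helper build_reverse, the INVERSE_NAMES mapping, and the "required-by" relation name are all hypothetical, invented for this illustration; the class in this phase contains no such logic, and later phases may implement it differently.

```python
# Hypothetical mapping from a relation to its inverse name (an assumption).
INVERSE_NAMES: dict[str, str] = {
    "covers": "covered-by",
    "requires": "required-by",
}


def build_reverse(
    forward: dict[str, dict[str, list[str]]],
) -> dict[str, dict[str, list[str]]]:
    """Flip source -> relation -> targets into target -> inverse -> sources."""
    reverse: dict[str, dict[str, list[str]]] = {}
    for source, relations in forward.items():
        for relation, targets in relations.items():
            inverse = INVERSE_NAMES[relation]
            for target in targets:
                # setdefault creates each missing level on first use.
                reverse.setdefault(target, {}).setdefault(inverse, []).append(source)
    return reverse


forward = {
    "CISSP": {
        "covers": ["access-control", "cryptography"],
        "requires": ["5-years-experience"],
    }
}
reverse = build_reverse(forward)
print(reverse["access-control"])  # {'covered-by': ['CISSP']}
```

Every target in the forward dict becomes a top-level key in the reverse dict, so a lookup by target is as direct as a lookup by source.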

2. Write the class

Open src/association_engine/graph.py and write:

"""Bidirectional association graph."""


class AssociationGraph:
    """A graph that stores entities, relations, and their inverses.

    Internally this is two nested dicts:
      _forward:  {source: {relation: [targets]}}
      _reverse:  {target: {inverse_relation: [sources]}}
    """

    def __init__(self) -> None:
        # Forward: source -> relation -> [targets]
        self._forward: dict[str, dict[str, list[str]]] = {}

        # Reverse: target -> inverse_relation -> [sources]
        self._reverse: dict[str, dict[str, list[str]]] = {}

Walk through this line by line:

  1. class AssociationGraph: — declares the class. The colon starts an indented block (like { in C or awk).

  2. def __init__(self) -> None: declares the constructor. def defines a function. self is always the first parameter of an instance method. -> None is a type hint saying this method returns nothing.

  3. self._forward — the underscore prefix is a Python convention meaning "private — don’t touch this from outside the class." It is not enforced, just a signal.

  4. dict[str, dict[str, list[str]]] — the type hint. Read it inside-out: a list of strings, inside a dict keyed by strings, inside another dict keyed by strings.
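
The points in this walkthrough can be checked interactively. The sketch below repeats the class definition so that it runs on its own, then shows that the underscore prefix and the type hints are conventions and not runtime rules. The variable name graph is just an example.

```python
class AssociationGraph:
    """Same skeleton as above, repeated so this snippet is self-contained."""

    def __init__(self) -> None:
        self._forward: dict[str, dict[str, list[str]]] = {}
        self._reverse: dict[str, dict[str, list[str]]] = {}


graph = AssociationGraph()  # Python calls __init__ with self bound to graph

# The leading underscore is only a convention; outside access still works.
print(graph._forward)  # {}

# __init__ returns None, matching the "-> None" annotation.
print(AssociationGraph.__init__(graph))  # None

# Type hints are not enforced at runtime: this wrong-typed assignment runs,
# although a type checker such as mypy would flag it.
graph._forward = "not a dict"  # type: ignore[assignment]
print(type(graph._forward).__name__)  # str
```

The last assignment is deliberately wrong. It is there to show why running a type checker is worthwhile, not as something to imitate.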

3. Export the class from the package

Open src/association_engine/__init__.py and write:

"""Association Engine — bidirectional knowledge graph."""

from association_engine.graph import AssociationGraph

__all__ = ["AssociationGraph"]

from ... import ... is how Python pulls a name from another module. __all__ declares the public API: when someone writes from association_engine import *, only AssociationGraph is exported.
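
The effect of __all__ on a star-import can be observed without touching the real package. The sketch below builds a throwaway in-memory module named demo_pkg, a hypothetical stand-in for association_engine, together with a hypothetical helper function that is deliberately left out of __all__.

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name: demo_pkg).
demo = types.ModuleType("demo_pkg")
exec(
    "class AssociationGraph: ...\n"
    "def helper(): ...\n"
    "__all__ = ['AssociationGraph']\n",
    demo.__dict__,
)
sys.modules["demo_pkg"] = demo

# Run a star-import into a fresh namespace and see which names arrive.
namespace: dict[str, object] = {}
exec("from demo_pkg import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)  # ['AssociationGraph']

# Names left out of __all__ are still reachable by explicit import.
from demo_pkg import helper
print(helper.__name__)  # helper
```

So __all__ only limits what a star-import copies. It does not hide anything from an explicit import.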

4. Write the first real test

Open tests/test_graph.py and replace the smoke test:

"""Tests for AssociationGraph — Phase 1."""

from association_engine.graph import AssociationGraph


class TestInit:
    """Verify the constructor creates an empty graph."""

    def test_forward_starts_empty(self) -> None:
        graph = AssociationGraph()
        assert graph._forward == {}

    def test_reverse_starts_empty(self) -> None:
        graph = AssociationGraph()
        assert graph._reverse == {}

    def test_two_instances_are_independent(self) -> None:
        a = AssociationGraph()
        b = AssociationGraph()
        a._forward["test"] = {}
        assert "test" not in b._forward

Three tests:

  1. Forward dict starts empty.

  2. Reverse dict starts empty.

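  3. Two instances do not share state. Accidental sharing is a common beginner trap: it happens when a mutable object such as a dict is defined as a class attribute or as a default argument value. You avoided it by creating the dicts inside __init__.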
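
The trap that the third test guards against can be reproduced in a few lines. SharedGraph and IndependentGraph are hypothetical classes written only for this comparison; IndependentGraph follows the same pattern as AssociationGraph.

```python
class SharedGraph:
    """Counter-example: the dict lives on the class itself."""

    _forward: dict[str, dict[str, list[str]]] = {}  # one dict for ALL instances


class IndependentGraph:
    """Same pattern as AssociationGraph: the dict is created per instance."""

    def __init__(self) -> None:
        self._forward: dict[str, dict[str, list[str]]] = {}


a, b = SharedGraph(), SharedGraph()
a._forward["test"] = {}
print("test" in b._forward)  # True: both names point at the same dict

c, d = IndependentGraph(), IndependentGraph()
c._forward["test"] = {}
print("test" in d._forward)  # False: each instance got its own dict
```

If AssociationGraph were written like SharedGraph, test_two_instances_are_independent would fail, which is the regression that test exists to catch.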

5. Run the tests

uv run pytest tests/ -v

Expected:

tests/test_graph.py::TestInit::test_forward_starts_empty PASSED
tests/test_graph.py::TestInit::test_reverse_starts_empty PASSED
tests/test_graph.py::TestInit::test_two_instances_are_independent PASSED

6. Lint

uv run ruff check src/ tests/

Fix anything ruff reports before moving on. Common first-time issues: missing trailing newline, unused imports.

Checklist

  • AssociationGraph class defined in graph.py

  • __init__ creates empty _forward and _reverse dicts

  • Type hints on both dicts

  • __init__.py exports AssociationGraph

  • Three tests in test_graph.py

  • uv run pytest tests/ -v — all pass

  • uv run ruff check src/ tests/ — clean

Verification

uv run pytest tests/ -v --tb=short
uv run ruff check src/ tests/

Both must exit cleanly. You now have a class that does nothing useful — but it exists, it is tested, and the toolchain proves it. Phase 2 gives it behavior.