PRJ: Data Forensics

Project Summary

A CLI-first toolkit for data extraction and forensic analysis. The goal is to extract truth from opaque data (documents, photos, filesystems, memory, network captures) using terminal tools exclusively. It spans both work (incident response, evidence handling) and personal use (photo library audit, system integrity, metadata hygiene).

The unifying principle: everything you touch produces data you should be able to interrogate from the CLI.
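
As one illustration of that principle, evidence handling can start with a hash manifest that is later re-checked. The following is a minimal sketch, not the project's actual workflow: it assumes GNU coreutils and findutils, and the helper names make_manifest and verify_manifest are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU coreutils (sha256sum, sort -z) and findutils (xargs -r).
# make_manifest and verify_manifest are hypothetical helper names.

# Write a SHA-256 manifest covering every regular file under a directory.
# Keep the manifest outside the directory so it does not hash itself.
make_manifest() {
  local dir=$1 manifest=$2
  find "$dir" -type f -print0 | sort -z | xargs -0 -r sha256sum > "$manifest"
}

# Re-hash the listed files; exits non-zero if any file changed or went missing.
verify_manifest() {
  sha256sum --check --quiet "$1"
}
```

A typical call might be `make_manifest ./evidence /tmp/evidence.sha256` at collection time, followed by `verify_manifest /tmp/evidence.sha256` before analysis; both paths are placeholders.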

Status

Phase                       Description                                           Status          Notes
--------------------------  ----------------------------------------------------  --------------  -----------------------------------------------------------------
0: Toolchain                Install, configure, validate all tools                ✅ Done          21 tools verified, live test on personal photo completed 2026-04-19
1: Document Pipeline        PDF extraction, OCR, batch processing                 ❌ Not started   pdftotext, tesseract, pandoc
2: Image Pipeline           Metadata extraction, dedup, geo-audit                 ❌ Not started   exiftool, ImageMagick, fdupes
3: System & Disk Forensics  TSK, memory forensics, timeline analysis              ❌ Not started   EnCase-equivalent CLI workflows
4: Network Forensics        Packet capture, protocol analysis, carving            ❌ Not started   tshark, tcpdump, ngrep
5: Automation               Shell pipelines, scheduled audits, batch jobs         ❌ Not started   find + xargs + awk chains
6: AI Integration           Local model ingestion, classification, summarization  ❌ Not started   Ollama + extracted data
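
The find + xargs + awk chains noted for Phase 5 could look something like the sketch below, which groups files by content hash to surface exact duplicates (the kind of dedup planned for Phase 2). It is an assumption-laden example rather than the planned implementation: it relies on GNU sha256sum and xargs, the function name find_duplicates is hypothetical, and filenames containing newlines or backslashes are not handled.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU sha256sum and xargs -r; find_duplicates is a hypothetical name.

# Print each content hash shared by two or more files, followed by the
# indented paths of those files. Group order is unspecified.
find_duplicates() {
  find "$1" -type f -print0 |
    xargs -0 -r sha256sum |
    awk '
      {
        hash = $1
        # sha256sum prints a 64-char hash, two separator chars, then the path.
        path = substr($0, 67)
        count[hash]++
        paths[hash] = paths[hash] "\n  " path
      }
      END {
        for (hash in count)
          if (count[hash] > 1)
            print hash paths[hash]
      }'
}
```

Running `find_duplicates ~/Pictures` (a placeholder path) would list only the hashes that occur more than once, so unique files produce no output.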

Field     Value
--------  ---------------------------------
PRJ ID    PRJ-2026-04-data-forensics
Author    Evan Rosado
Created   2026-04-19
Updated   2026-04-19
Phase     0 Complete (Toolchain verified)
Status    Active
Category  Data Analysis / Digital Forensics
Priority  P1 - High
Scope     Universal (Work + Personal)