Phase 0: Toolchain

Install, configure, and validate every tool in the forensics stack. Each tool gets a verification test and a codex entry.

Installation

All tools are installed on Arch Linux via pacman unless noted.

# Core — document + image + forensics
sudo pacman -S poppler tesseract tesseract-data-eng imagemagick \
  perl-image-exiftool binwalk testdisk

# Sleuth Kit (EnCase CLI replacement)
sudo pacman -S sleuthkit

# Network forensics
sudo pacman -S wireshark-cli tcpdump ngrep

# Deduplication
sudo pacman -S fdupes

# AUR
yay -S hashdeep

Installed & Verified (2026-04-19)

| Tool | Version | Status |
| --- | --- | --- |
| exiftool | 13.50 | |
| tesseract | 5.5.2 | |
| identify (ImageMagick) | 7.1.2-18 | |
| pdftotext (poppler) | 26.03.0 | |
| strings (binutils) | 2.46 | |
| binwalk | installed | |
| hashdeep | 4.4 | |
| fdupes | 2.4.0 | |
| photorec / testdisk | 7.2 | |
| mmls (Sleuth Kit) | 4.14.0 | |
| fls (Sleuth Kit) | 4.14.0 | |
| icat (Sleuth Kit) | 4.14.0 | |
| istat (Sleuth Kit) | 4.14.0 | |
| mactime (Sleuth Kit) | 4.14.0 | |
| fsstat (Sleuth Kit) | 4.14.0 | |
| img_stat (Sleuth Kit) | 4.14.0 | |
| sorter (Sleuth Kit) | 4.14.0 | |
| sigfind (Sleuth Kit) | 4.14.0 | |
| tshark (Wireshark) | 4.6.4 | |
| tcpdump | 4.99.6 | |
| ngrep | installed | |
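
The versions above can be re-collected with a small presence check. This is a minimal sketch, assuming each tool is on PATH; `check_tool` is a hypothetical helper name, and the version flag is passed in explicitly because it differs per tool (for example `-ver` for exiftool, `-V` for the Sleuth Kit tools, `--version` for tesseract and tshark).

```shell
#!/bin/sh
# check_tool NAME [VERSION_FLAG]
# Prints "OK <name> <first line of version output>" or "MISSING <name>".
# check_tool is a hypothetical helper, not part of any package.
check_tool() {
    name=$1
    flag=${2:-}
    if ! command -v "$name" >/dev/null 2>&1; then
        echo "MISSING $name"
        return 1
    fi
    if [ -n "$flag" ]; then
        # Some tools print their version on stderr, so merge both streams.
        ver=$("$name" "$flag" 2>&1 | head -n 1)
        echo "OK $name $ver"
    else
        echo "OK $name"
    fi
}

# Example run over part of the stack:
#   check_tool exiftool -ver
#   check_tool tesseract --version
#   check_tool mmls -V
#   check_tool tshark --version
#   check_tool binwalk
```

Tools that report no usable version string (binwalk and ngrep are listed only as "installed" above) can be checked without a flag.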

Not Yet Installed (AUR / Source)

| Tool | Purpose | Install |
| --- | --- | --- |
| dc3dd | Forensic dd with hashing | AUR |
| libewf | E01 (EnCase format) imaging | AUR |
| scalpel | Configurable file carving | AUR |
| foremost | Header-based file carving | AUR |
| bulk_extractor | Artifact extraction (emails, URLs) | AUR |
| findimagedupes | Perceptual hash dedup | AUR |
| volatility3 | Memory forensics | AUR / pip |
| tcpflow | TCP session reconstruction | AUR |
| zsteg | PNG steganography detection | gem install zsteg |

Document Tools

| Tool | Purpose | Package |
| --- | --- | --- |
| pdftotext | Extract text from PDF | poppler |
| pdfinfo | PDF metadata inspection | poppler |
| pdfimages | Extract embedded images from PDF | poppler |
| tesseract | OCR: image to text | tesseract, tesseract-data-eng |
| pandoc | Format conversion (docx, epub, html, markdown) | pandoc |
| djvutxt | DjVu text extraction | djvulibre |
| antiword | Legacy .doc extraction | antiword |
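
How pdftotext and tesseract fit together: try the embedded text layer first and fall back to OCR only when it comes back nearly empty. The following is a sketch under stated assumptions, not a fixed procedure: `needs_ocr` and `pdf_to_text` are hypothetical helper names, and the 50-character threshold is an arbitrary default.

```shell
#!/bin/sh
# needs_ocr FILE [MIN_CHARS]
# Succeeds (exit 0) when FILE holds fewer than MIN_CHARS non-whitespace
# characters, i.e. the PDF probably has no usable text layer.
# The default threshold of 50 is an assumption; tune it per corpus.
needs_ocr() {
    min=${2:-50}
    count=$(tr -d '[:space:]' < "$1" | wc -c)
    [ "$count" -lt "$min" ]
}

# pdf_to_text PDF OUT.txt
# Extracts the text layer with pdftotext, then falls back to OCR.
pdf_to_text() {
    pdf=$1
    out=$2
    pdftotext -layout "$pdf" "$out" || return 1
    if needs_ocr "$out"; then
        tmp=$(mktemp -d) || return 1
        # Rasterize pages at 300 dpi, then OCR each page in order.
        pdftoppm -r 300 -png "$pdf" "$tmp/page"
        : > "$out"
        for img in "$tmp"/page*.png; do
            [ -e "$img" ] || continue
            tesseract "$img" stdout -l eng >> "$out" 2>/dev/null
        done
        rm -rf "$tmp"
    fi
}
```

Usage would look like `pdf_to_text scan.pdf scan.txt`, where `scan.pdf` is a placeholder file name.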

Image & Metadata Tools

| Tool | Purpose | Package |
| --- | --- | --- |
| exiftool | EXIF/IPTC/XMP metadata (read/write/strip) | perl-image-exiftool |
| identify | Image properties (dimensions, depth, format) | imagemagick |
| convert (deprecated in ImageMagick 7; use magick) | Image transformation, format conversion | imagemagick |
| fdupes / jdupes | Exact duplicate detection | fdupes / jdupes |
| findimagedupes | Perceptual hash deduplication | findimagedupes |
| zsteg | PNG/BMP steganography detection | gem install zsteg |
| stegdetect | JPEG steganography detection | AUR / source |
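
"Exact duplicate" here means byte-identical content. fdupes handles this directly (`fdupes -r DIR`); the sketch below shows the same idea with only coreutils and awk, grouping files by SHA-256. `find_exact_dupes` is a hypothetical helper name, and the approach is a fallback illustration rather than a replacement for fdupes.

```shell
#!/bin/sh
# find_exact_dupes DIR
# Prints "hash  path" for every file under DIR whose SHA-256 matches
# at least one other file. Unique files are not printed.
find_exact_dupes() {
    find "$1" -type f -exec sha256sum {} + |
        sort |
        awk '
            {
                hash = $1
                if (hash == prev_hash) {
                    if (!printed) print prev_line   # first member of the group
                    print
                    printed = 1
                } else {
                    printed = 0
                }
                prev_hash = hash
                prev_line = $0
            }'
}
```

Perceptual duplicates (resized or recompressed copies) will not match this way; that is what findimagedupes is listed for.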

Disk & Filesystem Forensics (EnCase CLI Equivalents)

| Tool | Purpose | Package |
| --- | --- | --- |
| dd / dc3dd | Bit-for-bit disk imaging (dc3dd adds on-the-fly hashing) | coreutils / dc3dd |
| ewfacquire | Create E01 (EnCase format) forensic images | libewf |
| ewfinfo / ewfverify | Inspect and verify E01 images | libewf |
| mmls | Partition table analysis | sleuthkit |
| fsstat | Filesystem metadata | sleuthkit |
| fls | List files and directories (including deleted) | sleuthkit |
| icat | Extract file by inode number | sleuthkit |
| istat | Inode metadata inspection | sleuthkit |
| mactime | Timeline generation from body file | sleuthkit |
| blkcat | Extract raw data blocks | sleuthkit |
| img_stat | Disk image metadata | sleuthkit |
| sigfind | Signature-based searching in images | sleuthkit |
| sorter | Categorize files by type in disk image | sleuthkit |
| hfind | Hash database lookup (NSRL, known-bad) | sleuthkit |
| photorec | File carving from raw disk/image | testdisk |
| scalpel | Configurable file carving | scalpel |
| foremost | Header-based file carving | foremost |
| bulk_extractor | Extract emails, URLs, credit cards from images | bulk_extractor |
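
The Sleuth Kit tools chain together: mmls yields a partition's start sector, and that offset is passed to the filesystem tools with `-o`. The sketch below parses mmls output for the offsets and lists a typical follow-up sequence in comments. `partition_offsets` is a hypothetical helper, and the parsing assumes the usual mmls column layout (index, slot, start, end, length, description); `image.dd` and inode `1234` are placeholders.

```shell
#!/bin/sh
# partition_offsets
# Reads `mmls` output on stdin and prints the start sector of each
# allocated partition, skipping "Meta" and unallocated ("-------") rows.
# Assumes columns: index, slot, start, end, length, description.
partition_offsets() {
    awk '$1 ~ /^[0-9]+:$/ && $2 != "Meta" && $2 != "-------" { printf "%d\n", $3 }'
}

# Typical sequence on an image (image.dd and inode 1234 are placeholders):
#   mmls image.dd | partition_offsets            # start sectors, one per line
#   fsstat -o 2048 image.dd                      # filesystem details
#   fls -r -d -o 2048 image.dd                   # deleted entries, recursive
#   icat -o 2048 image.dd 1234 > recovered.bin   # extract a file by inode
#   fls -r -m / -o 2048 image.dd > body.txt      # body file for the timeline
#   mactime -b body.txt -d > timeline.csv        # comma-delimited timeline
```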

Memory Forensics

| Tool | Purpose | Package |
| --- | --- | --- |
| volatility3 | Memory dump analysis (processes, network, registry) | python-volatility3 |
| strings | Extract printable strings from binary/memory | binutils |
| xxd | Hex dump with ASCII | vim (ships with) |
| hexdump | Hex/octal/decimal dump | util-linux |
| binwalk | Firmware and embedded file extraction | binwalk |

Network Forensics

| Tool | Purpose | Package |
| --- | --- | --- |
| tshark | CLI packet analysis (Wireshark engine) | wireshark-cli |
| tcpdump | Packet capture and filtering | tcpdump |
| ngrep | Network grep: pattern matching on packets | ngrep |
| tcpflow | Reconstruct TCP sessions from pcap | tcpflow |
| NetworkMiner | Network forensic analysis (runs on Mono) | source / AUR |
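
A common first pass over a capture is to rank source/destination pairs by packet count. The sketch below does that on tab-separated field output from tshark; `top_talkers` is a hypothetical helper and `capture.pcap` is a placeholder. The other commands in the comments are ordinary read-from-file invocations of the tools listed above.

```shell
#!/bin/sh
# top_talkers [N]
# Reads "src<TAB>dst" pairs on stdin and prints the N most frequent
# pairs as "count<TAB>src<TAB>dst", busiest first (default N = 10).
# Rows with a missing address (non-IP packets) are dropped.
top_talkers() {
    sort | uniq -c | sort -k1,1nr |
        awk 'NF == 3 { print $1 "\t" $2 "\t" $3 }' |
        head -n "${1:-10}"
}

# Example usage (capture.pcap is a placeholder):
#   tshark -r capture.pcap -T fields -e ip.src -e ip.dst | top_talkers 5
#   tcpdump -nn -r capture.pcap 'tcp port 80'
#   tshark -r capture.pcap -Y http.request -T fields \
#       -e ip.src -e http.host -e http.request.uri
#   ngrep -q -I capture.pcap 'password'
```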

Integrity & Hashing

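| Tool | Purpose | Package |
| --- | --- | --- |
| sha256sum / sha512sum | Cryptographic hash verification | coreutils |
| b2sum | BLAKE2b hashing (typically faster than SHA-2 in software, comparable security) | coreutils |
| md5sum | MD5 hashing (legacy, court-accepted; not collision-resistant, so pair it with SHA-256) | coreutils |
| aide | File integrity monitoring (tripwire alternative) | aide |
| hashdeep | Recursive hashing with audit mode | hashdeep |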
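
Hash verification in practice means recording a manifest at acquisition time and re-checking it later. A minimal sketch with sha256sum follows; `make_manifest` and `verify_manifest` are hypothetical helper names, and the manifest is assumed to live outside the evidence directory so it does not hash itself.

```shell
#!/bin/sh
# make_manifest DIR MANIFEST
# Records a SHA-256 for every file under DIR, with paths relative to DIR.
# MANIFEST should live outside DIR (assumption of this sketch).
make_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k2 ) > "$2"
}

# verify_manifest DIR MANIFEST
# Exit 0 only if every listed file still matches its recorded hash.
# Note: files added after the manifest was made are not detected here.
verify_manifest() {
    manifest=$(cd "$(dirname "$2")" && pwd)/$(basename "$2")
    ( cd "$1" && sha256sum --quiet -c "$manifest" )
}

# hashdeep's audit mode covers the same ground and also reports new files:
#   hashdeep -r -l evidence/ > manifest.hashdeep
#   hashdeep -r -l -a -k manifest.hashdeep evidence/
```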

Validation Test: Personal Photo (2026-04-19)

First live test: 20260403_185029.jpg, a personal photo emailed from a Galaxy Z Fold7.

Tools Used

  1. exiftool -a -G1 — full grouped EXIF/IPTC/XMP/ICC/Samsung/MPF extraction

  2. binwalk — embedded file signature scan

  3. strings -n 12 — printable string extraction from binary

  4. identify -verbose — ImageMagick pixel-level analysis
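
The four steps above can be wrapped so that every run hashes the input first and keeps each tool's output in its own file. This is a sketch: `triage_image` and the output file names are illustrative choices, and tools that are not installed are skipped rather than treated as errors.

```shell
#!/bin/sh
# triage_image FILE OUTDIR
# Hashes FILE, then runs the four tools listed above and saves each
# output under OUTDIR. Missing tools are skipped with a note on stderr.
# triage_image and the output file names are illustrative, not a standard.
triage_image() {
    file=$1
    out=$2
    mkdir -p "$out" || return 1
    sha256sum "$file" > "$out/sha256.txt" || return 1

    run() {  # run OUTFILE TOOL [ARGS...]
        dest=$1; shift
        if command -v "$1" >/dev/null 2>&1; then
            "$@" > "$out/$dest" 2>&1 || echo "warning: $1 exited non-zero" >&2
        else
            echo "skipped: $1 not installed" >&2
        fi
    }

    run exiftool.txt exiftool -a -G1 "$file"
    run binwalk.txt  binwalk "$file"
    run strings.txt  strings -n 12 "$file"
    run identify.txt identify -verbose "$file"
}
```

For the test image this would be `triage_image 20260403_185029.jpg ./triage-out`, where `./triage-out` is a placeholder directory.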

Context Extracted

| Layer | Findings |
| --- | --- |
| Device | Samsung Galaxy Z Fold7, firmware F966U1UES8AZC1 |
| Timestamp | 2026-04-03 18:50:29.622 PDT (UTC-07:00), sub-second precision |
| Camera settings | f/2.2, 1/60s, ISO 1000, 2.2mm (23mm equiv), 1.66x digital zoom, no flash; consistent with an indoor low-light scene |
| Carrier | MCC 310: United States / Guam |
| GPS | Not present; either stripped by Outlook or disabled on the device |
| Color pipeline | DCI-P3 D65 gamut with sRGB transfer, Samsung ICC profile (2022-07-01) |
| HDR | Dual-image MPF container: primary JPEG (1.4 MB) + gain map (51 KB) per ISO 21496-1 |
| Image sensor | IMX564 (Sony sensor, extracted via strings) with calibration ref 2502171.N.PA2 |
| Samsung internals | Unique ID S12XSRJ00NM, SSCAL calibration string, device serial fragment 0c57623f073a |
| Pixel statistics | 12MP (4000x3000), 8-bit sRGB, mean luminance 36%: dark scene, warm color cast (red channel dominant) |
| Steganography | binwalk: 0 embedded signatures beyond the expected JPEG/EXIF/MPF structure, so no appended or embedded files. A signature scan alone does not rule out pixel-level (e.g. LSB) steganography |
| Embedded data | 512x384 thumbnail (54 KB), XMP with Adobe Core 5.1.2, Google GContainer HDR metadata |

Forensic Significance

  • No GPS, but the carrier country is exposed: MCC 310 narrows the origin to the US even without coordinates

  • Sony IMX564 sensor ID + calibration string: may tie this image to a specific sensor lot, potentially traceable to a manufacturing batch

  • Device serial fragment (0c57623f073a): 12 hex digits, the length of a full 48-bit MAC address; possibly a MAC or another hardware identifier embedded by Samsung

  • Firmware build (F966U1UES8AZC1): exact software version, checkable against CVE databases

  • Sub-second timestamp: millisecond precision (.622), useful for timeline correlation

  • Unique Image ID S12XSRJ00NM persists across copies, forwards, and uploads as long as the metadata is not stripped, linking the photo back to this device
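
For timeline correlation, timestamps from different sources are easier to compare once normalized to UTC. The sketch below converts an EXIF-style timestamp with an explicit offset while keeping the milliseconds. It assumes GNU date (`-d` and `%3N` are GNU extensions); `exif_to_iso` and `to_utc` are hypothetical helper names, and the exiftool line in the comments uses the composite tag name as exiftool reports it.

```shell
#!/bin/sh
# exif_to_iso "YYYY:MM:DD HH:MM:SS.mmm+HH:MM"
# Rewrites the EXIF date separators (colons) to dashes so that
# GNU date can parse the value.
exif_to_iso() {
    printf '%s\n' "$1" | sed 's/^\([0-9]\{4\}\):\([0-9]\{2\}\):/\1-\2-/'
}

# to_utc "YYYY-MM-DD HH:MM:SS.mmm+HH:MM"
# Converts a timestamp with an explicit UTC offset to UTC, keeping
# millisecond precision. Requires GNU date.
to_utc() {
    date -u -d "$1" '+%Y-%m-%dT%H:%M:%S.%3NZ'
}

# Pulling the value from the photo (file name from the test above):
#   ts=$(exiftool -s3 -SubSecDateTimeOriginal 20260403_185029.jpg)
#   to_utc "$(exif_to_iso "$ts")"
```

With the timestamp recorded above (18:50:29.622 at UTC-07:00), the UTC value falls on the following calendar day, which matters when correlating against server logs kept in UTC.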