File Hunting

Overview

File hunting is the art of finding content across a codebase. It combines:

find - locate files by name/type/time
grep - search content within files
awk - format and process output

You'll use this workflow many times a day as a senior engineer.

The Trifecta Pattern

Pattern 1: grep -rn (Quick Content Search)

Fastest for simple searches:

# Search for term, show file:line:content
grep -rn "borg_backups" ~/atelier/_bibliotheca/domus-* --include="*.adoc"

# Breakdown:
# -r   = recursive
# -n   = show line numbers
# --include = filter by extension

Pattern 2: find + grep -l (List Files Only)

When you just need filenames:

# Find files containing term
find ~/atelier/_bibliotheca -name "*.adoc" -exec grep -l "borg_backups" {} \;

# Breakdown:
# -exec grep -l {} \;   = run grep on each file, print filename if match
# -l                    = list filenames only (not content)
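
The `\;` form above starts one grep process per file. A minimal sketch of the batched form, assuming a find that supports `{} +` (it is specified by POSIX): the helper name `list_matching` and the throwaway files are made up for illustration, so the example can be run anywhere.

```shell
# Build a throwaway tree so the example is safe to run anywhere
demo=$(mktemp -d)
mkdir -p "$demo/domus-a" "$demo/domus-b"
printf 'mount borg_backups here\n' > "$demo/domus-a/one.adoc"
printf 'nothing relevant\n'        > "$demo/domus-b/two.adoc"

# list_matching DIR TERM: names of .adoc files under DIR that contain TERM
list_matching() {
  # "{} +" hands many files to a single grep instead of one grep per file
  find "$1" -name "*.adoc" -exec grep -l -- "$2" {} + | sort
}

list_matching "$demo" "borg_backups"
```

The `--` guards against search terms that begin with a dash being read as grep options.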

Pattern 3: find + grep + awk (Formatted Output)

The power pattern - formatted columns:

# Find, grep with filename and line number, format with awk
find ~/atelier/_bibliotheca -name "*.adoc" -exec grep -Hn "borg_backups" {} \; | \
  awk -F: '{printf "%-60s L%-4s %s\n", $1, $2, $3}'

# Breakdown:
# grep -Hn         = -H (show filename), -n (show line number)
# awk -F:          = split on colon (grep output: file:line:content)
# printf "%-60s"   = left-align, 60 char width for filename
# L%-4s            = "L" prefix + 4 char width for line number

Output:

/home/user/atelier/.../borg-backup.adoc                      L48   sudo mount -t nfs nas-01...
/home/user/atelier/.../backup-strategy.adoc                  L183  sudo mount -t nfs nas-01...

Real-World Hunting Scenarios

Find Where a Command Is Documented

# Where is the NFS mount command documented?
grep -rn "mount -t nfs" ~/atelier/_bibliotheca/domus-* --include="*.adoc" | head -20

# Just the matching filenames (-l prints each file once; sort orders them)
grep -rl "mount -t nfs" ~/atelier/_bibliotheca/domus-* --include="*.adoc" | sort -u

Find All References to an IP Address

# Find hardcoded IPs (should be attributes!)
grep -rn "10\.50\.1\.70" ~/atelier/_bibliotheca/domus-* --include="*.adoc"

# Count matching lines per file (-c counts lines, not individual occurrences)
grep -rc "10\.50\.1\.70" ~/atelier/_bibliotheca/domus-* --include="*.adoc" | \
  awk -F: '$2 > 0 {print}'

Find Cross-Component xrefs (Potential Issues)

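# Find xrefs with double-colon (cross-component)
# The matched content contains colons, so strip the "file:line:" prefix instead of printing $3
grep -rn 'xref:[a-z-]*::' ~/atelier/_bibliotheca/domus-* --include="*.adoc" | \
  awk -F: '{c=$0; sub(/^[^:]*:[^:]*:/, "", c); printf "%-50s L%-4s %s\n", $1, $2, c}'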

Find Files Modified Today

# Combine time filter with content search
# Note: -mtime 0 means "modified within the last 24 hours", not "since midnight"
find ~/atelier/_bibliotheca/domus-* -name "*.adoc" -mtime 0 -exec grep -l "TODO" {} \;

# Show modification times
find ~/atelier/_bibliotheca/domus-* -name "*.adoc" -mtime 0 -printf "%T+ %p\n" | sort -r
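
When "today" must mean since local midnight rather than the last 24 hours, one option is `-newermt` with today's date. This is a sketch under the assumption that your find supports `-newermt` (GNU and BSD find do; it is not part of POSIX). The function name `modified_today` and the demo files are hypothetical.

```shell
# modified_today DIR: .adoc files under DIR changed since local midnight
modified_today() {
  # -newermt compares mtime with a timestamp; a bare date means 00:00 that day
  find "$1" -name "*.adoc" -newermt "$(date +%F)"
}

# Throwaway tree: one fresh file, one with its mtime set back to the year 2000
demo=$(mktemp -d)
touch "$demo/fresh.adoc"
touch -t 200001010000 "$demo/old.adoc"

modified_today "$demo"
```

Only the fresh file should be listed; the file dated 2000 is filtered out.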

Find Large Files by Type

# Large .adoc files (> 50KB)
find ~/atelier/_bibliotheca -name "*.adoc" -size +50k -exec ls -lh {} \; | \
  awk '{print $5, $NF}'

# Count lines in large files
find ~/atelier/_bibliotheca -name "*.adoc" -size +50k -exec wc -l {} \; | sort -rn

awk Formatting Cheatsheet

Field Splitting

# grep output is: file:line:content
# -F: splits on colon

awk -F: '{print $1}'          # filename
awk -F: '{print $2}'          # line number
awk -F: '{print $3}'          # content up to the next colon only!

# Problem: content may have colons!
# Solution: strip the "file:line:" prefix and print the rest unchanged
awk '{sub(/^[^:]*:[^:]*:/, ""); print}'
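
To see the effect of colons in the content, here is a small sketch on a single sample line. The sample path, its content, and the helper name `strip_prefix` are made up for illustration.

```shell
# Sample line in grep -Hn format; path and content are illustrative only
line='docs/net.adoc:12:server: nas-01 port: 2049'

# strip_prefix: drop the leading "file:line:" and keep the content untouched
strip_prefix() {
  awk '{ sub(/^[^:]*:[^:]*:/, ""); print }'
}

printf '%s\n' "$line" | awk -F: '{print $3}'   # only the text before the next colon
printf '%s\n' "$line" | strip_prefix           # the full content, colons intact
printf '%s\n' "$line" | cut -d: -f3-           # same result using cut
```

`cut -d: -f3-` is the shortest option when no further formatting is needed; the awk form is easier to combine with `printf`.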

printf Width Specifiers

# %-Ns = left-align, N characters wide
printf "%-60s"    # filename, 60 chars, left-aligned
printf "%60s"     # filename, 60 chars, right-aligned
printf "%-4s"     # line number, 4 chars

# Common pattern
awk -F: '{printf "%-60s L%-4s %s\n", $1, $2, $3}'

Filtering with awk

# Only print if field 2 (count) > 0
awk -F: '$2 > 0 {print}'

# Only print .adoc files
awk -F: '/\.adoc/ {print}'

# Skip certain paths
awk -F: '!/node_modules/ && !/build/ {print}'
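
Beyond filtering, awk can also aggregate grep output. A minimal sketch that tallies matches per file from `file:line:content` input; the function name `count_by_file` and the sample lines are hypothetical.

```shell
# count_by_file: read "file:line:content" lines, print "count file", highest first
count_by_file() {
  awk -F: '{ n[$1]++ } END { for (f in n) print n[f], f }' | sort -k1,1nr -k2
}

# Illustrative input in grep -Hn format (paths are made up)
printf '%s\n' \
  'a.adoc:3:TODO one' \
  'b.adoc:9:TODO two' \
  'a.adoc:7:TODO three' | count_by_file
```

In practice the input would come from a pipeline such as `grep -rn "TODO" dir --include="*.adoc" | count_by_file`.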

Building a Hunt Function

Add to ~/.zshrc:

# Hunt for content across domus repos
hunt() {
  local term="$1"
  local ext="${2:-adoc}"  # default to .adoc
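  # "{} +" batches files into few grep calls; "--" protects terms that start with a dash
  find ~/atelier/_bibliotheca/domus-* -name "*.$ext" -exec grep -Hn -- "$term" {} + | \
    awk -F: '{c=$0; sub(/^[^:]*:[^:]*:/, "", c); printf "%-60s L%-4s %s\n", $1, $2, c}'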
}

# Usage:
# hunt "borg_backups"
# hunt "10\.50\.1\.70"   # the term is a grep regex, so escape the dots
# hunt "mount -t nfs" sh
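
Because grep treats the term as a regular expression, an unescaped `10.50.1.70` also matches text such as `10a50b1c70`. A hedged variant that matches the term literally with `grep -F` is sketched below. The name `huntf` and the `HUNT_ROOT` override are assumptions introduced here so the function can be tried on any directory; by default it searches the whole bibliotheca root.

```shell
# huntf TERM [EXT]: like hunt, but TERM is matched literally (grep -F)
# HUNT_ROOT is a hypothetical override; it defaults to the bibliotheca root
huntf() {
  local term="$1"
  local ext="${2:-adoc}"
  local root="${HUNT_ROOT:-$HOME/atelier/_bibliotheca}"
  find "$root" -name "*.$ext" -exec grep -HnF -- "$term" {} + |
    awk -F: '{ c = $0; sub(/^[^:]*:[^:]*:/, "", c)
               printf "%-60s L%-4s %s\n", $1, $2, c }'
}
```

Usage would look like `huntf "10.50.1.70"` or `HUNT_ROOT=/tmp/docs huntf "mount -t nfs" sh`.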

Performance: grep -r vs find + grep

Method	Pros	Cons
`grep -rn --include`	Fastest for simple searches, built-in extension filter	No time/size filters, less control
`find … -exec grep`	Full find power (time, size, type), combine with any filter	Slower (forks grep per file), more verbose
`find … | xargs grep`	Faster than `-exec` (batches files), handles spaces with `-print0`	More complex syntax
`ripgrep (rg)`	Fastest, respects .gitignore, modern	External tool, not always installed

When to Use Each

# Quick content search → grep -rn
grep -rn "pattern" dir --include="*.py"

# Need time/size filters → find + grep
find . -name "*.py" -mtime -7 -exec grep -l "pattern" {} \;

# Large codebase, speed critical → ripgrep
rg "pattern" --type py
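
The xargs route from the comparison has no example above. A minimal sketch, assuming a find and xargs that support `-print0` and `-0` (GNU and BSD versions do); the demo directory and file names are made up.

```shell
# Many files, find's filters still needed: find -print0 | xargs -0
# -print0 / -0 delimit names with NUL bytes, so spaces in paths survive
demo=$(mktemp -d)
printf 'pattern here\n' > "$demo/has space.py"
printf 'nothing\n'      > "$demo/other.py"

find "$demo" -name "*.py" -print0 | xargs -0 grep -l -- "pattern"
```

Without `-print0` and `-0`, the path containing a space would be split into two arguments and grep would report missing files.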

Common Mistakes

Mistake 1: Forgetting Quotes

# WRONG - the shell may expand *.adoc before find runs (zsh aborts with "no matches found" if nothing matches)
find . -name *.adoc

# RIGHT - quotes prevent expansion
find . -name "*.adoc"

Mistake 2: grep Eats Your Colons

# grep output: /path/to/file:123:content:with:colons
# awk -F: splits the content too, so $3 holds only "content"

# Solution: strip the "file:line:" prefix and keep the rest intact
awk -F: '{
  file=$1; line=$2
  content=$0
  sub(/^[^:]*:[^:]*:/, "", content)
  printf "%-50s L%-4s %s\n", file, line, content
}'

Mistake 3: Wrong Directory

# You're in domus-captures but file is in domus-infra-ops
find . -name "backup-strategy*"   # Returns nothing!

# Solution: Search from bibliotheca root
find ~/atelier/_bibliotheca -name "backup-strategy*"

Find and Open (Interactive Hunting)

Open First Match in nvim

# Command substitution (opens all matches)
nvim $(find ~/atelier/_bibliotheca -name "file-hunting.adoc")

# First match only (-print -quit stops after finding one)
nvim $(find ~/atelier/_bibliotheca -name "backup-strategy*" -print -quit)

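# xargs (handles spaces in paths safely; -o reconnects stdin to the terminal for nvim)
find ~/atelier/_bibliotheca -name "*.adoc" -print0 | xargs -0 -o nvim

# -exec directly (one nvim session per match; use {} + to open all matches together)
find ~/atelier/_bibliotheca -name "nav.adoc" -exec nvim {} \;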

Conditional Open (Only If Found)

# Store result, check if empty, then open
# (if/else avoids printing "Not found" when nvim itself exits non-zero)
file=$(find ~/atelier/_bibliotheca -name "backup-strategy*" -print -quit)
if [[ -n "$file" ]]; then nvim "$file"; else echo "Not found"; fi

Shell Function for Daily Use