Linux Search Mastery - The Complete Arsenal

"Good. Use your aggressive feelings. Let the search flow through you."

 — Emperor Palpatine (on finding any file, anywhere, instantly)


Philosophy: Why Master Traditional Tools

The Harsh Reality

You will encounter servers where:

  • ❌ No ripgrep

  • ❌ No fd

  • ❌ No fzf

  • ❌ No ag

  • ❌ Minimal packages

  • ❌ Cannot install anything

  • ❌ Production restrictions

  • ❌ Network isolated

  • ❌ Emergency response mode

You will ONLY have:

  • ✅ grep (maybe egrep)

  • ✅ find

  • ✅ awk

  • ✅ sed

  • ✅ Basic shell builtins

  • ✅ Your knowledge

The Career Moment

Bad scenario:

# Network outage, SSH to production RHEL 7 server
# CTO watching your screen share
$ rg "error" /var/log/
-bash: rg: command not found
$ fd -e log
-bash: fd: command not found
# Fumble with grep syntax
$ grep "error" /var/log/
grep: /var/log/: Is a directory
# More fumbling, losing credibility
# "Why don't you know basic Linux commands?"

Good scenario:

# Same situation
# Immediately type with confidence:
$ grep -r "error" /var/log/ 2>/dev/null | head -20
# Then refine:
$ find /var/log -type f -name "*.log" -mtime -1 -exec grep -l "error" {} \;
# CTO impressed:
# "How did you find that so fast?"
# You: "Standard tools. Works anywhere."

The Dual Path Strategy

Primary tools (modern workstation):

# Your Arch laptop, fully equipped
rg "pattern"
fd -e py
fzf

Fallback tools (guaranteed everywhere):

# Any production server, minimal install
grep -r "pattern" .
find . -name "*.py"
# No fzf? Narrow the results step by step with grep, awk, and sed pipelines

The mindset: "I PREFER modern tools, but I CAN SURVIVE with traditional ones."

Learning Order for This Document

Phase 1 (70% of effort): Traditional Mastery

  • grep → find → xargs → awk/sed

  • Learn these COLD

  • Muscle memory, no hesitation

  • Works on ANY Linux box

Phase 2 (30% of effort): Modern Superpowers

  • ripgrep → fd → fzf

  • Understand when to use

  • Build fallbacks into aliases

  • Impress on workstations

Result: Competent anywhere, powerful when possible.
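
One way to "build fallbacks into aliases" is a small shell function that prefers fd and drops back to find when fd is missing. This is only a sketch: the function name ff and its interface (a bare extension plus an optional directory) are made up for illustration, and fd's default of skipping hidden and ignored files means the two branches are not perfectly equivalent.

```shell
# ff EXT [DIR]: list regular files with the given extension.
# Hypothetical helper; the name and interface are illustrative only.
ff() {
    ext=$1
    dir=${2:-.}
    if command -v fd >/dev/null 2>&1; then
        # fd patterns are regexes by default, so filter by extension with -e
        fd --type f -e "$ext" . "$dir"
    else
        find "$dir" -type f -name "*.$ext"
    fi
}

# Demo in a scratch directory
demo=$(mktemp -d)
touch "$demo/a.py" "$demo/b.py" "$demo/c.txt"
ff_count=$(ff py "$demo" | wc -l)
echo "python files found: $ff_count"
rm -rf "$demo"
```

Drop the function into ~/.bashrc and the same command works on the laptop and on the locked-down server.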


The grep Family - Pattern Matching Mastery

What is grep?

grep = Global Regular Expression Print

Searches text for patterns and prints matching lines.

The family:

  • grep - Basic Regular Expressions (BRE)

  • grep -E (or the deprecated egrep) - Extended Regular Expressions (ERE)

  • grep -F (or the deprecated fgrep) - Fixed strings (no regex)

  • pgrep - Process grep (searches process names)

  • zgrep - Compressed files (.gz)

  • bzgrep - Bzip2 files (.bz2)

  • xzgrep - XZ files (.xz)

Basic grep Syntax

grep [OPTIONS] PATTERN [FILE...]

Essential grep Flags (Memorize These)

# Case insensitive
grep -i "pattern" file.txt
# Recursive search (directories)
grep -r "pattern" /path/
# Recursive + follow symlinks
grep -R "pattern" /path/
# Show line numbers
grep -n "pattern" file.txt
# Show only filenames with matches
grep -l "pattern" *.txt
# Show only filenames WITHOUT matches
grep -L "pattern" *.txt
# Count matching lines
grep -c "pattern" file.txt
# Invert match (lines NOT matching)
grep -v "pattern" file.txt
# Show context (3 lines before and after)
grep -C 3 "pattern" file.txt
# Context before only
grep -B 3 "pattern" file.txt
# Context after only
grep -A 3 "pattern" file.txt
# Extended regex (ERE)
grep -E "pattern1|pattern2" file.txt
# Fixed string (literal, no regex)
grep -F "literal.string" file.txt
# Whole word match only
grep -w "word" file.txt
# Multiple patterns
grep -e "pattern1" -e "pattern2" file.txt
# Pattern from file
grep -f patterns.txt file.txt
# Quiet mode (exit code only, no output)
grep -q "pattern" file.txt && echo "Found"
# Max count (stop after N matches)
grep -m 10 "pattern" huge_file.log
# Suppress errors (permission denied, etc.)
grep -s "pattern" /protected/dir/*
# Binary files as text
grep -a "pattern" binary_file
# Show filename with matches (default for multiple files)
grep -H "pattern" file.txt
# Suppress filename
grep -h "pattern" *.txt
# Color output (auto = only if terminal)
grep --color=auto "pattern" file.txt
# Color always (even when piped)
grep --color=always "pattern" file.txt | less -R
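
The flags are half the story; grep's exit status is what makes it scriptable. It returns 0 when a line matched, 1 when nothing matched, and 2 on an error such as an unreadable file. A minimal sketch using a scratch file (the helper name status_of and the file contents are invented for the demo):

```shell
# status_of CMD...: run a command quietly and print its exit status
status_of() { "$@" >/dev/null 2>&1 && echo 0 || echo $?; }

log=$(mktemp)
printf 'INFO start\nERROR disk full\n' > "$log"

found=$(status_of grep "ERROR" "$log")           # a line matched      -> 0
missing=$(status_of grep "PANIC" "$log")         # nothing matched     -> 1
broken=$(status_of grep "ERROR" "$log.missing")  # file does not exist -> 2

echo "found=$found missing=$missing broken=$broken"

# Typical use: branch on the status with -q
if grep -q "ERROR" "$log"; then echo "errors present"; fi
rm -f "$log"
```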

Critical grep Patterns

1. Search Current Directory Recursively

# Most common pattern
grep -r "error" .
# With line numbers
grep -rn "error" .
# Case insensitive
grep -ri "error" .
# With context
grep -rC 3 "error" .
# Suppress permission errors
grep -r "error" . 2>/dev/null
# Only certain file types
grep -r --include="*.log" "error" .
# Exclude certain files
grep -r --exclude="*.min.js" "function" .
# Exclude directories
grep -r --exclude-dir=".git" "pattern" .

2. Search Multiple Patterns (OR logic)

# Method 1: Extended regex with |
grep -E "error|warning|critical" file.log
# Method 2: Multiple -e flags
grep -e "error" -e "warning" -e "critical" file.log
# Method 3: Pattern file
cat > patterns.txt << 'EOF'
error
warning
critical
EOF
grep -f patterns.txt file.log

3. Search for Whole Words

# WRONG: Matches "error", "errors", "terrorist"
grep "error" file.txt
# RIGHT: Matches only "error" as whole word
grep -w "error" file.txt
# Example:
echo "This is an error in errors" | grep -w "error"

4. Count Occurrences

# Count matching lines
grep -c "error" file.log
# Count total matches (not just lines)
grep -o "error" file.log | wc -l
# Count per file
grep -c "error" *.log
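
The difference between counting lines and counting matches only shows up when a line contains the pattern more than once. A quick sketch with made-up sample data:

```shell
sample=$(mktemp)
printf 'error error\nok\nerror\n' > "$sample"

lines=$(grep -c "error" "$sample")          # lines containing a match -> 2
hits=$(grep -o "error" "$sample" | wc -l)   # every individual match   -> 3

echo "matching lines: $lines, total matches: $hits"
rm -f "$sample"
```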

5. Find Files Containing Pattern

# List filenames only
grep -rl "TODO" .
# With line numbers (filename:line:content)
grep -rn "TODO" .
# Count per file
grep -rc "TODO" . | grep -v ":0$"

6. Inverted Match (Exclude Lines)

# Show lines NOT containing "DEBUG"
grep -v "DEBUG" app.log
# Exclude multiple patterns
grep -v "DEBUG" app.log | grep -v "INFO"
# Or use extended regex
grep -Ev "DEBUG|INFO" app.log

7. Context Around Matches

# 3 lines before and after
grep -C 3 "error" app.log
# 5 lines before
grep -B 5 "Exception" app.log
# 2 lines after
grep -A 2 "Started" app.log
# Combine (2 before, 5 after)
grep -B 2 -A 5 "WARNING" app.log

Real-World grep Examples

Security: Find Potential Credentials

# Search for API keys
grep -rE "api[_-]?key\s*=\s*['\"]?[a-zA-Z0-9]{20,}" .
# Search for AWS keys
grep -rE "AKIA[0-9A-Z]{16}" .
# Search for passwords in config
grep -ri "password\s*=" . --include="*.conf" --include="*.ini"
# Search for JWT tokens
grep -rE "eyJ[A-Za-z0-9_=-]+\." .
# Search for SSH private keys
grep -rl "BEGIN.*PRIVATE KEY" /home/
# Exclude false positives
grep -r "password" . --exclude-dir=".git" --exclude="*.md"

Incident Response: Find Suspicious Activity

# Failed SSH login attempts
grep "Failed password" /var/log/auth.log | wc -l
# Successful logins with context
grep -A 3 "Accepted password" /var/log/auth.log
# Privilege escalation (commands run through sudo)
grep "sudo:.*COMMAND=" /var/log/auth.log
# Recently modified files in /etc
find /etc -type f -mtime -1 -exec grep -l "pattern" {} \;
# Suspicious cron jobs
grep -r "curl.*sh" /var/spool/cron/
grep -r "wget.*sh" /etc/cron*

Log Analysis: Extract Specific Events

# Apache requests from the previous clock hour (this is the access-log timestamp format)
grep "$(date -d '1 hour ago' '+%d/%b/%Y:%H')" /var/log/apache2/access.log
# Nginx 404s
grep " 404 " /var/log/nginx/access.log
# Extract IPs from access log
grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" access.log | sort -u
# Failed database connections
grep -i "connection.*fail" /var/log/mysql/error.log
# Application exceptions with stack trace
grep -A 50 "Exception" /var/log/app.log

Code Search: Find Function Definitions

# Find Python function definitions
grep -rn "^def " --include="*.py" .
# Find JavaScript functions
grep -rn "function\s\+\w\+" --include="*.js" .
# Find TODO comments
grep -rn "TODO\|FIXME\|XXX" --include="*.py" --include="*.js" .
# Find SQL queries
grep -rn "SELECT.*FROM" --include="*.py" .
# Find hardcoded IPs
grep -rE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" --include="*.conf" .

grep Performance Tips

# Use -F for literal strings (faster)
grep -F "exact.string.no.regex" huge_file.log
# Use -m to stop after N matches
grep -m 1 "pattern" huge_file.log
# Binary files: skip with -I
grep -rI "pattern" .
# Or treat as text with -a
grep -a "pattern" binary_file
# Limit to specific file types
grep -r --include="*.log" "pattern" /var/log/

Common grep Mistakes

# ❌ WRONG: Not escaping special characters
grep "192.168.1.1" file.txt
# ✅ RIGHT: Escape dots
grep "192\.168\.1\.1" file.txt
# ❌ WRONG: Forgot quotes with spaces
grep error message file.txt
# ✅ RIGHT: Quote the pattern
grep "error message" file.txt
# ❌ WRONG: Using cat unnecessarily
cat file.txt | grep "pattern"
# ✅ RIGHT: grep can read files directly
grep "pattern" file.txt
# ❌ WRONG: Recursive on file
grep -r "pattern" file.txt
# ✅ RIGHT: Just grep the file
grep "pattern" file.txt
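
The unescaped-dot mistake is easy to underestimate, so here is a small sketch that shows how much extra it matches (sample data invented for the demo):

```shell
hosts=$(mktemp)
printf '192.168.1.1\n192x168y1z1\n192.168.1.10\n' > "$hosts"

loose=$(grep -c "192.168.1.1" "$hosts")        # . matches any character -> 3
escaped=$(grep -c "192\.168\.1\.1" "$hosts")   # literal dots, substring -> 2
exact=$(grep -cxF "192.168.1.1" "$hosts")      # -F literal, -x whole line -> 1

echo "loose=$loose escaped=$escaped exact=$exact"
rm -f "$hosts"
```

Escaping the dots still matches 192.168.1.10 as a substring; combining -F with -x (or -w) is the simplest way to pin down one exact address.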

The find Command - File Discovery Mastery

What is find?

find searches for files in a directory hierarchy based on various criteria.

Why find is critical:

  • Searches by filename, size, time, permissions, owner, type

  • Can execute commands on results

  • Works on any POSIX system

  • Foundation for xargs pipelines

Basic find Syntax

find [PATH...] [EXPRESSION]

EXPRESSION consists of:

  • Tests (return true/false): -name, -type, -size, -mtime

  • Actions: -print, -exec, -delete

  • Operators: !, -and, -or, (, )
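
How tests, actions, and operators combine is where most find surprises come from: with no action, -print applies to the whole expression, but once you write an action yourself, the implicit AND binds tighter than -o. A sketch in a scratch directory (file names are invented):

```shell
d=$(mktemp -d)
touch "$d/a.log" "$d/b.txt"

# No action: -print is applied to the whole expression
implicit=$(find "$d" -name "*.log" -o -name "*.txt" | wc -l)
# Explicit -print attaches to the second test only (AND beats -o)
unparenthesized=$(find "$d" -name "*.log" -o -name "*.txt" -print | wc -l)
# Parentheses restore the intended grouping
grouped=$(find "$d" \( -name "*.log" -o -name "*.txt" \) -print | wc -l)

echo "implicit=$implicit unparenthesized=$unparenthesized grouped=$grouped"
rm -rf "$d"
```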

Essential find Tests (Memorize These)

# Find by name (case-sensitive)
find . -name "*.log"
# Find by name (case-insensitive)
find . -iname "*.LOG"
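# Find by type
find . -type f    # regular file
find . -type d    # directory
find . -type l    # symbolic link
find . -type b    # block device
find . -type c    # character device
find . -type p    # named pipe (FIFO)
find . -type s    # socket
# Find by size (sizes are rounded up to the unit before comparing)
find . -size +100M     # larger than 100 MiB
find . -size -1024c    # smaller than 1 KiB (-size -1k would match only empty files)
find . -size 50M       # rounds up to exactly 50 MiB
find . -size +1G       # larger than 1 GiB
# Find by modification time (age is counted in whole 24-hour periods)
find . -mtime -7       # modified less than 7 days ago
find . -mtime +30      # modified more than 30 days ago
find . -mtime 7        # modified between 7 and 8 days ago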
# Find by access time
find . -atime -1
# Find by change time (metadata)
find . -ctime -7
# Find by modification minute
find . -mmin -60
find . -mmin +120
# Find by user
find . -user root
find . -user $(whoami)
# Find by group
find . -group wheel
# Find by permissions (exact)
find . -perm 755
find . -perm u=rwx,g=rx,o=rx
# Find by permissions (at least)
find . -perm -u+w
find . -perm -g+w
# Find by permissions (any of)
find . -perm /u+w,g+w
# Find empty files/directories
find . -empty
# Find by inode
find . -inum 12345
# Find by number of links
find . -links +2
# Find newer than file
find . -newer reference_file.txt
# Find not newer than file
find . ! -newer old_file.txt

Essential find Actions

# Print results (default)
find . -name "*.txt" -print
# Print with null delimiter (for xargs -0)
find . -name "*.txt" -print0
# Custom print format
find . -printf "%p %s bytes\n"
find . -printf "%T+ %p\n"
# List files (like ls -l)
find . -name "*.sh" -ls
# Execute command on each result
find . -name "*.tmp" -exec rm {} \;
# Execute with confirmation
find . -name "*.tmp" -ok rm {} \;
# Execute command with all results at once
find . -name "*.txt" -exec grep "pattern" {} +
# Delete files (DANGEROUS)
find . -name "*.tmp" -delete

Critical find Patterns

1. Find Files Modified Recently

# Last 24 hours
find . -type f -mtime -1
# Last 7 days
find . -type f -mtime -7
# Last hour
find . -type f -mmin -60
# Between 7 and 14 days ago
find . -type f -mtime +7 -mtime -14
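# Modified in the last 24 hours (-mtime 0 is equivalent to -mtime -1)
find . -type f -mtime 0
# Modified since midnight today (GNU find)
find . -daystart -type f -mtime 0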
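
The -mtime arithmetic trips people up because find counts age in whole 24-hour periods and drops the fraction. A sketch with back-dated files, assuming GNU touch for the relative -d dates (file names are invented):

```shell
d=$(mktemp -d)
touch -d "2 hours ago"  "$d/fresh"
touch -d "36 hours ago" "$d/yesterday"
touch -d "10 days ago"  "$d/old"

recent=$(find "$d" -type f -mtime -1 | wc -l)   # age under 1 full day    -> fresh
one=$(find "$d" -type f -mtime 1 | wc -l)       # age of 1 full day       -> yesterday
older=$(find "$d" -type f -mtime +7 | wc -l)    # age of 8+ full days     -> old

echo "recent=$recent one=$one older=$older"
rm -rf "$d"
```

Note that the 36-hour-old file is missed by -mtime -1 and also by -mtime +1; it only matches -mtime 1.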

2. Find Large Files

# Larger than 100MB
find . -type f -size +100M
# Largest files in current directory
find . -type f -exec du -h {} + | sort -rh | head -20
# Or more efficiently:
find . -type f -printf "%s\t%p\n" | sort -rn | head -20
# Files between 10MB and 100MB
find . -type f -size +10M -size -100M

3. Find and Delete Files

# Find and delete .tmp files
find . -name "*.tmp" -type f -delete
# With confirmation
find . -name "*.tmp" -type f -ok rm {} \;
# Delete empty directories
find . -type d -empty -delete
# Delete old log files (30+ days)
find /var/log -name "*.log" -type f -mtime +30 -delete
# Delete core dumps
find . -name "core.*" -type f -delete

4. Find and Execute Commands

# Find and grep (method 1: slow)
find . -name "*.log" -exec grep "error" {} \;
# Find and grep (method 2: faster)
find . -name "*.log" -exec grep "error" {} +
# Find and grep (method 3: fastest with xargs)
find . -name "*.log" -print0 | xargs -0 grep "error"
# Find and change permissions
find . -name "*.sh" -type f -exec chmod +x {} \;
# Find and move files
find . -name "*.txt" -exec mv {} /backup/ \;
# Find and compress
find . -name "*.log" -type f -exec gzip {} \;

5. Find with Multiple Conditions (AND/OR/NOT)

# AND (both conditions must be true)
find . -name "*.log" -size +10M
# OR (either condition true)
find . \( -name "*.log" -o -name "*.txt" \)
# NOT (negation)
find . ! -name "*.log"
find . -not -name "*.log"
# Complex: .log files OR .txt files, but NOT in .git
find . \( -name "*.log" -o -name "*.txt" \) ! -path "*/.git/*"
# Find files owned by user but NOT in their home
find / -user $USER ! -path "/home/$USER/*" 2>/dev/null

6. Find with Depth Control

# Maximum depth (2 levels down)
find . -maxdepth 2 -name "*.txt"
# Minimum depth 1 (exclude the starting point itself from the results)
find . -mindepth 1 -name "*.txt"
# Exactly at depth 3
find . -mindepth 3 -maxdepth 3 -name "*.txt"
# Current directory only (not recursive)
find . -maxdepth 1 -name "*.txt"

7. Exclude Directories

# Skip .git directories
find . -path "*/.git" -prune -o -name "*.py" -print
# Skip multiple directories
find . \( -path "*/.git" -o -path "*/node_modules" \) -prune -o -name "*.js" -print
# Simpler to read, but find still descends into the excluded directories
find . -name "*.py" ! -path "*/.git/*" ! -path "*/venv/*"
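
The trailing -print in the -prune idiom is not decoration. Without it, the implicit -print covers the whole expression and the pruned directory itself shows up in the output. A sketch in a scratch tree (names are invented):

```shell
d=$(mktemp -d)
mkdir -p "$d/.git" "$d/src"
touch "$d/.git/hook.py" "$d/src/app.py"

# Correct: only src/app.py is printed
with_print=$(find "$d" -path "*/.git" -prune -o -name "*.py" -print | wc -l)
# Missing -print: the .git directory is printed as well
without_print=$(find "$d" -path "*/.git" -prune -o -name "*.py" | wc -l)

echo "with_print=$with_print without_print=$without_print"
rm -rf "$d"
```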

Real-World find Examples

Security: Find SUID/SGID Binaries

# Find SUID binaries (dangerous)
find / -perm -4000 -type f 2>/dev/null
# Find SGID binaries
find / -perm -2000 -type f 2>/dev/null
# Find both SUID and SGID
find / -perm /6000 -type f 2>/dev/null
# List with details
find / -perm /6000 -type f -ls 2>/dev/null
# Compare against known good list
find / -perm /6000 -type f 2>/dev/null > current_suid.txt
diff baseline_suid.txt current_suid.txt

Security: Find World-Writable Files

# World-writable files
find / -perm -002 -type f 2>/dev/null
# World-writable directories
find / -perm -002 -type d 2>/dev/null
# World-writable directories WITHOUT the sticky bit (dangerous)
find / -perm -002 ! -perm -1000 -type d 2>/dev/null
# Files with no owner
find / -nouser -o -nogroup 2>/dev/null

Incident Response: Find Modified Files

# Files modified in last hour
find / -mmin -60 -type f 2>/dev/null
# Files in /etc modified in the last 24 hours
find /etc -mtime 0 -type f
# System binaries modified in last 7 days (suspicious)
find /bin /sbin /usr/bin /usr/sbin -mtime -7 -type f
# Find files in /tmp whose inode changed in the last 24 hours (ctime is change time, the closest proxy for creation)
find /tmp -ctime -1 -type f
# Find files modified after specific file
find . -newer /tmp/timestamp_file

Forensics: Find Files by Pattern and Time

# Find .sh files modified in last 24 hours
find / -name "*.sh" -mtime -1 -type f 2>/dev/null
# Find PHP files uploaded in last hour
find /var/www -name "*.php" -mmin -60 -type f
# Find files with suspicious extensions
find / \( -name "*.php.*" -o -name "*.asp.*" \) -type f 2>/dev/null
# Find hidden files (dotfiles) in web directories
find /var/www -name ".*" -type f 2>/dev/null
# Find executable files in tmp
find /tmp -type f -executable 2>/dev/null

Compliance: Find Files by Owner and Permissions

# Files owned by terminated user
find / -user oldemployee 2>/dev/null
# Files with passwords in the name
find / -iname "*password*" -o -iname "*passwd*" 2>/dev/null
# SSH keys on system
find / -name "id_rsa" -o -name "id_dsa" 2>/dev/null
# Find files readable by everyone but shouldn't be
find /etc -perm -004 -type f 2>/dev/null
# Files with no group owner
find / -nogroup -type f 2>/dev/null

Cleanup: Disk Space Management

# Find largest files on system
find / -type f -printf "%s\t%p\n" 2>/dev/null | sort -rn | head -50
# Find old log files
find /var/log -name "*.log" -mtime +90 -type f
# Find empty files
find . -type f -empty
# Find duplicate files by size (first pass)
find . -type f -printf "%15s\t%p\n" | sort -n | uniq -w15 -D
# Find old backups
find /backup -name "*.tar.gz" -mtime +180 -type f
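
The size-based duplicate search above is only a first pass, since equal size does not mean equal content. A possible second pass compares checksums; this sketch assumes GNU coreutils (sha256sum, uniq -w and -D) and uses invented file names:

```shell
d=$(mktemp -d)
printf 'same\n' > "$d/a"
printf 'same\n' > "$d/b"
printf 'diff\n' > "$d/c"    # same size as a and b, different content

# A SHA-256 hex digest is 64 characters, so compare only that prefix
dupes=$(find "$d" -type f -exec sha256sum {} + | sort | uniq -w64 -D)
dupe_count=$(printf '%s\n' "$dupes" | wc -l)

printf '%s\n' "$dupes"
echo "files with identical content: $dupe_count"
rm -rf "$d"
```

On a big tree, hash only the files the size pass flagged; hashing everything is the slow part.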

find Performance Tips

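# Put cheap name tests before tests that may need a stat() call
# (GNU find's optimizer usually reorders these for you, so the gain is small)
# ❌ POTENTIALLY SLOWER
find / -type f -name "*.log" 2>/dev/null
# ✅ PREFERRED
find / -name "*.log" -type f 2>/dev/null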
# Use -prune to skip directories early
# ❌ SLOW (searches all of /proc)
find / -name "*.txt" 2>/dev/null
# ✅ FASTER (skips /proc entirely)
find / -path "/proc" -prune -o -name "*.txt" -print 2>/dev/null
# Limit search depth
find . -maxdepth 3 -name "*.txt"
# Use -print0 with xargs -0 for safety and speed
find . -name "*.log" -print0 | xargs -0 grep "error"

Common find Mistakes

# ❌ WRONG: Forgot quotes with wildcards
find . -name *.txt
# ✅ RIGHT: Quote the pattern
find . -name "*.txt"
# ❌ WRONG: Wrong syntax for OR
find . -name "*.txt" -o "*.log"
# ✅ RIGHT: Use -o correctly
find . \( -name "*.txt" -o -name "*.log" \)
# ❌ WRONG: exec without \;
find . -name "*.tmp" -exec rm {}
# ✅ RIGHT: Terminate with \;
find . -name "*.tmp" -exec rm {} \;
# ❌ WRONG: Using -delete carelessly
find / -name "*.tmp" -delete
# ✅ RIGHT: Test first with -print
find / -name "*.tmp" -print

locate - Instant Filename Lookup

What is locate?

locate searches a pre-built database of filenames for instant results.

Speed comparison:

  • find / -name "*.txt" → Minutes

  • locate "*.txt" → Milliseconds

Trade-off: Database must be updated regularly.

Basic locate Usage

# Find files by name
locate filename
# Case insensitive
locate -i filename
# Count results
locate -c "*.log"
# Show only existing files
locate -e "pattern"
# Limit results
locate -l 20 "pattern"
# Use regex instead of pattern
locate -r "pattern"
# Show statistics
locate -S

Update the Database

# Update database (as root)
sudo updatedb
# Update and see progress
sudo updatedb -v
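# Index only one subtree into a separate database
# (-U without -o would replace the system database with just that subtree)
sudo updatedb -U /path/to/scan -o /tmp/scan.db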

Critical locate Patterns

# Find by exact filename
locate -b '\filename.txt'
# Find in specific directory
locate "/etc/*.conf"
# Match by regex, skipping entries that no longer exist on disk
locate -e --regex ".*\.log$"
# Case insensitive search
locate -i readme
# Exclude paths
locate "*.txt" | grep -v "/tmp/"
# Count files by type
locate "*.sh" | wc -l
# Find and execute
locate "*.pdf" | head -10 | xargs -I {} ls -lh {}

locate vs find

Use locate when:

  • ✅ Searching by filename only

  • ✅ Need instant results

  • ✅ Searching entire filesystem

  • ✅ Database is up to date

Use find when:

  • ✅ Need to search by size, time, permissions

  • ✅ Need to execute commands on results

  • ✅ Need real-time results

  • ✅ Searching specific directory trees

  • ✅ locate database doesn’t exist or is stale

locate Limitations

# ❌ locate database may be out of date
locate newly_created_file.txt
# ✅ Solution: Update database first
sudo updatedb
locate newly_created_file.txt
# ❌ locate doesn't search by file attributes
locate -size +100M
# ✅ Solution: Use find for attribute searches
find / -size +100M 2>/dev/null

xargs - Pipeline Power Multiplication

What is xargs?

xargs builds and executes command lines from standard input.

Why it’s critical:

  • Enables parallel processing

  • Handles awkward filenames (spaces, quotes) safely when fed NUL-delimited input with -0

  • More efficient than -exec in many cases

  • Foundation of advanced pipelines

Basic xargs Syntax

command1 | xargs [OPTIONS] command2

Essential xargs Patterns

1. Basic Usage

# Find and remove
find . -name "*.tmp" | xargs rm
# Find and grep
find . -name "*.log" | xargs grep "error"
# Count lines in multiple files
find . -name "*.txt" | xargs wc -l

2. Handle Filenames with Spaces

# ❌ WRONG: Breaks on spaces
find . -name "*.txt" | xargs rm
# ✅ RIGHT: Use null delimiter
find . -name "*.txt" -print0 | xargs -0 rm
# Always use -print0 and -0 together
find . -type f -print0 | xargs -0 command
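
To see the breakage rather than take it on faith, this sketch counts how many arguments xargs produces for a single file whose name contains a space (the file name is invented):

```shell
d=$(mktemp -d)
touch "$d/my report.txt"

# Whitespace splitting: the one file arrives as two arguments
split=$(find "$d" -name "*.txt" | xargs -n 1 echo | wc -l)
# NUL-delimited: the name arrives intact as one argument
intact=$(find "$d" -name "*.txt" -print0 | xargs -0 -n 1 echo | wc -l)

echo "split=$split intact=$intact"
rm -rf "$d"
```

With rm in place of echo, the first form would try to delete two paths that are not the file you meant.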

3. Parallel Processing

# Process 4 files in parallel
find . -name "*.log" -print0 | xargs -0 -P 4 gzip
# Use all CPU cores
find . -name "*.jpg" -print0 | xargs -0 -P $(nproc) mogrify -resize 50%
# Parallel with progress (pass the name as $1 so odd filenames stay safe)
find . -name "*.log" -print0 | xargs -0 -P 4 -I {} sh -c 'echo "Processing $1"; gzip "$1"' _ {}

4. Interactive Mode

# Confirm before each command
find . -name "*.tmp" | xargs -p rm

5. Custom Placeholder

# Use {} as placeholder
find . -name "*.txt" -print0 | xargs -0 -I {} cp {} /backup/
# Custom placeholder name (FILE instead of {})
xargs -I FILE sh -c 'echo "Processing $1"; mv "$1" "$1.bak"' _ FILE < filelist.txt
# With complex commands (pass the name as $1 so odd filenames stay safe)
find . -name "*.log" -print0 | xargs -0 -I {} sh -c 'echo "File: $1"; wc -l "$1"' _ {}

6. Limit Arguments per Command

# Max 1 argument per command invocation
find . -name "*.txt" | xargs -n 1 echo "Processing:"
# Max 10 arguments per command
# (avoid tar -czf here: every batch would overwrite the same archive)
find . -name "*.log" | xargs -n 10 ls -l
# Print each command before running it (-t still executes the command)
find . -name "*.txt" | xargs -t echo
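
A quick sketch of how -n batches input: five items with -n 2 become three command invocations (two full batches and one leftover). The numbers are just placeholder input.

```shell
# Each output line corresponds to one invocation of echo
batches=$(printf '%s\n' 1 2 3 4 5 | xargs -n 2 echo | wc -l)
last_batch=$(printf '%s\n' 1 2 3 4 5 | xargs -n 2 echo | tail -1)

echo "invocations: $batches, last batch: $last_batch"
```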

Real-World xargs Examples

Security: Process Multiple Files

# Search for API keys in all Python files
find . -name "*.py" -print0 | xargs -0 grep -H "api_key"
# Check permissions on all scripts
find . -name "*.sh" -print0 | xargs -0 ls -l
# Hash all executables
find /usr/bin -type f -print0 | xargs -0 -P 4 sha256sum > /tmp/hashes.txt
# Scan files for malware signatures
find / -type f -print0 | xargs -0 -P 8 -n 1 clamscan --no-summary

Incident Response: Parallel Analysis

# Grep multiple log files in parallel
find /var/log -name "*.log" -print0 | xargs -0 -P 4 grep "failed login"
# Extract IPs from all logs
find . -name "access*.log" -print0 | xargs -0 grep -oE "([0-9]{1,3}\.){3}[0-9]{1,3}" | sort -u
# Process memory dumps in parallel
find /forensics -name "*.raw" -print0 | xargs -0 -P 2 -I {} volatility -f {} pslist

Bulk Operations

# Compress old logs in parallel
find /var/log -name "*.log" -mtime +7 -print0 | xargs -0 -P 4 gzip
# Move files matching pattern
find . -name "2025-*.log" -print0 | xargs -0 -I {} mv {} /archive/
# Change ownership of files
find /data -user olduser -print0 | xargs -0 chown newuser:newgroup
# Set permissions on scripts
find . -name "*.sh" -print0 | xargs -0 chmod +x

Data Processing

# Download URLs from file
cat urls.txt | xargs -P 10 -I {} wget {}
# Process CSV files
find . -name "*.csv" -print0 | xargs -0 -I {} python3 process.py {}
# Convert images in parallel
find . -name "*.png" -print0 | xargs -0 -P $(nproc) -I {} convert {} {}.jpg
# Checksum verification
find . -type f -print0 | xargs -0 -P 8 md5sum | tee checksums.txt

xargs Performance Tips

# Use -P for parallel processing
find . -name "*.log" -print0 | xargs -0 -P 4 gzip
# Batch arguments for efficiency
find . -name "*.txt" | xargs -n 100 command
# Use -0 with find -print0 (handles all edge cases)
find . -type f -print0 | xargs -0 command
# Preview commands without running them (put echo in front)
find . -name "*.tmp" | xargs echo rm

Common xargs Mistakes

# ❌ WRONG: Not handling spaces in filenames
find . -name "*.txt" | xargs rm
# ✅ RIGHT: Use null delimiter
find . -name "*.txt" -print0 | xargs -0 rm
# ❌ WRONG: Placeholder without -I
find . -name "*.txt" | xargs mv {} /backup/
# ✅ RIGHT: Use -I with placeholder
find . -name "*.txt" | xargs -I {} mv {} /backup/
# ❌ WRONG: Dangerous default behavior
cat filelist.txt | xargs rm
# ✅ RIGHT: Use -r to prevent running on empty input
cat filelist.txt | xargs -r rm
# Or better: Use find
find . -name "pattern" -print0 | xargs -0 rm
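
The empty-input behaviour behind -r is easy to demonstrate. This sketch assumes GNU xargs, which runs the command once even when it receives no input (BSD xargs does not):

```shell
# No input at all, yet the command still runs once
without_r=$(printf '' | xargs echo "ran" | wc -l)
# -r (--no-run-if-empty) skips the command entirely
with_r=$(printf '' | xargs -r echo "ran" | wc -l)

echo "without -r: $without_r run(s), with -r: $with_r run(s)"
```

With rm this is mostly harmless (it just complains about a missing operand), but commands that default to the current directory when given no arguments can do real damage.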

ripgrep (rg) - Grep on Steroids

What is ripgrep?

ripgrep (rg) is a modern, blazingly fast grep alternative written in Rust.

Why it’s amazing:

  • 🚀 Typically much faster than grep -r on large trees (the exact speedup depends on the workload)

  • 🎨 Beautiful colored output by default

  • 🔍 Respects .gitignore automatically

  • 📦 Smart defaults (recursive, skip binaries)

  • ⚡ Parallel by default

Installation:

# Arch
sudo pacman -S ripgrep
# Ubuntu/Debian
sudo apt install ripgrep
# macOS
brew install ripgrep
# From source
cargo install ripgrep

Basic ripgrep Usage

# Search current directory recursively
rg "pattern"
# Search specific file
rg "pattern" file.txt
# Search specific directory
rg "pattern" /var/log/
# Case insensitive
rg -i "pattern"
# Smart case (case-insensitive only when the pattern is all lowercase)
rg -S "pattern"
# Whole word
rg -w "word"
# Show line numbers (default)
rg -n "pattern"
# No line numbers
rg -N "pattern"
# Show context (3 lines before and after)
rg -C 3 "pattern"
# Context before
rg -B 3 "pattern"
# Context after
rg -A 3 "pattern"
# Count matches
rg -c "pattern"
# List files with matches
rg -l "pattern"
# List files without matches
rg --files-without-match "pattern"
# List the files rg would search (no pattern needed)
rg --files
# Search hidden files
rg --hidden "pattern"
# Search .gitignore'd files too
rg --no-ignore "pattern"
# Search by file type
rg -t py "pattern"
rg -t sh "pattern"
rg -t md "pattern"
# Exclude file type
rg -T py "pattern"
# Search specific extension
rg -g "*.log" "pattern"
# Multiple patterns (OR)
rg -e "error" -e "warning"
# Fixed strings (no regex)
rg -F "literal.string"
# Multiline search
rg -U "pattern.*\n.*pattern2"
# Replace (preview)
rg "old" -r "new"
# JSON output
rg --json "pattern"

ripgrep vs grep Comparison

# grep command
grep -r "error" /var/log/ --include="*.log" --color=auto
# ripgrep equivalent (simpler)
rg "error" /var/log/ -t log
# Performance
time grep -r "pattern" large_dir/
time rg "pattern" large_dir/

Critical ripgrep Patterns

1. Smart Search with .gitignore

# Automatically skips hidden paths such as .git/ and anything listed in .gitignore (often node_modules/)
rg "TODO"
# Include ignored files
rg --no-ignore "TODO"
# Also search hidden files
rg --hidden "TODO"
# Search everything (hidden + ignored)
rg --hidden --no-ignore "TODO"

2. File Type Filtering

# List available types
rg --type-list
# Search Python files
rg -t py "def.*main"
# Search shell scripts
rg -t sh '#!/bin/bash'
# Multiple types
rg -t py -t sh "pattern"
# Custom type (add .conf files)
rg --type-add 'config:*.conf' -t config "pattern"
# Glob patterns
rg -g "*.{yml,yaml}" "pattern"
rg -g "!*.min.js" "function"

3. Advanced Context

# Context with separator
rg -C 2 "error"
# Show file path and line number
rg -H -n "pattern"
# Show only matching part
rg -o "error"
# Show statistics
rg --stats "pattern"
# Group by file
rg --heading "pattern"
# No grouping
rg --no-heading "pattern"

4. Multiline and Complex Patterns

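# Multiline search (-U lets a pattern span lines via \n)
rg -U 'function.*\n.*\{' code.js
# Search for BEGIN...END blocks (. only matches newlines with --multiline-dotall)
rg -U --multiline-dotall 'BEGIN.*?END' config.txt
# Negative lookahead (requires PCRE2; single quotes keep bash from expanding the !)
rg -P 'error(?!.*handled)' logs.txt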

Real-World ripgrep Examples

Security: Find Secrets Faster

# Find API keys
rg -i "api[_-]?key\s*[=:]" .
# Find AWS keys
rg "AKIA[0-9A-Z]{16}"
# Find private keys
rg "BEGIN.*PRIVATE KEY"
# Search for passwords (exclude docs)
rg -i "password" -g "!*.md" -g "!*.txt"
# Find JWT tokens
rg "eyJ[A-Za-z0-9_=-]+\."
# Find hardcoded IPs
rg "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" -t py -t sh

Code Search

# Find function definitions (Python)
rg "^def \w+\(" -t py
# Find TODO comments
rg "TODO|FIXME|XXX" -t py -t js -t sh
# Find SQL queries
rg "SELECT.*FROM" -t py
# Find imports
rg "^import |^from .* import" -t py
# Find debugging statements
rg "console\.log|print\(|debugger" -t js -t py

Log Analysis

# Find errors in logs (fast)
rg "error|exception|fatal" -t log
# Extract IPs from logs
rg -o "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" access.log | sort -u
# Find failed logins
rg "failed.*login|authentication.*failed" /var/log/
# Search compressed logs
rg "error" -z *.gz
# Search with context
rg -C 5 "Exception" /var/log/app.log

Incident Response

# Find recently modified PHP files (web shells)
find /var/www -name "*.php" -mtime -1 -print0 | xargs -0 rg "eval|base64_decode|system"
# Search for suspicious patterns
rg 'curl.*sh|wget.*sh|\$\(.*\)' /var/spool/cron/
# Find reverse shells
rg "nc.*-e|bash.*-i|python.*pty" /tmp/ /dev/shm/
# Search process command lines (cmdline is NUL-separated, so force text mode with -a)
rg -a "crypto|miner" /proc/*/cmdline

ripgrep Performance Features

# Parallel search (default, uses all cores)
rg "pattern"
# Limit threads
rg --threads 4 "pattern"
# Memory map files (faster for large files)
rg --mmap "pattern"
# Sort results by file path
rg --sort path "pattern"
# Limit to N matching lines per file
rg -m 100 "pattern"

ripgrep Configuration

# Create ~/.ripgreprc
cat > ~/.ripgreprc << 'EOF'
# Always show line numbers
--line-number

# Smart case by default
--smart-case

# Follow symlinks
--follow

# Always use context
--context=2

# Use better colors
--colors=match:fg:blue
--colors=line:fg:yellow
EOF
# Use config file
export RIPGREP_CONFIG_PATH=~/.ripgreprc

Fallback Strategy: rg → grep

# Smart fallback function
search() {
    if command -v rg &>/dev/null; then
        rg "$@"
    else
        # Translate common rg flags to grep
        grep -r --color=auto "$@"
    fi
}
# Use in scripts
search "error" /var/log/
# An alias cannot branch on its arguments, so keep this as a function
# (add it to ~/.bashrc to make it available in every shell)

Production Server Patterns - Guaranteed Tools Only

The Scenario

You SSH into a minimal server:

  • ❌ No ripgrep

  • ❌ No fd

  • ❌ No modern tools

  • ✅ Only: grep, find, awk, sed, basic tools

  • ⚠️ Emergency response mode

  • 👀 CTO watching your screen

You need to look confident and competent.

Essential Emergency Patterns

Pattern 1: Find Error in Logs (Last Hour)

# Fast method
grep "error" /var/log/app.log | tail -100
# With timestamp filter (if logs have them)
grep "$(date -d '1 hour ago' '+%Y-%m-%d %H')" /var/log/app.log | grep -i "error"
# All logs recursively
find /var/log -type f -name "*.log" -mtime -1 -exec grep -l "error" {} \;

Pattern 2: Find Large Files Filling Disk

# Quick method
du -h /var/log | sort -rh | head -20
# More detailed with find
find /var -type f -size +100M -exec ls -lh {} \; | awk '{print $9, $5}'
# Or sorted
find /var -type f -size +100M -printf "%s\t%p\n" | sort -rn | head -20

Pattern 3: Find Recently Modified Files

# In /etc (config changes)
find /etc -type f -mtime -1 -ls
# In /tmp (suspicious activity)
find /tmp -type f -mmin -60 -ls
# System binaries (compromise detection)
find /bin /sbin /usr/bin /usr/sbin -mtime -7 -type f

Pattern 4: Find Process by Port

# Find what's on port 8080
netstat -tulpn | grep :8080
lsof -i :8080
ss -tulpn | grep :8080
# Find related files (lsof -p wants a comma-separated PID list)
lsof -p "$(lsof -ti :8080 | paste -sd, -)"

Pattern 5: Extract IPs from Logs

# Basic extraction
grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" /var/log/access.log | sort -u
# Top 10 IPs
grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" /var/log/access.log | sort | uniq -c | sort -rn | head -10
# With awk (more reliable)
awk '{print $1}' /var/log/apache2/access.log | sort | uniq -c | sort -rn | head -20
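
Here is the same pipeline run against a tiny sample so each stage's effect is visible. The log lines are fabricated for the demo and assume the common access-log layout where the client IP is the first field.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.5 - - [01/Jan/2025:10:00:00 +0000] "GET / HTTP/1.1" 200 512
10.0.0.5 - - [01/Jan/2025:10:00:01 +0000] "GET /a HTTP/1.1" 404 0
192.168.1.9 - - [01/Jan/2025:10:00:02 +0000] "GET / HTTP/1.1" 200 512
EOF

# sort groups identical IPs, uniq -c counts them, sort -rn ranks by count
top_ip=$(awk '{print $1}' "$log" | sort | uniq -c | sort -rn | head -1 | awk '{print $2}')
unique=$(grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" "$log" | sort -u | wc -l)

echo "busiest client: $top_ip, distinct IPs: $unique"
rm -f "$log"
```

The first sort matters: uniq only collapses adjacent duplicates, so unsorted input gives wrong counts.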

Pattern 6: Find Files Owned by User

# All files owned by user
find / -user username 2>/dev/null
# Files owned by UID (if user deleted)
find / -uid 1001 2>/dev/null
# Files with no owner
find / -nouser -o -nogroup 2>/dev/null

Pattern 7: Search Across Multiple Log Files

# Method 1: find + grep
find /var/log -type f -name "*.log" -exec grep -H "error" {} \;
# Method 2: find + xargs (faster)
find /var/log -type f -name "*.log" -print0 | xargs -0 grep -H "error"
# Method 3: with context
find /var/log -type f -name "*.log" -print0 | xargs -0 grep -C 3 "error"

Pattern 8: Find SUID Binaries

# All SUID/SGID files
find / -perm /6000 -type f -ls 2>/dev/null
# Compare against baseline
find / -perm /6000 -type f 2>/dev/null | sort > /tmp/current_suid.txt
diff /tmp/baseline_suid.txt /tmp/current_suid.txt

Pattern 9: Find World-Writable Files

# World-writable files
find / -perm -002 -type f 2>/dev/null
# World-writable directories without sticky bit
find / -perm -002 ! -perm -1000 -type d 2>/dev/null
# List with details
find / -perm -002 -type f -ls 2>/dev/null | head -50

Pattern 10: Search Process Environment and Command Lines

# Find process
pgrep -a process_name
# Search environment variables
tr '\0' '\n' < /proc/PID/environ | grep "PASS"
# Search command line
tr '\0' ' ' < /proc/PID/cmdline
# Search all processes for pattern
for pid in /proc/[0-9]*; do
    grep -a "pattern" "$pid/cmdline" 2>/dev/null && echo "$pid"
done

Emergency One-Liners (Memorize These)

# Top 10 largest files in /var
find /var -type f -printf "%s\t%p\n" 2>/dev/null | sort -rn | head -10
# Failed SSH logins
grep "Failed password" /var/log/auth.log | awk '{print $(NF-3)}' | sort | uniq -c | sort -rn
# Find files modified in last hour
find / -type f -mmin -60 2>/dev/null | head -50
# Suspicious cron jobs
grep -r "curl.*sh\|wget.*sh" /var/spool/cron/ /etc/cron* 2>/dev/null
# Open network connections
netstat -tupan | grep ESTABLISHED | awk '{print $5}' | cut -d: -f1 | sort -u
# Processes using most CPU (if no top)
ps aux | sort -nrk 3,3 | head -10
# Processes using most memory
ps aux | sort -nrk 4,4 | head -10
# Disk usage by directory
du -h --max-depth=1 / 2>/dev/null | sort -rh | head -20
# Find executable files in tmp
find /tmp /var/tmp /dev/shm -type f -executable 2>/dev/null
# Extract emails from files
grep -rhoE "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b" . | sort -u
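Two caveats on the "Open network connections" one-liner above: `netstat` comes from net-tools, which minimal installs may not ship, and `cut -d: -f1` truncates IPv6 addresses at the first colon. A hedged sketch that prefers `ss` when present and strips only the trailing port; both function names are hypothetical, and the column positions assume the usual `ss -tn` and `netstat -tn` layouts:

```shell
# strip_port: read ADDRESS:PORT pairs on stdin and print only the address.
# Removing the last ":port" (instead of cutting at the first colon)
# keeps IPv6 addresses such as ::1 or [2001:db8::1] intact.
# Input is assumed to always carry a port suffix.
strip_port() {
    sed -e 's/:[0-9*]*$//' -e 's/^\[\(.*\)\]$/\1/'
}

# established_peers: unique remote addresses of established TCP connections.
# Uses ss when it is installed and falls back to netstat otherwise.
established_peers() {
    if command -v ss >/dev/null 2>&1; then
        # With a "state" filter, ss omits the State column: peer is field 4
        ss -tn state established | awk 'NR > 1 {print $4}'
    else
        # netstat -tn: foreign address is field 5, state is field 6
        netstat -tn | awk '$6 == "ESTABLISHED" {print $5}'
    fi | strip_port | sort -u
}
```

Run `established_peers` with no arguments; the address parsing lives in `strip_port` so it can be reused on saved output.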

The "Impress the CTO" Sequence

Scenario: Server is slow, need to diagnose quickly

# 1. Check load average
uptime
# 2. Top processes
ps aux | sort -nrk 3,3 | head -5
# 3. Disk usage
df -h | grep -v "tmpfs\|devtmpfs"
# 4. Largest directories
du -h / --max-depth=1 2>/dev/null | sort -rh | head -10
# 5. Recent errors in logs
find /var/log -type f -mtime -1 -exec grep -l "error\|exception" {} \; | head -5
# 6. Check network connections
netstat -tupan | grep ESTABLISHED | wc -l
# 7. Find large files created recently
find /var -type f -size +100M -mtime -1 -ls 2>/dev/null

The Palpatine Patterns - Unlimited Power

What Are Palpatine Patterns?

Palpatine Patterns are search combinations so powerful they seem like dark magic:

  • ✅ Work on ANY Linux system

  • ✅ Chain multiple tools efficiently

  • ✅ Extract exactly what you need

  • ✅ Look impressive to observers

  • ✅ Solve real problems instantly

These are your "force lightning" attacks.

Pattern 1: Find and Analyze in One Pipeline

# Find all Python files, count functions, show top 10
find . -name "*.py" -print0 | \
  xargs -0 grep -c "^def " | \
  sort -t: -k2 -rn | \
  head -10

Pattern 2: Multi-Stage Log Analysis

# Find IPs with >100 failed logins, resolve hostnames
grep "Failed password" /var/log/auth.log | \
  awk '{print $(NF-3)}' | \
  sort | \
  uniq -c | \
  awk '$1 > 100 {print $2}' | \
  while read ip; do
    echo -n "$ip "
    host $ip 2>/dev/null | awk '{print $NF}'
  done
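The counting half of the pipeline above can be wrapped in a reusable function with the log path and threshold as parameters. This is a sketch, not a definitive implementation: the name `failed_login_ips` is hypothetical, and the `$(NF-3)` field position assumes the OpenSSH message ends in "from <ip> port <n> ssh2".

```shell
# failed_login_ips LOGFILE [THRESHOLD]
# Print "count ip" for every source IP with more than THRESHOLD
# "Failed password" lines in LOGFILE, busiest first (default threshold: 100).
failed_login_ips() {
    local logfile="$1"
    local threshold="${2:-100}"
    grep "Failed password" "$logfile" |
        awk '{print $(NF-3)}' |
        sort | uniq -c |
        awk -v min="$threshold" '$1 > min {print $1, $2}' |
        sort -rn
}

# Usage:
# failed_login_ips /var/log/auth.log 100     # Debian/Ubuntu
# failed_login_ips /var/log/secure 100       # RHEL/CentOS
```

Keeping the count in the output makes the result easier to triage before spending time on reverse DNS lookups.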

Pattern 3: Recursive Security Audit

# Find suspicious files: world- or group-writable, recently modified, executable
find / -type f \( -perm -002 -o -perm -020 \) \
  -mtime -7 \
  -executable \
  2>/dev/null | \
  while IFS= read -r file; do
    echo "=== $file ==="
    ls -l "$file"
    file "$file"
    strings "$file" | grep -E "http|/bin/|chmod" | head -3
  done

Pattern 4: Parallel Log Processing

# Count error lines in up to 20 log files, 4 at a time, and rank the results
# Each worker prints ONE "count file" line, so parallel output cannot interleave
find /var/log -name "*.log" -type f | \
  head -20 | \
  xargs -P 4 -I {} sh -c \
    'printf "%s %s\n" "$(grep -c "error\|exception" "$1")" "$1"' _ {} | \
  sort -rn | \
  head -10

Pattern 5: Deep Content Search with Context

# Find sensitive strings in config files, then show each matching file's details
# Parentheses are required: without them, -type f applies only to "*.conf"
find /etc -type f \( -name "*.conf" -o -name "*.ini" \) 2>/dev/null | \
  xargs -I {} sh -c \
    'grep -iH "password\|secret\|key" "$1" 2>/dev/null && \
     echo "File: $1" && \
     ls -l "$1" && \
     echo "---"' _ {}

Pattern 6: Timeline Reconstruction

# Create timeline of file modifications in last 24 hours
find / -type f -mtime 0 2>/dev/null | \
  while read file; do
    stat -c "%Y %n" "$file"
  done | \
  sort -n | \
  awk '{
    cmd="date -d @"$1" +\"%Y-%m-%d %H:%M:%S\""
    cmd | getline timestamp
    close(cmd)
    print timestamp, $2
  }'
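The loop above spawns one `stat` and one `date` process per file and prints only the first word of each path. A lighter sketch, assuming GNU find (for `-printf`); the function name `timeline` is hypothetical:

```shell
# timeline [DIR]
# List regular files under DIR modified in the last 24 hours, oldest first.
# Assumes GNU find: %T+ prints the mtime as YYYY-MM-DD+HH:MM:SS.fraction,
# which sorts correctly as plain text, and %p keeps the full path
# (including spaces) on the same line.
timeline() {
    local dir="${1:-.}"
    find "$dir" -type f -mtime 0 -printf '%T+ %p\n' 2>/dev/null | sort
}

# Usage:
# timeline /etc
# timeline / | tail -50     # the 50 most recent changes system-wide
```

Because find formats the timestamp itself, the whole timeline is one find process plus one sort.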

Pattern 7: Network Forensics Chain

# Analyze network connections, find associated processes and files
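# Run as root: otherwise netstat shows "-" instead of PID/Program for other users' sockets
netstat -tupan | grep ESTABLISHED | \
  awk '{print $7}' | \
  cut -d/ -f1 | \
  grep -E '^[0-9]+$' | \
  sort -u | \
  while read -r pid; do
    echo "=== PID: $pid ==="
    ps -p "$pid" -o comm=,args=
    lsof -p "$pid" | grep -E "\.so|\.py|\.sh"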
    echo "---"
  done

Pattern 8: Code Complexity Analysis

# Find Python files, count lines and top-level functions, rank by average lines per function
find . -name "*.py" -print0 | \
  xargs -0 -I {} sh -c '
    lines=$(wc -l < "$1")
    funcs=$(grep -c "^def " "$1")
    if [ "$funcs" -gt 0 ]; then
      ratio=$((lines / funcs))
      printf "%s\t%s\t%s\t%s\n" "$ratio" "$lines" "$funcs" "$1"
    fi
  ' _ {} | \
  sort -rn | \
  awk -F"\t" '{printf "%s\t%d lines\t%d funcs\t%d avg\n", $4, $2, $3, $1}' | \
  head -20

Pattern 9: Differential Analysis

# Compare current state against baseline
find / -perm /6000 -type f 2>/dev/null | sort > /tmp/current_suid.txt
# Show new SUID binaries
comm -13 /tmp/baseline_suid.txt /tmp/current_suid.txt | \
  while read file; do
    echo "NEW: $file"
    ls -l "$file"
    md5sum "$file"
    file "$file"
    echo "---"
  done
# Show removed SUID binaries
comm -23 /tmp/baseline_suid.txt /tmp/current_suid.txt | \
  while read file; do
    echo "REMOVED: $file"
  done
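The comparison above needs a baseline file that was captured earlier and sorted the same way as the current listing. A minimal sketch of both halves, assuming GNU find (for `-perm /6000`); the function names are hypothetical:

```shell
# suid_snapshot [ROOT]
# Print a sorted list of SUID/SGID regular files under ROOT (default: /).
# LC_ALL=C keeps the sort order identical to what comm expects below.
suid_snapshot() {
    find "${1:-/}" -perm /6000 -type f 2>/dev/null | LC_ALL=C sort
}

# suid_changes BASELINE CURRENT
# Label entries that appear only in CURRENT as NEW and entries that
# appear only in BASELINE as REMOVED. Both files must be sorted.
suid_changes() {
    local baseline="$1" current="$2"
    LC_ALL=C comm -13 "$baseline" "$current" | sed 's/^/NEW: /'
    LC_ALL=C comm -23 "$baseline" "$current" | sed 's/^/REMOVED: /'
}

# Usage:
# suid_snapshot / > /tmp/baseline_suid.txt      # once, on a known-good system
# suid_snapshot / > /tmp/current_suid.txt       # later, during an audit
# suid_changes /tmp/baseline_suid.txt /tmp/current_suid.txt
```

The /tmp paths mirror the examples above. Since /tmp is world-writable, a baseline kept there can be altered by any local user, so a root-only location is a safer home for it.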

The "I need to find ANYTHING on this server" pattern:

# Comprehensive search function
palpatine_search() {
    local pattern="$1"
    local depth="${2:-3}"

    echo "=== Searching for: $pattern ==="
    echo

    # Search filenames
    echo "## Filenames:"
    find / -maxdepth $depth -iname "*$pattern*" 2>/dev/null | head -10
    echo

    # Search file contents
    echo "## File contents:"
    find / -maxdepth "$depth" -type f -print0 2>/dev/null | \
      xargs -0 grep -l -- "$pattern" 2>/dev/null | head -10
    echo

    # Search running processes
    echo "## Processes:"
    ps aux | grep -i "$pattern" | grep -v grep
    echo

    # Search network connections
    echo "## Network:"
    netstat -tupan 2>/dev/null | grep -i "$pattern"
    echo

    # Search environment variables
    echo "## Environment:"
    env | grep -i "$pattern"
    echo

    # Search command history (if accessible)
    echo "## History:"
    history | grep -i "$pattern" | tail -5 2>/dev/null

    echo "=== Search complete ==="
}
# Usage
palpatine_search "password" 2
palpatine_search "192.168" 3

This searches:

  • Filenames

  • File contents

  • Running processes

  • Network connections

  • Environment variables

  • Command history

In one function. Unlimited power.


Summary & Quick Reference

Decision Tree: Which Tool to Use?

Need to search?
│
├─ By filename?
│  │
│  ├─ Fast, whole filesystem (needs updatedb index) → locate
│  ├─ With attributes → find
│  └─ Modern + fast → fd
│
└─ By content?
   │
   ├─ Modern tools available?
   │  └─ Yes → ripgrep (rg)
   │
   └─ No → grep
      │
      ├─ Simple pattern → grep -r
      ├─ Complex regex → grep -E
      └─ Fixed string → grep -F
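The "Fixed string → grep -F" branch matters more than it looks: in a regex, "." matches any character, so searching for an IP prefix can return false positives. A small demonstration on made-up sample data:

```shell
# Three sample lines; only the first is a real 192.168.x.x address.
sample='192.168.1.10
192x168y1z10
10.0.0.1'

# Default grep treats "." as "any character", so the first two lines match.
regex_hits=$(printf '%s\n' "$sample" | grep -c "192.168")

# grep -F treats the pattern as a literal string, so only the real address matches.
fixed_hits=$(printf '%s\n' "$sample" | grep -cF "192.168")

echo "regex: $regex_hits, fixed: $fixed_hits"
# prints: regex: 2, fixed: 1
```

Escaping the dots ("192\.168") gives the same result as -F here; -F is simply harder to get wrong under pressure.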

The Essential Commands (Commit to Muscle Memory)

# Search content recursively
grep -r "pattern" .
rg "pattern"
# Find files by name
find . -name "*.txt"
fd -e txt          # fd patterns are regex by default; use -e for extensions or -g for globs
# Find large files
find . -type f -size +100M
fd -t f -S +100M
# Search with context
grep -C 3 "error" file.log
rg -C 3 "error" file.log
# Find and execute
find . -name "*.tmp" -delete
fd -e tmp -x rm
# Parallel processing
find . -name "*.log" -print0 | xargs -0 -P 4 gzip
fd -e log -x gzip

Fallback Strategy Summary

Tier 1 (Try first):

rg "pattern"
fd -e py
fzf

Tier 2 (Fallback):

grep -r "pattern" .
find . -name "*.py"

Tier 3 (Emergency):

grep "pattern" file
find / -name "filename"
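The tiers above can be folded into shell functions that pick the best available tool at call time. A hedged sketch: the function names are hypothetical, and on Debian/Ubuntu the fd binary is installed as `fdfind`, which this sketch does not check for.

```shell
# search_content PATTERN [DIR]
# Recursive content search: use rg when installed, otherwise grep -r.
# Output format in both cases: path:line:text
search_content() {
    local pattern="$1"
    local dir="${2:-.}"
    if command -v rg >/dev/null 2>&1; then
        rg --no-heading --line-number -- "$pattern" "$dir"
    else
        grep -rn -- "$pattern" "$dir" 2>/dev/null
    fi
}

# find_by_ext EXT [DIR]
# List files with the given extension: use fd when installed, otherwise find.
find_by_ext() {
    local ext="$1"
    local dir="${2:-.}"
    if command -v fd >/dev/null 2>&1; then
        fd --type f --extension "$ext" . "$dir"
    else
        find "$dir" -type f -name "*.$ext"
    fi
}

# Usage:
# search_content "error" /var/log
# find_by_ext py ~/projects
```

The two tiers are not perfectly equivalent: rg and fd skip hidden files and honor .gitignore by default, while grep -r and find look at everything.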

The Confidence Builders

Practice these until they’re automatic:

# 1. Find errors in logs (most recent 100 matches)
grep "error" /var/log/app.log | tail -100
# 2. Find large files
find /var -type f -size +100M -exec ls -lh {} \;
# 3. Search all config files
find /etc -name "*.conf" -exec grep -H "pattern" {} \;
# 4. Find recently modified files
find /etc -type f -mtime -1 -ls
# 5. Extract IPs from logs
grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" access.log | sort -u
# 6. Find SUID binaries
find / -perm /6000 -type f 2>/dev/null
# 7. Parallel grep
find . -name "*.log" -print0 | xargs -0 -P 4 grep "error"
# 8. Find and count
find . -name "*.py" | xargs grep -c "TODO" | sort -t: -k2 -rn | head -10
# 9. Timeline of changes
find / -mtime 0 2>/dev/null | while read f; do stat -c "%y %n" "$f"; done | sort
# 10. The ultimate search
grep -r "pattern" / 2>/dev/null | head -50

When to Use What

Situation               Traditional            Modern

Quick filename search   find . -name           fd
Content search          grep -r                rg
Interactive selection   find | while read      fzf
Large codebase          grep -r --include      rg -t
Log analysis            grep | awk             rg | jq
Production server       find + grep            Not available
Emergency response      grep + find + xargs    Hope for best
Impress colleagues      Modern stack           rg + fd + fzf


The Path to Search Mastery

Padawan (Beginner):

  • Master basic grep flags: -r, -i, -n, -C

  • Master basic find: -name, -type, -mtime

  • Understand stdin/stdout/pipes

Knight (Intermediate):

  • Master regex (BRE vs ERE vs PCRE)

  • Master find expressions: -and, -or, -not, -exec

  • Chain tools with xargs

  • Install and use ripgrep, fd

Master (Advanced):

  • Build complex pipelines

  • Parallel processing with xargs -P

  • Performance optimization

  • Fallback strategies

Palpatine (Expert):

  • Find anything, anywhere, instantly

  • Impressive one-liners under pressure

  • Teach others

  • Contribute patterns to the community

"Good. Use your aggressive feelings. Let the search flow through you. Now… UNLIMITED POWER!"

 — Emperor Palpatine (probably talking about rg + fd + fzf)


Conclusion

You now possess the complete arsenal:

Traditional Tools (Work Anywhere):

  • ✅ grep - Pattern matching mastery

  • ✅ find - File discovery mastery

  • ✅ xargs - Pipeline power

  • ✅ awk/sed - Text processing

Modern Tools (Work When Available):

  • ✅ ripgrep - Speed and beauty

  • ✅ fd - User-friendly finding

  • ✅ fzf - Interactive selection

Real-World Skills:

  • ✅ Production server confidence

  • ✅ Incident response patterns

  • ✅ Security forensics

  • ✅ Log analysis

  • ✅ Compliance auditing

The Mindset:

  • ✅ Prefer modern, survive with traditional

  • ✅ Build muscle memory for both

  • ✅ Fallback strategies always ready

  • ✅ Look confident in any environment

Remember:

  • Practice daily

  • Build your own patterns

  • Share knowledge

  • Stay curious

You are now a Search Master.

May the Force (and regex) be with you.



Document Version: 1.0.0
Zettelkasten ID: 2026-LNX-023
Last Updated: 2026-01-11
Author: Evan Rosado (evanusmodestus)
Email: evan.rosado@outlook.com
License: CC-BY-SA-4.0
Location: ~/atelier/_bibliotheca/Principia/02_Assets/ARS-LINUX/


End of Document

Unlimited Power Achieved. 🔥⚡