grep

grep searches files and input streams for lines that match a pattern. Learn the three regex flavors and the context options to locate matches quickly.

Basic Syntax

grep [options] pattern [file...]

Regex Flavors

Flavor           Flag      Features
BRE (Basic)      default   . * ^ $ [] work; ( ) { } + ? | are literal
ERE (Extended)   -E        ( ) { } + ? | also work as metacharacters
PCRE (Perl)      -P        Lookahead, lookbehind, \d, \w

Use -E for modern regex. Use -P for advanced features like lookaround; -P is a GNU grep option and needs a build with PCRE support.
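To make the flavor difference concrete, the sketch below runs one pattern under BRE and under ERE. The sample file name and its two lines are made up for illustration.

```shell
# Hypothetical sample data: one line with repeated letters, one with a literal "+".
tmp=$(mktemp -d)
printf 'google\ngo+gle\n' > "$tmp/flavors.txt"

# BRE (default): "+" is an ordinary character, so only the literal "go+gle" line matches.
bre_out=$(grep 'go+gle' "$tmp/flavors.txt")

# ERE (-E): "+" means "one or more", so "google" matches and the literal line does not.
ere_out=$(grep -E 'go+gle' "$tmp/flavors.txt")

echo "BRE: $bre_out"
echo "ERE: $ere_out"
rm -rf "$tmp"
```

The same pattern text selects different lines depending on the flag, which is why it helps to pick a flavor deliberately rather than rely on the default.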

Pattern Matching

Literal String

grep "error" /var/log/syslog

Case Insensitive

grep -i "error" file.log

Whole Word

grep -w "error" file.log    # Matches "error" not "errors"

Whole Line

grep -x "exact line" file    # Line must match exactly

Fixed Strings (No Regex)

grep -F "192.168.1.1" file   # Treat . as literal
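As a rough illustration of why -F matters for strings such as IP addresses, the sketch below compares a regex search with a fixed-string search. The file name and its contents are invented for the example.

```shell
# Hypothetical sample data: a real address and a look-alike string.
tmp=$(mktemp -d)
printf '192.168.1.1\n192x168y1z1\n' > "$tmp/hosts.txt"

# Without -F each "." matches any character, so both lines match.
regex_count=$(grep -c '192.168.1.1' "$tmp/hosts.txt")

# With -F the dots are literal, so only the real address matches.
fixed_count=$(grep -cF '192.168.1.1' "$tmp/hosts.txt")

echo "regex: $regex_count, fixed: $fixed_count"
rm -rf "$tmp"
```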

Regular Expressions (ERE)

Metacharacters

# Any character
grep -E "err.r" file         # error, err0r

# Start of line
grep -E "^ERROR" file

# End of line
grep -E "failed$" file

# Or
grep -E "error|warning" file

# Optional
grep -E "colou?r" file       # color or colour

# One or more
grep -E "go+gle" file        # gogle, google, gooogle

# Zero or more
grep -E "go*gle" file        # ggle, gogle, google

# Repetition
grep -E "a{3}" file          # aaa
grep -E "a{2,4}" file        # aa, aaa, aaaa

Character Classes

# Digits
grep -E "[0-9]+" file

# Letters
grep -E "[a-zA-Z]+" file

# Alphanumeric plus underscore (word characters)
grep -E "[a-zA-Z0-9_]+" file

# NOT these characters
grep -E "[^0-9]" file

POSIX Classes

grep -E "[[:digit:]]+" file     # 0-9
grep -E "[[:alpha:]]+" file     # a-zA-Z
grep -E "[[:alnum:]]+" file     # a-zA-Z0-9
grep -E "[[:space:]]" file      # whitespace
grep -E "[[:upper:]]" file      # A-Z
grep -E "[[:lower:]]" file      # a-z
grep -E "[[:punct:]]" file      # punctuation

Groups and Backreferences

# Capturing group
grep -E "(error|warning): (.*)" file

# Backreference
grep -E "(word).*\1" file      # word appears twice
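A common use of backreferences is spotting accidentally doubled words. The sketch below assumes GNU grep, since backreferences and \b in ERE are GNU extensions rather than POSIX features; the sample text is invented.

```shell
# Hypothetical sample data: two lines contain a doubled word, one does not.
tmp=$(mktemp -d)
printf 'the the quick fox\nthe quick fox\nis is not\n' > "$tmp/dups.txt"

# ([a-z]+) captures a word; \1 requires the same text again after one space.
# \b anchors both ends so partial words do not count.
dups=$(grep -E '\b([a-z]+) \1\b' "$tmp/dups.txt")
dup_count=$(grep -cE '\b([a-z]+) \1\b' "$tmp/dups.txt")

echo "$dups"
echo "lines with doubled words: $dup_count"
rm -rf "$tmp"
```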

PCRE Features (-P)

Shorthand Classes

grep -P "\d+" file             # Digits [0-9]
grep -P "\w+" file             # Word chars [a-zA-Z0-9_]
grep -P "\s+" file             # Whitespace
grep -P "\S+" file             # Non-whitespace
grep -P "\b\w+\b" file         # Word boundaries

Lookahead

# Positive lookahead (followed by)
grep -P "user(?=\d+)" file     # "user" followed by digits

# Negative lookahead (not followed by)
# Single quotes keep interactive bash from history-expanding "!"
grep -P 'user(?!\d)' file      # "user" NOT followed by digit

Lookbehind

# Positive lookbehind (preceded by)
grep -oP "(?<=user=)\w+" file  # Word after "user="

# Negative lookbehind
# Single quotes keep interactive bash from history-expanding "!"
grep -P '(?<!root)@' file      # @ not preceded by "root"

Non-Greedy

# Greedy (default) - matches longest
grep -oP "<.*>" file           # <tag>content</tag>

# Non-greedy - matches shortest
grep -oP "<.*?>" file          # <tag>
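The greedy and non-greedy behavior can be checked on a single line of input. This is a sketch that assumes GNU grep with PCRE support for -P; the sample line is made up, and the last command shows a negated character class as one possible alternative when -P is unavailable.

```shell
# Hypothetical one-line sample containing two tag pairs.
line='<b>bold</b> and <i>italic</i>'

# Greedy: .* runs to the last ">" on the line, so there is one long match.
greedy=$(printf '%s\n' "$line" | grep -oP '<.*>')

# Non-greedy: .*? stops at the first ">", so each tag is a separate match.
lazy_first=$(printf '%s\n' "$line" | grep -oP '<.*?>' | sed -n '1p')

# Alternative without -P: a negated class cannot run past a ">".
tag_count=$(printf '%s\n' "$line" | grep -oE '<[^>]*>' | wc -l | tr -d ' ')

echo "greedy:     $greedy"
echo "lazy first: $lazy_first"
echo "tags:       $tag_count"
```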

Output Control

Show Only Matching Part

grep -o "error" file
grep -oE "[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+" file   # Extract IPs

Count Matches

grep -c "error" file           # Lines with matches
grep -o "error" file | wc -l   # Total occurrences
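The two counts differ whenever a line holds more than one match. A small sketch with an invented log file shows the gap:

```shell
# Hypothetical sample data: "error" appears three times on two lines.
tmp=$(mktemp -d)
printf 'error error\nok\nerror\n' > "$tmp/app.log"

# -c counts matching lines, not individual matches.
line_count=$(grep -c 'error' "$tmp/app.log")

# -o prints every match on its own line, so wc -l counts occurrences.
hit_count=$(grep -o 'error' "$tmp/app.log" | wc -l | tr -d ' ')

echo "lines: $line_count, occurrences: $hit_count"
rm -rf "$tmp"
```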

Line Numbers

grep -n "error" file           # Show line numbers

Filenames Only

grep -l "error" *.log          # Files containing pattern
grep -L "error" *.log          # Files NOT containing pattern

Suppress Filename

grep -h "error" *.log          # No filename prefix

With Filename (force)

grep -H "error" file.log       # Always show filename

Context

Lines After

grep -A3 "error" file          # 3 lines after match

Lines Before

grep -B3 "error" file          # 3 lines before match

Lines Around (Context)

grep -C3 "error" file          # 3 lines before AND after
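Combining context with -n makes it easier to tell matches from surrounding lines: grep separates the line number from a matching line with ":" and from a context line with "-". The sketch below uses an invented five-line log.

```shell
# Hypothetical sample log with one error in the middle.
tmp=$(mktemp -d)
printf 'start\nconnect\nerror: timeout\nretry\nend\n' > "$tmp/session.log"

# One line of context on each side, with line numbers.
ctx=$(grep -n -C1 'error' "$tmp/session.log")

echo "$ctx"
rm -rf "$tmp"
```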

Inversion

Invert Match

grep -v "debug" file           # Lines NOT containing "debug"

Combine with Other Options

grep -v "^#" file | grep -v "^$"   # No comments, no empty lines
grep -vE "^(#|$)" file             # Same, one command
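The one-command form above only drops comments that start in column one and lines that are completely empty. If a config file may contain indented comments or whitespace-only lines, one possible variant allows leading whitespace, as sketched here on invented sample data.

```shell
# Hypothetical config with an indented comment and a whitespace-only line.
tmp=$(mktemp -d)
printf '# comment\n\nkey=value\n  # indented\n   \nport=80\n' > "$tmp/app.conf"

# Strict form: keeps the indented comment and the whitespace-only line.
strict_count=$(grep -cvE '^(#|$)' "$tmp/app.conf")

# Looser form: optional leading whitespace before "#" or end of line.
loose=$(grep -vE '^[[:space:]]*(#|$)' "$tmp/app.conf")

echo "strict keeps $strict_count lines"
echo "$loose"
rm -rf "$tmp"
```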

Recursive Search

Basic Recursive

grep -r "pattern" directory/

With Include/Exclude

# Only certain files
grep -r --include="*.py" "import" .

# Exclude files
grep -r --exclude="*.log" "error" .

# Exclude directories
grep -r --exclude-dir=".git" "TODO" .
grep -r --exclude-dir={.git,node_modules} "pattern" .

Binary Files

# Treat as text
grep -a "pattern" binary_file

# Suppress binary matches
grep -I "pattern" files        # Skip binary files

# Print the file name if the binary file contains a match
grep -l "pattern" binary_file

Multiple Patterns

OR (Any Pattern)

# Multiple -e flags
grep -e "error" -e "warning" file

# ERE alternation
grep -E "error|warning" file

# From file
grep -f patterns.txt file

AND (All Patterns)

# Chain greps
grep "error" file | grep "fatal"

# Or use awk
awk '/error/ && /fatal/' file

Performance

Fixed Strings (-F)

# Faster for literal strings
grep -F "192.168.1.1" file     # No regex parsing

First Match Only

grep -m1 "pattern" file        # Stop after first match

Quiet Mode

# Just check existence
if grep -q "pattern" file; then
    echo "found"
fi
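Quiet mode works because grep reports its result through the exit status: 0 when a line matched, 1 when nothing matched, and a value greater than 1 on error. The sketch below captures all three cases; the file names and contents are invented, and -s is used only to silence the message about the missing file.

```shell
# Hypothetical health file; "missing.txt" deliberately does not exist.
tmp=$(mktemp -d)
printf 'status=ok\n' > "$tmp/health.txt"

found=0;   grep -q  'status=ok'  "$tmp/health.txt"  || found=$?
missing=0; grep -q  'status=bad' "$tmp/health.txt"  || missing=$?
failed=0;  grep -qs 'status=ok'  "$tmp/missing.txt" || failed=$?

echo "match: $found, no match: $missing, error: $failed"
rm -rf "$tmp"
```

Scripts that need to tell "no match" apart from "could not read the file" can test for status 1 explicitly instead of treating every non-zero status the same way.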

Infrastructure Patterns

Find IP Addresses

grep -oE "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" file

# PCRE version
grep -oP "\b\d{1,3}(\.\d{1,3}){3}\b" file
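Both patterns above accept any one to three digits per octet, so strings such as 999.1.1.1 also match. If that matters, one possible tightening is an alternation that only allows 0 to 255. The sketch below assumes GNU grep (for \b in ERE); the variable name octet and the sample addresses are illustrative.

```shell
# Hypothetical sample data: two valid addresses and two out-of-range ones.
tmp=$(mktemp -d)
printf '10.0.0.1\n999.1.1.1\n192.168.1.256\n172.16.254.3\n' > "$tmp/addrs.txt"

# One octet: 250-255, 200-249, 100-199, or 0-99.
octet='(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])'

# Four octets separated by dots, bounded by word boundaries.
valid=$(grep -oE "\b$octet(\.$octet){3}\b" "$tmp/addrs.txt")

echo "$valid"
rm -rf "$tmp"
```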

Find MAC Addresses

grep -oE "([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}" file

Find Email Addresses

grep -oE "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}" file

Find URLs

grep -oE "https?://[^[:space:]]+" file

Find Config Values

# Value after key=
grep -oP "(?<=DB_HOST=).*" .env

# YAML value
grep -oP "(?<=host: ).*" config.yaml
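A lookbehind works when the text before the value is fixed. When spacing around the separator varies, \K (which discards everything matched so far from the output) is one alternative. The sketch below assumes GNU grep with PCRE support; the file name and keys are invented.

```shell
# Hypothetical env-style file with inconsistent spacing around "=".
tmp=$(mktemp -d)
printf 'DB_HOST=db.internal\nDB_PORT = 5432\n' > "$tmp/sample.env"

# Fixed prefix: a lookbehind is enough.
host=$(grep -oP '(?<=DB_HOST=).*' "$tmp/sample.env")

# Variable whitespace: match the prefix normally, then drop it with \K.
port=$(grep -oP '^DB_PORT\s*=\s*\K.*' "$tmp/sample.env")

echo "host=$host port=$port"
rm -rf "$tmp"
```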

Auth Failures in Logs

grep -E "(failed|denied|invalid)" /var/log/auth.log

Find ISE Auth Failures

grep "auth_result.*FAILED" ise_radius.log
grep -oP "user_name=\K[^,]+" ise_radius.log | sort | uniq -c

Find MITRE Techniques

grep -oE "T[0-9]{4}(\.[0-9]{3})?" file   # T1059, or sub-technique T1059.001

Find JSON Field

grep -oP '"severity"\s*:\s*"\K[^"]+' file.json

Combining with Other Tools

grep → awk

# Filter then process
grep "ERROR" log | awk '{print $1, $NF}'

grep → sed

# Filter then transform
grep "pattern" file | sed 's/old/new/'

grep → sort → uniq

# Count unique matches
grep -oE "[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+" log | sort | uniq -c | sort -rn

find → grep

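find . -name "*.log" -exec grep -l "error" {} +    # "+" batches files into fewer grep calls

# Or with xargs (null-delimited so file names with spaces survive)
find . -name "*.log" -print0 | xargs -0 grep -l "error"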

Practice Exercises

Exercise 1: Count Errors by Type

grep -oE "(ERROR|WARN|INFO)" app.log | sort | uniq -c

Exercise 2: Extract User from Log

grep -oP "user=\K\w+" auth.log | sort -u

Exercise 3: Find Config Issues

grep -rnE "TODO|FIXME|XXX" --include="*.adoc" .

Exercise 4: Network Connections

# ss abbreviates the state as ESTAB (netstat prints ESTABLISHED)
ss -tn | grep ESTAB | awk '{print $5}' | grep -oE "[0-9.]+:" | sort | uniq -c

Quick Reference

# Flags
-i          # Case insensitive
-v          # Invert match
-c          # Count matching lines
-l          # Files with matches
-L          # Files without matches
-n          # Line numbers
-o          # Only matching part
-E          # Extended regex
-P          # Perl regex
-F          # Fixed strings
-w          # Whole word
-x          # Whole line
-r          # Recursive
-A N        # N lines after
-B N        # N lines before
-C N        # N lines context
-m N        # Stop after N matching lines
-q          # Quiet (exit code only)

Key Takeaways

  1. -E for modern regex (ERE)

  2. -P for lookaround (PCRE)

  3. -o to extract matches

  4. -r --include for recursive search

  5. -v to invert (show non-matches)

  6. -A/-B/-C for context

  7. -F for faster literal search

Next Module

sed - Stream editing and text transformation.