Regex CLI Workout

Stop reading theory. Start typing commands. This workout uses your terminal as the classroom.

First: Understand Your Grep Flags

This is the source of your confusion. There are THREE regex "languages" and grep speaks all of them:

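Flag    Mode            Features                                          When to Use
------  --------------  ------------------------------------------------  --------------------------------
(none)  BRE (Basic)     Metacharacters need escaping: \+, \?, \|, \(\)    Legacy scripts, POSIX compliance
-E      ERE (Extended)  Modern syntax: +, ?, |, () work without escaping  Default choice for most work
-P      PCRE (Perl)     Full power: \d, \w, \s, lookahead, lookbehind     Complex patterns, extraction

Note: \+, \?, and \| in BRE are GNU extensions; strict POSIX BRE treats + and ? as literals and has no alternation.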

The Rule

Use grep -E by default. Use grep -P when you need \d, \w, \s, or lookaround. Note that -P is a GNU grep feature; the BSD grep shipped with macOS does not support it.

# Basic grep (BRE) - annoying, avoid this
grep 'one\+' file.txt        # Need backslash before +

# Extended grep (ERE) - your default
grep -E 'one+' file.txt      # + works naturally

# PCRE grep - when you need shorthand classes
grep -P '\d+\.\d+\.\d+\.\d+' file.txt   # \d works
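To see why the flag matters, here is a minimal sketch that runs the same unescaped pattern in BRE and ERE mode. The three sample lines are made up for illustration and are not part of the training file.

```shell
# Hypothetical sample: three lines, none containing a literal "+"
sample=$(printf 'on\none\nonee\n')

# BRE, no backslash: "+" is an ordinary character, so nothing matches.
# grep exits non-zero on zero matches, hence the "|| true".
bre_literal=$(printf '%s\n' "$sample" | grep -c 'one+' || true)

# ERE: "+" is a quantifier, so "one" and "onee" both match.
ere_count=$(printf '%s\n' "$sample" | grep -Ec 'one+')

echo "BRE 'one+': $bre_literal matching line(s)"   # 0
echo "ERE 'one+': $ere_count matching line(s)"     # 2
```

If a pattern silently matches nothing, checking which mode grep is running in is a reasonable first step.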

Setup: Create Your Training File

Copy this entire block and paste into your terminal:

cat << 'EOF' > /tmp/regex-training.txt
# === NETWORK DATA ===
IP: 192.168.1.100
IP: 10.50.1.20
IP: 10.50.1.132
IP: 172.16.0.1
INVALID: 192X168Y1Z100
MAC: AA:BB:CC:DD:EE:FF
MAC: 14:f6:d8:7b:31:80
MAC: 98-BB-1E-1F-A7-13
VLAN 10 - Data
VLAN 100 - Voice
VLAN 999 - Management

# === LOG ENTRIES ===
2026-03-15T10:30:45 [INFO] Server started on port 8080
2026-03-15T10:31:02 [WARN] Low disk space: 15% remaining
2026-03-15T10:31:15 [ERROR] Connection refused to 10.50.1.20:443
2026-03-15T10:32:00 [DEBUG] User evanusmodestus authenticated
2026-03-15T10:33:00 [error] lowercase error level
2026-03-15T10:34:00 [Error] Mixed case error level

# === USER DATA ===
username: admin
username: evanusmodestus
username: root
email: evan@domusdigitalis.dev
email: admin@example.com
email: user.name+tag@domain.co.uk

# === FILE PATHS ===
/etc/ssh/sshd_config
/var/log/syslog
C:\Users\Admin\Documents
C:\Program Files\Application\config.ini

# === PRICES AND MONEY ===
Price: $99.99
Cost: $1,234.56
Total: $0.50

# === CONFIG LINES ===
server=192.168.1.1
port=8080
enabled=true
disabled=false
timeout=30
max_connections=100
EOF

Verify it exists:

wc -l /tmp/regex-training.txt

Expected: 47 /tmp/regex-training.txt

Workout 1: Literal Matching (Warm Up)

Drill                            Command                                  Expected Output
-------------------------------  ---------------------------------------  -----------------------------------------
Find "admin"                     grep 'admin' /tmp/regex-training.txt     2 lines (username: admin, email: admin@…)
Find "ERROR" (case sensitive)    grep 'ERROR' /tmp/regex-training.txt     1 line (uppercase ERROR only)
Find "error" (case insensitive)  grep -i 'error' /tmp/regex-training.txt  3 lines (ERROR, error, Error)
Find lines with "VLAN"           grep 'VLAN' /tmp/regex-training.txt      3 lines (VLAN 10, 100, 999)

Self-check: Run each command. Does your output match?

Workout 2: Escaping Metacharacters

The . matches ANY character. To match a literal dot, escape it with \.

Drill                             Command                                      Expected Output
--------------------------------  -------------------------------------------  --------------------------------------------------
BAD: Find "192.168" (unescaped)   grep -E '192.168' /tmp/regex-training.txt    3 lines (two real addresses plus INVALID with X)
GOOD: Find "192.168" (escaped)    grep -E '192\.168' /tmp/regex-training.txt   2 lines (IP: 192.168.1.100 and server=192.168.1.1)
Find "$99.99" (escape $ and .)    grep -E '\$99\.99' /tmp/regex-training.txt   1 line (Price: $99.99)
Find "[ERROR]" (escape brackets)  grep -E '\[ERROR\]' /tmp/regex-training.txt  1 line

Key insight: Without escaping, . in "192.168" matches the X in "192X168"!
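The effect of escaping can be checked by counting matching lines. The sketch below uses a two-line made-up sample. It also shows grep -F (fixed-string mode), which this workout does not otherwise cover, as an alternative when the whole pattern is literal text.

```shell
# Hypothetical sample: one real address, one look-alike
sample=$(printf 'IP: 192.168.1.100\nINVALID: 192X168Y1Z100\n')

# Unescaped "." matches any character, so both lines match.
unescaped=$(printf '%s\n' "$sample" | grep -Ec '192.168')

# Escaped "\." matches only a literal dot.
escaped=$(printf '%s\n' "$sample" | grep -Ec '192\.168')

# -F treats the whole pattern as a fixed string, so no escaping is needed.
fixed=$(printf '%s\n' "$sample" | grep -Fc '192.168')

echo "unescaped=$unescaped escaped=$escaped fixed=$fixed"   # 2 1 1
```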

Workout 3: Character Classes

[…​] matches any single character in the set.

Drill                 Command                                      Expected Output
--------------------  -------------------------------------------  -------------------------------------------------------------------
Match any digit       grep -E '[0-9]' /tmp/regex-training.txt      Many lines (any line with a number)
Match any hex letter  grep -E '[A-Fa-f]' /tmp/regex-training.txt   Nearly every line (a-f also occur in ordinary words, not just MACs)
Match vowels only     grep -Eo '[aeiou]+' /tmp/regex-training.txt  Vowel sequences (using -o to show matches only)

New flag: -o shows ONLY the matching part, not the whole line.
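A small sketch of the difference -o makes, using one MAC line from the training file as input:

```shell
line='MAC: 14:f6:d8:7b:31:80'

# Without -o: grep prints the whole line because it contains a digit.
whole=$(printf '%s\n' "$line" | grep -E '[0-9]+')

# With -o: grep prints each run of digits on its own line.
digits=$(printf '%s\n' "$line" | grep -Eo '[0-9]+')

echo "$whole"    # MAC: 14:f6:d8:7b:31:80
echo "$digits"   # 14, 6, 8, 7, 31, 80 (one per line)
```

Note that "f6" yields only "6": the character class decides what counts as part of a match, and -o reports exactly that and nothing more.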

Workout 4: Quantifiers

+ = one or more, * = zero or more, ? = zero or one, {n} = exactly n, {n,m} = between n and m

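Drill                               Command                                                            Expected Output
----------------------------------  -----------------------------------------------------------------  ---------------------------------------------------------------
Match numbers (one or more digits)  grep -Eo '[0-9]+' /tmp/regex-training.txt                          All numbers extracted: 192, 168, 1, 100, etc.
Match MAC octets                    grep -Eo '[A-Fa-f0-9]{2}' /tmp/regex-training.txt                  All two-char hex values: AA, BB, 14, f6, etc. (also from IPs and words)
Match IP addresses (basic)          grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' /tmp/regex-training.txt  192.168.1.100, 10.50.1.20, etc.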
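Bounded quantifiers are easiest to understand by watching where matches start and stop. This sketch uses made-up input (a "VLAN 5" line and the string 12345 are not in the training file):

```shell
# {2,3} needs at least two digits, so the single-digit VLAN 5 is skipped.
vlans=$(printf 'VLAN 5\nVLAN 10\nVLAN 100\nVLAN 999\n' | grep -Eo '[0-9]{2,3}')
echo "$vlans"    # 10, 100, 999 (one per line)

# Quantifiers are greedy: grep takes three digits first, then matches
# again on what is left.
chunks=$(echo 12345 | grep -Eo '[0-9]{2,3}')
echo "$chunks"   # 123, then 45
```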

Workout 5: Anchors

^ = start of line, $ = end of line

Drill                         Command                                     Expected Output
----------------------------  ------------------------------------------  -----------------------------------
Lines STARTING with "IP:"     grep -E '^IP:' /tmp/regex-training.txt      4 lines (all IP lines)
Lines STARTING with "#"       grep -E '^#' /tmp/regex-training.txt        Comment lines only
Lines ENDING with "false"     grep -E 'false$' /tmp/regex-training.txt    1 line (disabled=false)
Lines that ARE just a number  grep -E '^[0-9]+$' /tmp/regex-training.txt  Nothing (no lines are ONLY numbers)
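The training file has no line that is only a number, so the last drill cannot show a positive match. A quick sketch with a made-up three-line sample shows how each anchor narrows the result:

```shell
# Hypothetical sample: digits at the end, digits alone, digits in the middle
sample=$(printf 'port=8080\n8080\nx8080y\n')

anywhere=$(printf '%s\n' "$sample" | grep -Ec '[0-9]+')     # 3: every line has digits
at_start=$(printf '%s\n' "$sample" | grep -Ec '^[0-9]+')    # 1: only "8080"
at_end=$(printf '%s\n' "$sample" | grep -Ec '[0-9]+$')      # 2: "port=8080" and "8080"
whole=$(printf '%s\n' "$sample" | grep -Ec '^[0-9]+$')      # 1: only "8080"

echo "anywhere=$anywhere start=$at_start end=$at_end whole=$whole"
```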

Workout 6: PCRE Power (grep -P)

Switch to -P when you need \d, \w, \s:

Shorthand  Meaning         Equivalent
---------  --------------  ---------------
\d         Digit           [0-9]
\D         Non-digit       [^0-9]
\w         Word character  [A-Za-z0-9_]
\W         Non-word        [^A-Za-z0-9_]
\s         Whitespace      [ \t\n\r\f\v]
\S         Non-whitespace  [^ \t\n\r\f\v]

Drill                             Command                                                 Expected Output
--------------------------------  ------------------------------------------------------  ---------------------------
Extract all numbers               grep -Po '\d+' /tmp/regex-training.txt                  Same as [0-9]+ but cleaner
Extract words after "username: "  grep -Po '(?<=username: )\w+' /tmp/regex-training.txt   admin, evanusmodestus, root
Extract log levels                grep -Po '(?<=\[)[A-Z]+(?=\])' /tmp/regex-training.txt  INFO, WARN, ERROR, DEBUG

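Key insight: (?<=…) is lookbehind: match only if preceded by the pattern. (?=…) is lookahead: match only if followed by the pattern. Neither one is included in the -o output.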
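As a sketch, here is a lookbehind, written (?<=...), next to \K, another PCRE feature that often gives the same result. \K is not covered elsewhere in this workout. The sample lines are taken from the training file, and the check at the top is there because grep -P is not available in every grep build.

```shell
sample=$(printf 'username: admin\nemail: admin@example.com\nusername: root\n')

# Only run the drills if this grep was built with PCRE support.
if printf 'x\n' | grep -qP 'x' 2>/dev/null; then
    # Lookbehind: require "username: " before the match without consuming it.
    via_lookbehind=$(printf '%s\n' "$sample" | grep -Po '(?<=username: )\w+')

    # \K: match "username: " normally, then drop it from the reported match.
    via_keep=$(printf '%s\n' "$sample" | grep -Po 'username: \K\w+')

    printf '%s\n' "$via_lookbehind"   # admin, root (one per line)
else
    echo "grep -P is not available on this system" >&2
fi
```

The email line is skipped in both cases because "admin" there is not preceded by "username: ".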

Workout 7: Real Infrastructure Patterns

Now apply what you learned:

Extract All IP Addresses (Shape Only)

grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' /tmp/regex-training.txt

Expected:

192.168.1.100
10.50.1.20
10.50.1.132
172.16.0.1
10.50.1.20
192.168.1.1
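The pattern above accepts any one-to-three-digit group, so a string such as 999.999.999.999 would also pass. One possible way to add range checking, sketched here with made-up candidates, is to filter the extracted strings through a stricter whole-line pattern. The variable name octet and the -x flag (match the entire line) are choices made for this sketch.

```shell
# One octet, 0-255, without leading zeros
octet='(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])'

# Hypothetical candidates, one per line, as grep -Eo would emit them
candidates=$(printf '192.168.1.100\n256.1.1.1\n10.50.1.20\n999.999.999.999\n')

# -x requires the pattern to match the entire line, so partial matches
# such as "56.1.1.1" inside "256.1.1.1" are rejected.
valid=$(printf '%s\n' "$candidates" | grep -Ex "${octet}(\.${octet}){3}")

echo "$valid"   # 192.168.1.100 and 10.50.1.20
```

Keeping extraction and validation as two steps keeps each pattern readable.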

Extract MAC Addresses (both formats)

grep -Eio '[A-F0-9]{2}[:-][A-F0-9]{2}[:-][A-F0-9]{2}[:-][A-F0-9]{2}[:-][A-F0-9]{2}[:-][A-F0-9]{2}' /tmp/regex-training.txt

Expected:

AA:BB:CC:DD:EE:FF
14:f6:d8:7b:31:80
98-BB-1E-1F-A7-13
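Once both formats are extracted, a common follow-up is to normalize them to one format. The sketch below is one way to do that, assuming lowercase with colons as the target. It also writes the pattern more compactly with a repeated group, which matches the same strings as the long form above.

```shell
macs=$(printf 'MAC: AA:BB:CC:DD:EE:FF\nMAC: 98-BB-1E-1F-A7-13\n' \
    | grep -Eio '([A-F0-9]{2}[:-]){5}[A-F0-9]{2}' \
    | tr 'A-F' 'a-f' \
    | tr '-' ':')

echo "$macs"
# aa:bb:cc:dd:ee:ff
# 98:bb:1e:1f:a7:13
```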

Extract Email Addresses

grep -Eo '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' /tmp/regex-training.txt

Expected:

evan@domusdigitalis.dev
admin@example.com
user.name+tag@domain.co.uk

Extract Config Key-Value Pairs

grep -E '^[a-z_]+=.+$' /tmp/regex-training.txt

Expected:

server=192.168.1.1
port=8080
enabled=true
disabled=false
timeout=30
max_connections=100
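Matching the lines is usually only the first half of the job. As a sketch, the filtered lines can be split into key and value with a plain shell read loop. The sample input and the "->" output format are made up for illustration.

```shell
# Hypothetical input: two config lines plus two lines that should be ignored
config=$(printf '# comment\nserver=192.168.1.1\nport=8080\nnot a config line\n')

parsed=$(printf '%s\n' "$config" \
    | grep -E '^[a-z_]+=.+$' \
    | while IFS='=' read -r key value; do
          # IFS='=' splits at the first "="; the rest of the line lands in value
          printf '%s -> %s\n' "$key" "$value"
      done)

echo "$parsed"
# server -> 192.168.1.1
# port -> 8080
```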

Daily Drill Template

Use this pattern for daily practice:

# 1. Pick a pattern type from curriculum
# 2. Write the pattern
# 3. Test against training file
# 4. Check output matches expectation

# Example: "I want to practice quantifiers"
grep -Eo '[0-9]{2,4}' /tmp/regex-training.txt

# Did it work? Refine until it does.
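The "check output matches expectation" step can be automated. The helper below is a hypothetical sketch, not part of any tool: drill is a made-up function name that compares the number of matching lines against a count you predicted before running the pattern.

```shell
# drill EXPECTED_COUNT PATTERN FILE  (hypothetical helper for self-checking)
drill() {
    # Count lines matching the ERE; "|| true" keeps a zero count from failing.
    actual=$(grep -Ec -e "$2" "$3" || true)
    if [ "$actual" -eq "$1" ]; then
        echo "PASS  $2"
    else
        echo "FAIL  $2 (expected $1, got $actual)"
        return 1
    fi
}

# Demo on a small temporary file so the sketch is self-contained
tmp=$(mktemp)
printf 'VLAN 10 - Data\nVLAN 100 - Voice\nport=8080\n' > "$tmp"

result_ok=$(drill 2 '^VLAN' "$tmp")
result_bad=$(drill 1 '[0-9]{2,4}' "$tmp" || true)
rm -f "$tmp"

echo "$result_ok"    # PASS  ^VLAN
echo "$result_bad"   # FAIL  [0-9]{2,4} (expected 1, got 3)
```

Writing the expected count down first forces a prediction, which is where most of the learning happens.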

Quick Reference: grep Flags

Flag  Purpose
----  ---------------------------------------------------
-E    Extended regex (ERE) - use +, ?, | without escaping
-P    Perl regex (PCRE) - use \d, \w, lookaround
-o    Print only the matched part, not whole line
-i    Case insensitive
-v    Invert match (lines that DON’T match)
-n    Show line numbers
-c    Count matching lines instead of showing them
-l    Show only filenames that contain matches
-r    Recursive search in directories
-A N  Show N lines After match
-B N  Show N lines Before match
-C N  Show N lines Context (before + after)
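A few of these flags are easy to mix up, in particular -c, which counts matching lines rather than individual matches. A short sketch on a made-up three-line sample:

```shell
# Hypothetical sample: the first line contains "error" twice
sample=$(printf 'error error\nok\nerror\n')

# -c counts matching LINES.
matching_lines=$(printf '%s\n' "$sample" | grep -c 'error')

# To count individual matches, print each with -o and count the output lines.
total_matches=$(printf '%s\n' "$sample" | grep -o 'error' | wc -l | tr -d ' ')

# -v inverts the match; combined with -c it counts lines WITHOUT "error".
non_matching=$(printf '%s\n' "$sample" | grep -vc 'error')

# -n prefixes each matching line with its line number.
numbered=$(printf '%s\n' "$sample" | grep -n 'ok')

echo "lines=$matching_lines matches=$total_matches other=$non_matching"   # 2 3 1
echo "$numbered"                                                          # 2:ok
```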

Progression

  1. Complete workouts 1-5 using grep -E

  2. When comfortable, do workout 6 with grep -P

  3. Practice workout 7 patterns on real logs

  4. Build your own patterns for infrastructure files

Next Steps