Regex Self-Test Challenges

Work through these challenges honestly. Type your pattern FIRST, then expand the answer to check. Peeking early only cheats you.

The answers are hidden in collapsible sections. Try each challenge before revealing.

Setup Test Data

Run this ONCE per session to create your test file:

cat << 'EOF' > /tmp/challenges.txt
The quick brown fox jumps over the lazy dog.
IP: 192.168.1.100
IP: 10.50.1.20
IP: 172.16.0.1
MAC: AA:BB:CC:DD:EE:FF
MAC: 14:F6:D8:7B:31:80
MAC: 98-BB-1E-1F-A7-13
Price: $99.99
Total: $1,234.56
Path: C:\Users\Admin\Documents
Path: /home/evan/atelier
URL: https://api.example.com/v1/users?id=123
URL: http://192.168.1.1:8080/login
[ERROR] Connection refused to 10.50.1.50:389
[WARN] Certificate expires in 30 days
[INFO] Server started on port 443
[DEBUG] Request from 172.16.0.50
error: authentication failed for user 'admin'
ERROR: LDAPS connection timeout
Port 22 SSH
Port 443 HTTPS
Port 3389 RDP
VLAN 10 - Data
VLAN 20 - Voice
VLAN 99 - Management
user=evan domain=inside.domusdigitalis.dev
user=admin domain=CORP
Session: abc123-def456-789xyz
JWT: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0.dozjgNryP4J3jVmNHl0w5N_XgL0n3I9PlFUP0THsR8U
framed_ip_address=10.50.10.100
calling_station_id=14:F6:D8:7B:31:80
endpoint_policy=Compliant
192.168.1.100 - evan [14/Mar/2026:10:23:45 +0000] "GET /api HTTP/1.1" 200 1234
10.50.1.20 - - [14/Mar/2026:10:23:46 +0000] "POST /login HTTP/1.1" 401 89
duplicate duplicate word word
the the error
filename.txt
config.ini
script.sh
backup.tar.gz
EOF
echo "Test data created: /tmp/challenges.txt"

Level 1: Fundamentals

Challenge 1.1: Match Literal Text

Goal: Find lines containing the word "fox"

Answer
grep 'fox' /tmp/challenges.txt

Output: The quick brown fox jumps over the lazy dog.


Challenge 1.2: Case Insensitive

Goal: Find ALL lines with "error" (any case: ERROR, error, Error)

Answer
grep -i 'error' /tmp/challenges.txt

Output:

[ERROR] Connection refused to 10.50.1.50:389
error: authentication failed for user 'admin'
ERROR: LDAPS connection timeout
the the error

Challenge 1.3: Escape the Dot

Goal: Match ONLY 192.168.1.100 (not lines where . matches any char)

Answer
grep '192\.168\.1\.100' /tmp/challenges.txt

The \. matches a literal period, not "any character"
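A quick way to see the difference, as a sketch: the two-line sample below is made up for illustration (the look-alike string is not in the test file). The unescaped pattern matches both lines, while the escaped one matches only the real address.

```shell
# Hypothetical sample: one real address and one look-alike string.
sample=$(printf '%s\n' '192.168.1.100' '192x168y1z100')

# Unescaped dots match any character, so both lines count.
loose=$(printf '%s\n' "$sample" | grep -c '192.168.1.100')

# Escaped dots match only a literal period.
strict=$(printf '%s\n' "$sample" | grep -c '192\.168\.1\.100')

echo "loose=$loose strict=$strict"   # loose=2 strict=1
```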


Challenge 1.4: Match Brackets

Goal: Find [ERROR] including the brackets

Answer
grep '\[ERROR\]' /tmp/challenges.txt

Brackets are metacharacters (character class) - escape them with \


Challenge 1.5: Match Dollar Sign

Goal: Find the line with $99.99

Answer
grep '\$99\.99' /tmp/challenges.txt

$ means end-of-line in regex - escape it for literal $


Level 2: Character Classes

Challenge 2.1: Any Digit

Goal: Find lines containing any digit 0-9

Answer
grep '[0-9]' /tmp/challenges.txt

[0-9] matches any single digit


Challenge 2.2: Only IPv4 Looking Lines

Goal: Find lines that look like IP addresses (digits and dots pattern)

Answer
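grep '[0-9]\{1,\}\.[0-9]\{1,\}\.[0-9]\{1,\}\.[0-9]\{1,\}' /tmp/challenges.txt

In BRE, * means "zero or more", so [0-9]* would also accept empty octets (a bare "..." would match). \{1,\} requires at least one digit.

Or with ERE, where + means "one or more":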

grep -E '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' /tmp/challenges.txt
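The choice of quantifier matters here. As a small sketch, the sample below pairs one line from the test data with a made-up line ("Loading...") that contains three dots and no digits. A pattern built on `*` accepts both; the `+` version accepts only the address.

```shell
# One real line from the test data and one hypothetical line with bare dots.
sample=$(printf '%s\n' 'IP: 10.50.1.20' 'Loading...')

# "*" allows zero digits between the dots, so "..." also matches.
star=$(printf '%s\n' "$sample" | grep -c '[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*')

# "+" requires at least one digit in every octet.
plus=$(printf '%s\n' "$sample" | grep -cE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+')

echo "star=$star plus=$plus"   # star=2 plus=1
```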

Challenge 2.3: MAC Address Pattern (Colon)

Goal: Find MAC addresses in XX:XX:XX:XX:XX:XX format

Answer
grep -E '[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}' /tmp/challenges.txt
  • [0-9A-Fa-f]{2} = two hex chars

  • (:[0-9A-Fa-f]{2}){5} = colon + two hex, repeated 5 times


Challenge 2.4: MAC Address Pattern (Hyphen)

Goal: Find MAC addresses in XX-XX-XX-XX-XX-XX format (Windows style)

Answer
grep -E '[0-9A-Fa-f]{2}(-[0-9A-Fa-f]{2}){5}' /tmp/challenges.txt

Same pattern but with - instead of :
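If both styles should be caught in one pass, a bracket expression can stand in for the separator. This is a sketch, not part of the original answers; note that it would also accept a MAC that mixes colons and hyphens.

```shell
# Three MAC lines taken from the test data, plus one non-MAC line.
sample=$(printf '%s\n' \
  'MAC: AA:BB:CC:DD:EE:FF' \
  'MAC: 14:F6:D8:7B:31:80' \
  'MAC: 98-BB-1E-1F-A7-13' \
  'Port 22 SSH')

# [-:] accepts either separator between the hex pairs.
macs=$(printf '%s\n' "$sample" | grep -oE '[0-9A-Fa-f]{2}([-:][0-9A-Fa-f]{2}){5}')
count=$(printf '%s\n' "$macs" | grep -c .)

echo "$macs"
echo "count=$count"   # count=3
```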


Challenge 2.5: Uppercase Only

Goal: Find words that are ALL UPPERCASE (like ERROR, LDAPS)

Answer
grep -oE '\b[A-Z]{2,}\b' /tmp/challenges.txt
  • \b = word boundary

  • [A-Z]{2,} = 2+ uppercase letters

  • -o = only matching part


Level 3: Quantifiers

Challenge 3.1: Port Numbers

Goal: Extract port numbers (1-5 digit numbers)

Answer
grep -oE 'Port [0-9]{1,5}' /tmp/challenges.txt

Or for JUST the number:

grep -oP '(?<=Port )[0-9]{1,5}' /tmp/challenges.txt

Challenge 3.2: VLAN IDs

Goal: Extract VLAN numbers (VLAN followed by space and digits)

Answer
grep -oE 'VLAN [0-9]+' /tmp/challenges.txt

Or just the number:

grep -oP 'VLAN \K[0-9]+' /tmp/challenges.txt

\K resets the match start (PCRE)
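grep -P is not available everywhere (the grep shipped with macOS and the BSDs lacks it). Where that is the case, a capture group in sed can do the same job. The following is a sketch of that fallback, not the original answer:

```shell
# Two VLAN lines from the test data.
sample=$(printf '%s\n' 'VLAN 10 - Data' 'VLAN 20 - Voice')

# Capture the digits after "VLAN " and print only the capture group.
ids=$(printf '%s\n' "$sample" | sed -nE 's/^VLAN ([0-9]+).*/\1/p' | paste -sd ' ' -)

echo "$ids"   # 10 20
```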


Challenge 3.3: Optional Character

Goal: Match both "color" and "colour" (imagine they’re in the file)

Answer
grep -E 'colou?r' file

u? means "zero or one u"


Challenge 3.4: One or More

Goal: Match log levels: [ERROR], [WARN], [INFO], [DEBUG]

Answer
grep -E '\[[A-Z]+\]' /tmp/challenges.txt

[A-Z]+ = one or more uppercase letters


Challenge 3.5: Exactly N Times

Goal: Find hex strings that are exactly 6 characters long (like abc123 in the session ID)

Answer
grep -oE '\b[0-9A-Fa-f]{6}\b' /tmp/challenges.txt

{6} = exactly 6 occurrences


Level 4: Anchors

Challenge 4.1: Start of Line

Goal: Find lines that START with "IP:"

Answer
grep '^IP:' /tmp/challenges.txt

^ anchors to start of line


Challenge 4.2: End of Line

Goal: Find lines ending with a port number (like :389 or :443)

Answer
grep -E ':[0-9]+$' /tmp/challenges.txt

$ anchors to end of line


Challenge 4.3: Word Boundary

Goal: Find "admin" as a complete word only (not "administrator")

Answer
grep -P '\badmin\b' /tmp/challenges.txt

\b is a word boundary. GNU grep accepts it in BRE and ERE patterns as well as with -P.

Alternative using the -w (whole-word) option:

grep -wE 'admin' /tmp/challenges.txt
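The test file has no "administrator" line, so the effect is easier to see on a small sample. In this sketch the second line is made up for illustration:

```shell
# First line is from the test data; the second is hypothetical.
sample=$(printf '%s\n' \
  "error: authentication failed for user 'admin'" \
  'administrator logged in')

# Plain substring match: "admin" inside "administrator" counts too.
plain=$(printf '%s\n' "$sample" | grep -c 'admin')

# Whole-word match: only the standalone word counts.
whole=$(printf '%s\n' "$sample" | grep -cw 'admin')

echo "plain=$plain whole=$whole"   # plain=2 whole=1
```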

Challenge 4.4: Empty Lines

Goal: Count empty lines in a file

Answer
grep -c '^$' /tmp/challenges.txt

^$ = start immediately followed by end = empty line
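Lines that contain only spaces or tabs are not empty as far as ^$ is concerned. A minimal sketch on a made-up four-line sample, using the POSIX class [[:space:]] to count those as well:

```shell
# Hypothetical sample: one truly empty line and one line of three spaces.
sample=$(printf '%s\n' 'first' '' '   ' 'last')

# Strictly empty lines only.
empty=$(printf '%s\n' "$sample" | grep -c '^$')

# Empty or whitespace-only lines.
blank=$(printf '%s\n' "$sample" | grep -c '^[[:space:]]*$')

echo "empty=$empty blank=$blank"   # empty=1 blank=2
```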


Challenge 4.5: Full Line Match

Goal: Match ONLY the line that is exactly duplicate duplicate word word

Answer
grep '^duplicate duplicate word word$' /tmp/challenges.txt

^...$ anchors both ends for an exact match


Level 5: Groups and Alternation

Challenge 5.1: Either/Or

Goal: Find lines with either "ERROR" or "WARN"

Answer
grep -E '(ERROR|WARN)' /tmp/challenges.txt

(A|B) means A or B


Challenge 5.2: HTTP Methods

Goal: Match GET or POST in the log lines

Answer
grep -E '"(GET|POST)' /tmp/challenges.txt

Challenge 5.3: HTTP Status Categories

Goal: Find 4xx OR 5xx HTTP status codes

Answer
grep -E '" [45][0-9]{2} ' /tmp/challenges.txt

  • [45] = 4 or 5

  • [0-9]{2} = any two digits


Challenge 5.4: File Extensions

Goal: Match files ending in .txt, .ini, or .sh

Answer
grep -E '\.(txt|ini|sh)$' /tmp/challenges.txt

Group the extensions, anchor to end


Challenge 5.5: Repeated Group

Goal: Find the duplicate words pattern like "duplicate duplicate"

Answer
grep -E '\b(\w+)\s+\1\b' /tmp/challenges.txt
  • (\w+) captures a word in group 1

  • \s+ one or more spaces

  • \1 backreference to group 1 (same word again)
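Adding -o shows which word pairs triggered the match. This sketch assumes GNU grep, since \w, \s, \b and backreferences under -E are GNU extensions; the third sample line is made up as a negative case.

```shell
# Two lines from the test data plus one hypothetical line without repeats.
sample=$(printf '%s\n' \
  'duplicate duplicate word word' \
  'the the error' \
  'no repeats here')

# -o prints each repeated pair on its own line.
pairs=$(printf '%s\n' "$sample" | grep -oE '\b(\w+)\s+\1\b')
count=$(printf '%s\n' "$pairs" | grep -c .)

echo "$pairs"
echo "count=$count"   # count=3
```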


Level 6: Lookahead and Lookbehind (PCRE)

Challenge 6.1: Positive Lookahead

Goal: Find "IP" only if followed by a colon

Answer
grep -oP 'IP(?=:)' /tmp/challenges.txt

(?=:) = followed by : (but don’t include it in match)


Challenge 6.2: Extract Value After Key

Goal: Extract the IP address from framed_ip_address=10.50.10.100

Answer
grep -oP '(?<=framed_ip_address=)[0-9.]+' /tmp/challenges.txt

(?<=...) = preceded by (lookbehind)


Challenge 6.3: Extract Port After Colon

Goal: Extract port number from 10.50.1.50:389

Answer
grep -oP '(?<=:)[0-9]+(?![0-9.])' /tmp/challenges.txt

Or simpler:

grep -oP ':\K[0-9]+' /tmp/challenges.txt

\K resets match start position
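Run against the whole test file, a bare colon followed by digits also picks up MAC octets and timestamp fields. One way to narrow it, sketched here with plain -E and cut rather than PCRE, is to require a dotted quad in front of the colon:

```shell
# Four lines from the test data: only two contain an IP:port pair.
sample=$(printf '%s\n' \
  'MAC: 14:F6:D8:7B:31:80' \
  '[ERROR] Connection refused to 10.50.1.50:389' \
  'URL: http://192.168.1.1:8080/login' \
  '10.50.1.20 - - [14/Mar/2026:10:23:46 +0000] "POST /login HTTP/1.1" 401 89')

# Match "a.b.c.d:port", then keep only the part after the colon.
ports=$(printf '%s\n' "$sample" \
  | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]+' \
  | cut -d: -f2 \
  | paste -sd ' ' -)

echo "$ports"   # 389 8080
```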


Challenge 6.4: Negative Lookahead

Goal: Find "domain=" NOT followed by "CORP"

Answer
grep -P 'domain=(?!CORP)' /tmp/challenges.txt

(?!CORP) = NOT followed by CORP


Challenge 6.5: Negative Lookbehind

Goal: Find usernames that are NOT preceded by "non-" (imagine the data has it)

Answer
grep -P '(?<!non-)admin' /tmp/challenges.txt

(?<!...) = NOT preceded by


Level 7: Advanced Extraction

Challenge 7.1: Extract Username from Log

Goal: From 192.168.1.100 - evan [14/Mar/…​, extract just "evan"

Answer
grep -oP '^\S+\s+-\s+\K\w+(?=\s+\[)' /tmp/challenges.txt

Or with awk, limited to the access-log lines (second field is "-"):

awk '$2 == "-" && $3 != "-" {print $3}' /tmp/challenges.txt

Challenge 7.2: Extract Domain from user= line

Goal: From user=evan domain=inside.domusdigitalis.dev, extract the domain

Answer
grep -oP '(?<=domain=)\S+' /tmp/challenges.txt

Challenge 7.3: JWT Token Detection

Goal: Find JWT tokens (three base64 parts separated by dots)

Answer
grep -oP 'eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+' /tmp/challenges.txt

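The header and payload are base64url-encoded JSON objects, so each typically starts with eyJ (the encoding of {" plus the start of the first key)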


Challenge 7.4: HTTP Path Extraction

Goal: Extract the path (like /api, /login) from HTTP log lines

Answer
grep -oP '"\K(GET|POST)\s+\K/\S+(?=\s+HTTP)' /tmp/challenges.txt

Or more simply, with a single \K:

grep -oP '"(GET|POST) \K[^ ]+' /tmp/challenges.txt

Challenge 7.5: All Unique IP Addresses

Goal: Extract ALL unique IPv4 addresses from the file

Answer
grep -oP '\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b' /tmp/challenges.txt | sort -u

Level 8: sed Transformations

Challenge 8.1: Replace Text

Goal: Replace all "ERROR" with "CRITICAL"

Answer
sed 's/ERROR/CRITICAL/g' /tmp/challenges.txt

g = global (all occurrences on each line)
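The effect of the flag only shows on a line with more than one occurrence, which the test file does not have. A small sketch on a made-up line:

```shell
# Hypothetical line with two occurrences of ERROR.
line='ERROR: retry after ERROR'

# Without g, only the first occurrence on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/ERROR/CRITICAL/')

# With g, every occurrence on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's/ERROR/CRITICAL/g')

echo "$first_only"   # CRITICAL: retry after ERROR
echo "$all"          # CRITICAL: retry after CRITICAL
```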


Challenge 8.2: Delete Lines

Goal: Delete all lines containing "DEBUG"

Answer
sed '/DEBUG/d' /tmp/challenges.txt

Challenge 8.3: Extract with sed

Goal: Extract just the IP from "IP: 192.168.1.100"

Answer
sed -n 's/^IP: \(.*\)/\1/p' /tmp/challenges.txt

Or with ERE:

sed -nE 's/^IP: (.*)/\1/p' /tmp/challenges.txt

Challenge 8.4: Swap Fields

Goal: Transform "user=evan" to "evan=user"

Answer
echo "user=evan" | sed -E 's/(\w+)=(\w+)/\2=\1/'

Capture groups \1 and \2, then swap in replacement
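\w is a GNU extension in sed. Where portability matters, the POSIX class [[:alnum:]_] covers the same characters. The sketch below uses it and adds g so that every key=value pair on the line is swapped:

```shell
# Swap both sides of each key=value pair on a line from the test data.
swapped=$(printf '%s\n' 'user=admin domain=CORP' \
  | sed -E 's/([[:alnum:]_]+)=([[:alnum:]_]+)/\2=\1/g')

echo "$swapped"   # admin=user CORP=domain
```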


Challenge 8.5: Add Prefix

Goal: Add ">>> " prefix to lines containing ERROR

Answer
sed '/ERROR/s/^/>>> /' /tmp/challenges.txt

Address /ERROR/ then substitute at start ^


Level 9: awk with Regex

Challenge 9.1: Filter by Pattern

Goal: Print lines matching IP pattern

Answer
awk '/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/' /tmp/challenges.txt

Challenge 9.2: Field Matching

Goal: Print lines where first field starts with "10."

Answer
awk '$1 ~ /^10\./' /tmp/challenges.txt

$1 ~ /pattern/ = field 1 matches pattern


Challenge 9.3: Negative Match

Goal: Print lines where first field does NOT start with "192."

Answer
awk '$1 !~ /^192\./' /tmp/challenges.txt

!~ = does not match


Challenge 9.4: gsub Replacement

Goal: Replace all colons with hyphens in MAC addresses

Answer
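awk '/^MAC:/ {gsub(/:/, "-", $2); print}' /tmp/challenges.txt

gsub(/old/, "new", target) = global substitution in target. Passing $2 limits the change to the address; without a target, gsub works on the whole line and would also rewrite the colon in the "MAC:" label.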
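gsub takes an optional third argument that names what to modify. A short sketch comparing the default target (the whole line) with a single field, using one MAC line from the test data:

```shell
line='MAC: AA:BB:CC:DD:EE:FF'

# Default target is $0, so the colon after "MAC" is replaced as well.
whole=$(printf '%s\n' "$line" | awk '{gsub(/:/, "-"); print}')

# Target $2 restricts the substitution to the address field.
field=$(printf '%s\n' "$line" | awk '{gsub(/:/, "-", $2); print}')

echo "$whole"   # MAC- AA-BB-CC-DD-EE-FF
echo "$field"   # MAC: AA-BB-CC-DD-EE-FF
```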


Challenge 9.5: Extract with gensub

Goal: Extract the HTTP status code from log lines

Answer
awk '/HTTP\// {print gensub(/.*" ([0-9]{3}) .*/, "\\1", "g")}' /tmp/challenges.txt

gensub allows capture groups (GNU awk only). Matching HTTP/ rather than HTTP keeps the "Port 443 HTTPS" line out of the output.
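On systems where awk is not GNU awk, gensub is unavailable. Since the status code is the second-to-last field in these log lines, a field reference is enough. This is a sketch of that alternative, with "Port 443 HTTPS" included to show it is skipped:

```shell
# Two log lines and one non-log line, all from the test data.
sample=$(printf '%s\n' \
  '192.168.1.100 - evan [14/Mar/2026:10:23:45 +0000] "GET /api HTTP/1.1" 200 1234' \
  '10.50.1.20 - - [14/Mar/2026:10:23:46 +0000] "POST /login HTTP/1.1" 401 89' \
  'Port 443 HTTPS')

# Select lines containing "HTTP/" and print the second-to-last field.
codes=$(printf '%s\n' "$sample" | awk '/HTTP\// {print $(NF-1)}' | paste -sd ' ' -)

echo "$codes"   # 200 401
```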


Level 10: Infrastructure Patterns

Challenge 10.1: Valid IPv4 (Strict)

Goal: Match ONLY valid IPv4 (0-255 in each octet, not 999.999.999.999)

Answer
grep -P '\b(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\b' /tmp/challenges.txt

This validates 0-255 per octet
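The repeated octet alternation can be kept in a shell variable to make the pattern easier to read. In this sketch the variable name octet and the two invalid sample addresses are introduced for illustration, and \b under -E assumes GNU grep:

```shell
# One octet in the range 0-255.
octet='(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])'

# One valid address from the test data and two hypothetical invalid ones.
sample=$(printf '%s\n' '192.168.1.100' '999.999.999.999' '256.1.1.1')

# First octet, then three more octets each preceded by a literal dot.
valid=$(printf '%s\n' "$sample" | grep -E "\b${octet}(\.${octet}){3}\b")

echo "$valid"   # 192.168.1.100
```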


Challenge 10.2: Private IP Ranges Only

Goal: Match only RFC1918 private IPs (10.x, 172.16-31.x, 192.168.x)

Answer
grep -E '\b(10\.[0-9.]+|172\.(1[6-9]|2[0-9]|3[01])\.[0-9.]+|192\.168\.[0-9.]+)\b' /tmp/challenges.txt

Challenge 10.3: CIDR Notation

Goal: Match IP with CIDR (like 10.0.0.0/8, 192.168.1.0/24)

Answer
grep -E '\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/[0-9]{1,2}\b' file

Challenge 10.4: Certificate Days Remaining

Goal: Extract the number from "expires in 30 days"

Answer
grep -oP 'expires in \K[0-9]+(?= days)' /tmp/challenges.txt

Challenge 10.5: Session ID Format

Goal: Match session IDs like abc123-def456-789xyz

Answer
grep -oE '[a-z0-9]+-[a-z0-9]+-[a-z0-9]+' /tmp/challenges.txt

Bonus: Real-World Security Patterns

Challenge: AWS Key Detection

Goal: Find AWS access keys (start with AKIA, followed by 16 uppercase alphanumerics)

Answer
grep -oP 'AKIA[0-9A-Z]{16}' file

Challenge: Password in URL

Goal: Find URLs with embedded passwords like ://user:password@host

Answer
grep -P '://[^:]+:[^@]+@' file
This pattern finds credentials in URLs - use for security audits

Challenge: Private Key Header

Goal: Find private key file markers

Answer
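grep -E -e '-----BEGIN.*PRIVATE KEY-----' file

The -e is required because the pattern starts with a dash; without it grep parses the pattern as an option and exits with an error.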

Self-Assessment

After completing these challenges, rate yourself:

Level          Criteria
Beginner       Completed Levels 1-3 without looking at answers
Intermediate   Completed Levels 4-6 without looking at answers
Advanced       Completed Levels 7-9 without looking at answers
Expert         Completed Level 10 + Bonus without looking at answers

If you had to peek at answers, redo those challenges tomorrow until you can do them cold.