Regex Self-Test Challenges
Work through these challenges honestly. Type your pattern FIRST, then expand the answer to check. No cheating - that only cheats yourself.
The answers are hidden in collapsible sections. Try each challenge before revealing.
Setup Test Data
Run this ONCE per session to create your test file:
cat << 'EOF' > /tmp/challenges.txt
The quick brown fox jumps over the lazy dog.
IP: 192.168.1.100
IP: 10.50.1.20
IP: 172.16.0.1
MAC: AA:BB:CC:DD:EE:FF
MAC: 14:F6:D8:7B:31:80
MAC: 98-BB-1E-1F-A7-13
Price: $99.99
Total: $1,234.56
Path: C:\Users\Admin\Documents
Path: /home/evan/atelier
URL: https://api.example.com/v1/users?id=123
URL: http://192.168.1.1:8080/login
[ERROR] Connection refused to 10.50.1.50:389
[WARN] Certificate expires in 30 days
[INFO] Server started on port 443
[DEBUG] Request from 172.16.0.50
error: authentication failed for user 'admin'
ERROR: LDAPS connection timeout
Port 22 SSH
Port 443 HTTPS
Port 3389 RDP
VLAN 10 - Data
VLAN 20 - Voice
VLAN 99 - Management
user=evan domain=inside.domusdigitalis.dev
user=admin domain=CORP
Session: abc123-def456-789xyz
JWT: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0.dozjgNryP4J3jVmNHl0w5N_XgL0n3I9PlFUP0THsR8U
framed_ip_address=10.50.10.100
calling_station_id=14:F6:D8:7B:31:80
endpoint_policy=Compliant
192.168.1.100 - evan [14/Mar/2026:10:23:45 +0000] "GET /api HTTP/1.1" 200 1234
10.50.1.20 - - [14/Mar/2026:10:23:46 +0000] "POST /login HTTP/1.1" 401 89
duplicate duplicate word word
the the error
filename.txt
config.ini
script.sh
backup.tar.gz
EOF
echo "Test data created: /tmp/challenges.txt"
Level 1: Fundamentals
Challenge 1.1: Match Literal Text
Goal: Find lines containing the word "fox"
Answer
grep 'fox' /tmp/challenges.txt
Output: The quick brown fox jumps over the lazy dog.
Challenge 1.2: Case Insensitive
Goal: Find ALL lines with "error" (any case: ERROR, error, Error)
Answer
grep -i 'error' /tmp/challenges.txt
Output:
[ERROR] Connection refused to 10.50.1.50:389
error: authentication failed for user 'admin'
ERROR: LDAPS connection timeout
the the error
Challenge 1.3: Escape the Dot
Goal: Match ONLY 192.168.1.100 (not lines where . matches any char)
Answer
grep '192\.168\.1\.100' /tmp/challenges.txt
The \. matches a literal period, not "any character"
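To see why the escape matters, here is a small sketch using two made-up lines (they are not in the test file). The second line has letters where the dots should be, and the unescaped pattern still accepts it.

```shell
# Hypothetical sample: a real address and a look-alike with letters in place of dots.
sample=$(printf '192.168.1.100\n192x168y1z100\n')

# Unescaped: each . matches any character, so both lines match.
loose=$(printf '%s\n' "$sample" | grep -c '192.168.1.100')

# Escaped: \. matches only a literal period, so only the real address matches.
strict=$(printf '%s\n' "$sample" | grep -c '192\.168\.1\.100')

echo "unescaped: $loose, escaped: $strict"
```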
Challenge 1.4: Match Brackets
Goal: Find [ERROR] including the brackets
Answer
grep '\[ERROR\]' /tmp/challenges.txt
Brackets are metacharacters (character class) - escape them with \
Challenge 1.5: Match Dollar Sign
Goal: Find the line with $99.99
Answer
grep '\$99\.99' /tmp/challenges.txt
$ means end-of-line in regex - escape it for literal $
Level 2: Character Classes
Challenge 2.1: Any Digit
Goal: Extract lines containing any digit 0-9
Answer
grep '[0-9]' /tmp/challenges.txt
[0-9] matches any single digit
Challenge 2.2: Only IPv4 Looking Lines
Goal: Find lines that look like IP addresses (digits and dots pattern)
Answer
grep '[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*' /tmp/challenges.txt
Or with ERE:
grep -E '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' /tmp/challenges.txt
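One thing to watch with this kind of pattern: a * quantifier lets each digit run be empty, whereas + requires at least one digit. The sketch below uses two made-up lines to show the difference; the sample text is an assumption, not part of the test file.

```shell
# Hypothetical sample: one dotted quad and one line that only contains bare dots.
sample=$(printf '1.2.3.4\nwait...\n')

# With *, every digit run may be empty, so "..." alone satisfies the pattern.
star=$(printf '%s\n' "$sample" | grep -cE '[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*')

# With +, each run needs at least one digit, so only the dotted quad matches.
plus=$(printf '%s\n' "$sample" | grep -cE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+')

echo "star: $star, plus: $plus"
```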
Challenge 2.3: MAC Address Pattern (Colon)
Goal: Find MAC addresses in XX:XX:XX:XX:XX:XX format
Answer
grep -E '[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}' /tmp/challenges.txt
- [0-9A-Fa-f]{2} = two hex chars
- (:[0-9A-Fa-f]{2}){5} = colon + two hex chars, repeated 5 times
Challenge 2.4: MAC Address Pattern (Hyphen)
Goal: Find MAC addresses in XX-XX-XX-XX-XX-XX format (Windows style)
Answer
grep -E '[0-9A-Fa-f]{2}(-[0-9A-Fa-f]{2}){5}' /tmp/challenges.txt
Same pattern but with - instead of :
Challenge 2.5: Uppercase Only
Goal: Find words that are ALL UPPERCASE (like ERROR, LDAPS)
Answer
grep -oE '\b[A-Z]{2,}\b' /tmp/challenges.txt
- \b = word boundary
- [A-Z]{2,} = 2+ uppercase letters
- -o = only matching part
Level 3: Quantifiers
Challenge 3.1: Port Numbers
Goal: Extract port numbers (1-5 digit numbers)
Answer
grep -oE 'Port [0-9]{1,5}' /tmp/challenges.txt
Or for JUST the number:
grep -oP '(?<=Port )[0-9]{1,5}' /tmp/challenges.txt
Challenge 3.2: VLAN IDs
Goal: Extract VLAN numbers (VLAN followed by space and digits)
Answer
grep -oE 'VLAN [0-9]+' /tmp/challenges.txt
Or just the number:
grep -oP 'VLAN \K[0-9]+' /tmp/challenges.txt
\K resets the match start (PCRE)
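As a rough comparison, \K and a lookbehind give the same result for a fixed prefix, but \K also copes with a prefix of varying width. This sketch assumes a grep built with PCRE support (-P); the line with extra spaces is made up for illustration.

```shell
line='VLAN 20 - Voice'

# Same result either way when the prefix has a fixed width.
with_k=$(printf '%s\n' "$line" | grep -oP 'VLAN \K[0-9]+')
with_lookbehind=$(printf '%s\n' "$line" | grep -oP '(?<=VLAN )[0-9]+')

# \K with a variable-width prefix (one or more spaces); hypothetical input.
spaced=$(printf 'VLAN    99 - Management\n' | grep -oP 'VLAN +\K[0-9]+')

echo "$with_k $with_lookbehind $spaced"
```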
Challenge 3.3: Optional Character
Goal: Match both "color" and "colour" (imagine they’re in the file)
Answer
grep -E 'colou?r' file
u? means "zero or one u"
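Since the test file has neither spelling, a quick way to try this is to pipe in a made-up word list. The anchors ^ and $ are added here only so that near-misses are rejected outright.

```shell
# Hypothetical word list: two valid spellings and two near-misses.
words=$(printf 'color\ncolour\ncolr\ncolouur\n')

# u? allows zero or one "u", so only the first two words match.
count=$(printf '%s\n' "$words" | grep -cE '^colou?r$')

echo "matching words: $count"
```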
Challenge 3.4: One or More
Goal: Match log levels: [ERROR], [WARN], [INFO], [DEBUG]
Answer
grep -E '\[[A-Z]+\]' /tmp/challenges.txt
[A-Z]+ = one or more uppercase letters
Challenge 3.5: Exactly N Times
Goal: Find hex strings that are exactly 6 characters long
Answer
grep -oE '\b[0-9A-Fa-f]{6}\b' /tmp/challenges.txt
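{6} = exactly 6 occurrences; the \b on each side keeps the match from being part of a longer word. In this file it returns abc123 and def456 from the session ID.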
Level 4: Anchors
Challenge 4.1: Start of Line
Goal: Find lines that START with "IP:"
Answer
grep '^IP:' /tmp/challenges.txt
^ anchors to start of line
Challenge 4.2: End of Line
Goal: Find lines ending with a port number (like :389 or :443)
Answer
grep -E ':[0-9]+$' /tmp/challenges.txt
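$ anchors to end of line. In this file the pattern also matches the MAC lines that end in :80, because a colon followed by digits closes those lines too.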
Challenge 4.3: Word Boundary
Goal: Find "admin" as a complete word only (not "administrator")
Answer
grep -P '\badmin\b' /tmp/challenges.txt
\b is a word boundary. GNU grep accepts it in basic, -E, and -P modes, but it is not part of POSIX.
Alternative with -w (whole-word match):
grep -wE 'admin' /tmp/challenges.txt
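The test file has no "administrator" line, so here is a sketch with made-up input that shows what the boundary excludes. It assumes GNU grep for \b.

```shell
# Hypothetical sample: "admin" as a word, as a prefix, and as a suffix.
sample=$(printf 'user admin logged in\nadministrator logged in\nsysadmin logged in\n')

substring=$(printf '%s\n' "$sample" | grep -c 'admin')        # plain substring search
bounded=$(printf '%s\n' "$sample" | grep -c '\badmin\b')      # word boundaries
whole_word=$(printf '%s\n' "$sample" | grep -cw 'admin')      # -w shortcut

echo "substring: $substring, bounded: $bounded, -w: $whole_word"
```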
Challenge 4.4: Empty Lines
Goal: Count empty lines in a file
Answer
grep -c '^$' /tmp/challenges.txt
^$ = start immediately followed by end = empty line
Challenge 4.5: Full Line Match
Goal: Match ONLY the line that is exactly duplicate duplicate word word
Answer
grep '^duplicate duplicate word word$' /tmp/challenges.txt
^…$ anchors both ends for exact match
Level 5: Groups and Alternation
Challenge 5.1: Either/Or
Goal: Find lines with either "ERROR" or "WARN"
Answer
grep -E '(ERROR|WARN)' /tmp/challenges.txt
(A|B) means A or B
Challenge 5.2: HTTP Methods
Goal: Match GET or POST in the log lines
Answer
grep -E '"(GET|POST)' /tmp/challenges.txt
Challenge 5.3: HTTP Status Categories
Goal: Find 4xx OR 5xx HTTP status codes
Answer
grep -E '" [45][0-9]{2} ' /tmp/challenges.txt
[45] = 4 or 5
[0-9]{2} = any two digits
Challenge 5.4: File Extensions
Goal: Match files ending in .txt, .ini, or .sh
Answer
grep -E '\.(txt|ini|sh)$' /tmp/challenges.txt
Group the extensions, anchor to end
Challenge 5.5: Repeated Group
Goal: Find the duplicate words pattern like "duplicate duplicate"
Answer
grep -E '\b(\w+)\s+\1\b' /tmp/challenges.txt
- (\w+) captures a word in group 1
- \s+ = one or more whitespace characters
- \1 = backreference to group 1 (same word again)
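Adding -o prints only the repeated pair instead of the whole line, which makes it easier to see what the backreference caught. A minimal sketch, assuming GNU grep (for \b, \w, \s, and backreferences under -E) and made-up input:

```shell
# Hypothetical sample: two lines with a doubled word, one without.
sample=$(printf 'the the error\nno repeats here\nis is not\n')

# Print only the doubled words.
dups=$(printf '%s\n' "$sample" | grep -oE '\b(\w+)\s+\1\b')

printf '%s\n' "$dups"
```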
Level 6: Lookahead and Lookbehind (PCRE)
Challenge 6.1: Positive Lookahead
Goal: Find "IP" only if followed by a colon
Answer
grep -oP 'IP(?=:)' /tmp/challenges.txt
(?=:) = followed by : (but don’t include it in match)
Challenge 6.2: Extract Value After Key
Goal: Extract the IP address from framed_ip_address=10.50.10.100
Answer
grep -oP '(?<=framed_ip_address=)[0-9.]+' /tmp/challenges.txt
(?<=…) = preceded by (lookbehind)
Challenge 6.3: Extract Port After Colon
Goal: Extract port number from 10.50.1.50:389
Answer
grep -oP '(?<=:)[0-9]+(?![0-9.])' /tmp/challenges.txt
Or simpler:
grep -oP ':\K[0-9]+' /tmp/challenges.txt
\K resets match start position
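Both patterns accept any digits that follow a colon, so on the full test file they also return pieces of MAC addresses and timestamps. One possible tightening, shown here as a sketch rather than the only answer, is to require an IPv4-looking prefix before the colon. The three sample lines are copied from the test data.

```shell
sample=$(printf '%s\n' \
  '[ERROR] Connection refused to 10.50.1.50:389' \
  'MAC: 14:F6:D8:7B:31:80' \
  'URL: http://192.168.1.1:8080/login')

# Loose: any digits after a colon, including parts of the MAC address.
printf '%s\n' "$sample" | grep -oP ':\K[0-9]+'

# Tighter: the colon must follow four dot-separated digit runs.
ports=$(printf '%s\n' "$sample" | grep -oP '[0-9]+(\.[0-9]+){3}:\K[0-9]+' | paste -sd' ' -)

echo "ports after an IP: $ports"
```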
Challenge 6.4: Negative Lookahead
Goal: Find "domain=" NOT followed by "CORP"
Answer
grep -P 'domain=(?!CORP)' /tmp/challenges.txt
(?!CORP) = NOT followed by CORP
Challenge 6.5: Negative Lookbehind
Goal: Find usernames that are NOT preceded by "non-" (imagine the data has it)
Answer
grep -P '(?<!non-)admin' /tmp/challenges.txt
(?<!…) = NOT preceded by
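Because the test file has no "non-admin" entry, the lookbehind is easier to check against made-up input. A minimal sketch, assuming grep with PCRE support:

```shell
# Hypothetical sample: one plain admin and one non-admin.
sample=$(printf 'user=admin\nuser=non-admin\n')

# Only the line where "admin" is not directly preceded by "non-" is kept.
hits=$(printf '%s\n' "$sample" | grep -P '(?<!non-)admin')

echo "$hits"
```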
Level 7: Advanced Extraction
Challenge 7.1: Extract Username from Log
Goal: From 192.168.1.100 - evan [14/Mar/…, extract just "evan"
Answer
grep -oP '^\S+\s+-\s+\K\w+(?=\s+\[)' /tmp/challenges.txt
Or simpler with awk:
awk '$2 == "-" && $3 != "-" {print $3}' /tmp/challenges.txt
Challenge 7.2: Extract Domain from user= line
Goal: From user=evan domain=inside.domusdigitalis.dev, extract the domain
Answer
grep -oP '(?<=domain=)\S+' /tmp/challenges.txt
Challenge 7.3: JWT Token Detection
Goal: Find JWT tokens (three base64 parts separated by dots)
Answer
grep -oP 'eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+' /tmp/challenges.txt
The JWT header and payload each start with eyJ, which is how base64 encodes {" followed by a letter (the start of a JSON object)
Challenge 7.4: HTTP Path Extraction
Goal: Extract the path (like /api, /login) from HTTP log lines
Answer
grep -oP '"(GET|POST)\s+\K/\S+(?=\s+HTTP)' /tmp/challenges.txt
Or more simply:
grep -oP '"(GET|POST) \K[^ ]+' /tmp/challenges.txt
Challenge 7.5: All Unique IP Addresses
Goal: Extract ALL unique IPv4 addresses from the file
Answer
grep -oP '\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b' /tmp/challenges.txt | sort -u
Level 8: sed Transformations
Challenge 8.1: Replace Text
Goal: Replace all "ERROR" with "CRITICAL"
Answer
sed 's/ERROR/CRITICAL/g' /tmp/challenges.txt
g = global (all occurrences on each line)
Challenge 8.2: Delete Lines
Goal: Delete all lines containing "DEBUG"
Answer
sed '/DEBUG/d' /tmp/challenges.txt
Challenge 8.3: Extract with sed
Goal: Extract just the IP from "IP: 192.168.1.100"
Answer
sed -n 's/^IP: \(.*\)/\1/p' /tmp/challenges.txt
Or with ERE:
sed -nE 's/^IP: (.*)/\1/p' /tmp/challenges.txt
Challenge 8.4: Swap Fields
Goal: Transform "user=evan" to "evan=user"
Answer
echo "user=evan" | sed -E 's/(\w+)=(\w+)/\2=\1/'
Capture groups \1 and \2, then swap in replacement
Challenge 8.5: Add Prefix
Goal: Add ">>> " prefix to lines containing ERROR
Answer
sed '/ERROR/s/^/>>> /' /tmp/challenges.txt
Address /ERROR/ then substitute at start ^
Level 9: awk with Regex
Challenge 9.1: Filter by Pattern
Goal: Print lines matching IP pattern
Answer
awk '/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/' /tmp/challenges.txt
Challenge 9.2: Field Matching
Goal: Print lines where first field starts with "10."
Answer
awk '$1 ~ /^10\./' /tmp/challenges.txt
$1 ~ /pattern/ = field 1 matches pattern
Challenge 9.3: Negative Match
Goal: Print lines where first field does NOT start with "192."
Answer
awk '$1 !~ /^192\./' /tmp/challenges.txt
!~ = does not match
Challenge 9.4: gsub Replacement
Goal: Replace all colons with hyphens in MAC addresses
Answer
awk '/^MAC:/ {gsub(/:/, "-", $2); print}' /tmp/challenges.txt
gsub(/old/, "new", target) = global substitution in target; limiting it to $2 leaves the colon in the "MAC:" label intact
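When gsub is called without a third argument it works on the whole record, so the colon after the label is rewritten as well. The sketch below contrasts that with targeting only the second field, using one line from the test data.

```shell
line='MAC: 14:F6:D8:7B:31:80'

# No target: every colon in the record is replaced, including the one in "MAC:".
whole=$(printf '%s\n' "$line" | awk '{gsub(/:/, "-"); print}')

# Target $2: only the address field changes; awk rebuilds the record with single spaces.
field=$(printf '%s\n' "$line" | awk '{gsub(/:/, "-", $2); print}')

printf '%s\n%s\n' "$whole" "$field"
```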
Challenge 9.5: Extract with gensub
Goal: Extract the HTTP status code from log lines
Answer
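awk '/HTTP\// {print gensub(/.*" ([0-9]{3}) .*/, "\\1", "g")}' /tmp/challenges.txt
gensub allows capture groups (GNU awk). Matching on HTTP/ rather than HTTP skips the "Port 443 HTTPS" line, which gensub would otherwise print unchanged.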
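Where GNU awk is not available, one possible alternative is to split on the double quote and take the first word of what follows the request. This is a sketch that assumes the access-log layout used in the test data, where the status code comes right after the quoted request.

```shell
log='10.50.1.20 - - [14/Mar/2026:10:23:46 +0000] "POST /login HTTP/1.1" 401 89'

# With " as the field separator, $3 is everything after the quoted request: " 401 89".
# split() on a single space ignores the leading blank, so parts[1] is the status code.
status=$(printf '%s\n' "$log" | awk -F'"' '{split($3, parts, " "); print parts[1]}')

echo "$status"
```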
Level 10: Infrastructure Patterns
Challenge 10.1: Valid IPv4 (Strict)
Goal: Match ONLY valid IPv4 (0-255 in each octet, not 999.999.999.999)
Answer
grep -P '\b(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\b' /tmp/challenges.txt
This validates 0-255 per octet
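The octet alternation repeats four times, so it can be easier to read when stored once in a shell variable and reused. The sketch below does that with -E; the variable names and the sample addresses are illustrative, and \b assumes GNU grep.

```shell
# One octet: 250-255, 200-249, 100-199, or 0-99.
octet='(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])'

# Four octets separated by literal dots, bounded on both sides.
ipv4="\\b${octet}(\\.${octet}){3}\\b"

# Hypothetical sample: two valid addresses and two with out-of-range octets.
valid=$(printf '192.168.1.100\n999.999.999.999\n256.1.1.1\n10.0.0.1\n' | grep -E "$ipv4")

printf '%s\n' "$valid"
```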
Challenge 10.2: Private IP Ranges Only
Goal: Match only RFC1918 private IPs (10.x, 172.16-31.x, 192.168.x)
Answer
grep -E '\b(10\.[0-9.]+|172\.(1[6-9]|2[0-9]|3[01])\.[0-9.]+|192\.168\.[0-9.]+)\b' /tmp/challenges.txt
Challenge 10.3: CIDR Notation
Goal: Match IP with CIDR (like 10.0.0.0/8, 192.168.1.0/24)
Answer
grep -E '\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/[0-9]{1,2}\b' file
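The test file contains no CIDR blocks, so this sketch feeds in made-up lines. It also shows one way to limit the prefix length to 0-32, which the pattern above does not check.

```shell
# Hypothetical sample: two valid blocks, one impossible prefix, one bare address.
sample=$(printf '10.0.0.0/8\n192.168.1.0/24\n172.16.0.0/99\n10.50.1.20\n')

# Any one- or two-digit prefix is accepted, so /99 counts as a match.
loose=$(printf '%s\n' "$sample" | grep -cE '\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/[0-9]{1,2}\b')

# Prefix restricted to 0-32.
strict=$(printf '%s\n' "$sample" | grep -cE '\b[0-9]{1,3}(\.[0-9]{1,3}){3}/(3[0-2]|[12]?[0-9])\b')

echo "loose: $loose, strict: $strict"
```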
Challenge 10.4: Certificate Days Remaining
Goal: Extract the number from "expires in 30 days"
Answer
grep -oP 'expires in \K[0-9]+(?= days)' /tmp/challenges.txt
Challenge 10.5: Session ID Format
Goal: Match session IDs like abc123-def456-789xyz
Answer
grep -oE '[a-z0-9]+-[a-z0-9]+-[a-z0-9]+' /tmp/challenges.txt
Bonus: Real-World Security Patterns
Challenge: AWS Key Detection
Goal: Find AWS access keys (start with AKIA, followed by 16 uppercase alphanumerics)
Answer
grep -oP 'AKIA[0-9A-Z]{16}' file
Challenge: Password in URL
Goal: Find URLs with embedded passwords like ://user:password@host
Answer
grep -P '://[^:]+:[^@]+@' file
This pattern finds credentials in URLs - use for security audits
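The pattern is deliberately broad, and it can also fire on a URL that has a port plus an @ later in the query string. A possible tightening, sketched below, keeps the user and password parts from crossing a slash or whitespace. The credentials and host names are made up for illustration.

```shell
sample=$(printf '%s\n' \
  'https://deploy:s3cret@git.example.com/repo.git' \
  'http://192.168.1.1:8080/login?contact=ops@example.com' \
  'https://api.example.com/v1/users?id=123')

# Broad: the port-plus-query URL is reported as well.
loose=$(printf '%s\n' "$sample" | grep -cE '://[^:]+:[^@]+@')

# Tighter: neither part may contain /, @, or whitespace.
tight=$(printf '%s\n' "$sample" | grep -cE '://[^/:@[:space:]]+:[^/@[:space:]]+@')

echo "loose: $loose, tight: $tight"
```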
Challenge: Private Key Header
Goal: Find private key file markers
Answer
grep -E -- '-----BEGIN.*PRIVATE KEY-----' file
Self-Assessment
After completing these challenges, rate yourself:
| Level | Criteria |
|---|---|
| Beginner | Completed Levels 1-3 without looking at answers |
| Intermediate | Completed Levels 4-6 without looking at answers |
| Advanced | Completed Levels 7-9 without looking at answers |
| Expert | Completed Level 10 + Bonus without looking at answers |
If you had to peek at answers, redo those challenges tomorrow until you can do them cold.