Regex Session 02: Quantifiers, Flavors & System Practice
This session covers the full quantifier set, how regex flavors differ across tools (grep, Python, JavaScript), and how to practice on your own system rather than only on websites.
Regex Flavors: The Reality
Different tools use different regex "engines" with slightly different syntax:
| Flavor | Tools | Key Differences |
|---|---|---|
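| BRE (Basic) | `grep` (default) | `+`, `?`, `{}` and `()` need a backslash |
| ERE (Extended) | `grep -E` | No escaping needed for `+`, `?`, `{}`, `()` |
| PCRE (Perl) | `grep -P`, ripgrep (`rg`; add `-P` for lookaround) | Lookahead, lookbehind, `\d`-style shorthand |
| JavaScript | Browsers, Node.js, regexr.com | Similar to PCRE, some differences |
| Vim | Vim/Neovim | |
| Python | `re` module | PCRE-like, named groups |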
Practice Methods Comparison
| Method | Pros | Cons |
|---|---|---|
| regexr.com | Visual, instant feedback, explains patterns | JavaScript only, not your actual tools |
| regex101.com | Multi-flavor (PCRE, JS, Python, Go), debugger | Still a website |
| grep/ripgrep | Real tool you’ll use, fast | No visual highlighting |
| Python REPL | Interactive, PCRE-like engine, scriptable | More typing |
Recommendation: Learn on regexr.com for visualization, then IMMEDIATELY practice with grep/ripgrep on real files.
Test File Setup
Create a test file with infrastructure data:
cat << 'EOF' > /tmp/regex-practice.txt
# Network Infrastructure Log
2026-03-15T10:30:45 INFO Server started on 192.168.1.100:443
2026-03-15T10:30:46 INFO VLAN 100 configured on Gi1/0/24
2026-03-15T10:31:00 WARN Connection slow to 10.50.1.20
2026-03-15T10:31:15 ERROR Connection refused from 10.50.1.50:8080
2026-03-15T10:32:00 INFO User evanusmodestus authenticated via 802.1X
2026-03-15T10:32:01 INFO MAC AA:BB:CC:DD:EE:FF assigned to VLAN 100
2026-03-15T10:33:00 ERROR Authentication failed for user admin
2026-03-15T10:33:30 WARN Certificate expires in 30 days
2026-03-15T10:34:00 INFO Backup completed: 1.2GB transferred
2026-03-15T10:35:00 DEBUG Query took 145ms for endpoint /api/v1/users
IP Range: 10.50.1.0/24
Gateway: 10.50.1.1
DNS: 10.50.1.90, 10.50.1.91
Ports: 22, 80, 443, 8080, 8443
MAC Table:
00:1A:2B:3C:4D:5E -> VLAN 10
AA:BB:CC:DD:EE:FF -> VLAN 100
11:22:33:44:55:66 -> VLAN 200
EOF
The Complete Quantifier Set
| Quantifier | Meaning | Example | Matches |
|---|---|---|---|
| `*` | 0 or more | `ab*c` | "ac", "abc", "abbc", "abbbc" |
| `+` | 1 or more | `ab+c` | "abc", "abbc", "abbbc" (NOT "ac") |
| `?` | 0 or 1 | `colou?r` | "color", "colour" |
| `{n}` | Exactly n | `\d{3}` | "192", "168", "100" (exactly 3 digits) |
| `{n,}` | n or more | `\d{2,}` | 2+ digit numbers |
| `{n,m}` | Between n and m | `\d{1,3}` | 1-3 digit numbers |
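Each quantifier can be checked quickly in Python, whose re module accepts all six forms unescaped. The sketch below is only an illustration: the sample strings mirror the Matches column, and the helper name matches is made up for this example.

```python
import re

def matches(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

samples = ["ac", "abc", "abbc", "abbbc"]
star = matches(r"ab*c", samples)   # * allows zero b's, so "ac" is included
plus = matches(r"ab+c", samples)   # + needs at least one b, so "ac" drops out
optional = matches(r"colou?r", ["color", "colour", "colouur"])

numbers = ["7", "42", "192", "8080"]
exactly_three = matches(r"\d{3}", numbers)
two_or_more = matches(r"\d{2,}", numbers)
one_to_three = matches(r"\d{1,3}", numbers)

print(star, plus, optional, exactly_three, two_or_more, one_to_three)
```

re.fullmatch is used so that a candidate counts only when the whole string fits the pattern, which keeps the quantifier bounds visible.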
BRE vs ERE vs PCRE - Side by Side
Match one or more digits
# BRE (basic grep) - must escape + (\+ is a GNU extension to POSIX BRE)
grep '[0-9]\+' /tmp/regex-practice.txt
# ERE (extended grep -E) - no escaping
grep -E '[0-9]+' /tmp/regex-practice.txt
# PCRE (grep -P) and ripgrep's default engine - shorthand \d works
grep -P '\d+' /tmp/regex-practice.txt
rg '\d+' /tmp/regex-practice.txt
All four commands print the same lines: every line that contains at least one digit.
Match IP addresses
# ERE - verbose
grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' /tmp/regex-practice.txt
# PCRE - cleaner with \d
grep -P '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}' /tmp/regex-practice.txt
# ripgrep - the same pattern works in its default engine
rg '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}' /tmp/regex-practice.txt
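Note that \d{1,3} accepts octets such as 999, so these patterns find IP-shaped text rather than valid addresses. One possible follow-up, sketched in Python, is to extract candidates with the regex and let the standard-library ipaddress module reject out-of-range values. The function name and the sample line are invented for illustration.

```python
import ipaddress
import re

IP_SHAPED = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")

def valid_ipv4(text):
    """Return IP-shaped substrings that are also valid IPv4 addresses."""
    found = []
    for candidate in IP_SHAPED.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ipaddress.AddressValueError:
            continue  # e.g. an octet above 255
        found.append(candidate)
    return found

line = "Gateway: 10.50.1.1, bogus: 999.1.1.1"
print(valid_ipv4(line))
```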
Match MAC addresses
# ERE
grep -E '([A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2}' /tmp/regex-practice.txt
# PCRE
grep -P '([A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2}' /tmp/regex-practice.txt
Quantifier Practice - Exact Count {n}
# Match lines containing 4 consecutive digits (years, ports); longer digit runs also match
grep -E '[0-9]{4}' /tmp/regex-practice.txt
# Match lines containing 2 consecutive hex chars
grep -E '[A-Fa-f0-9]{2}' /tmp/regex-practice.txt
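An unanchored {4} also matches the first four digits of a longer run, which is easy to miss when grep prints whole lines. The Python sketch below (sample string invented for illustration) contrasts the bare pattern with a word-boundary version.

```python
import re

sample = "port 8080, year 2026, pid 123456"

# Bare {4}: also grabs the first four digits of the longer number 123456.
loose = re.findall(r"\d{4}", sample)

# Word boundaries on both sides keep only standalone 4-digit numbers.
strict = re.findall(r"\b\d{4}\b", sample)

print(loose)
print(strict)
```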
Quantifier Practice - Range {n,m}
# Match 1-3 digits (IP octets)
grep -oE '[0-9]{1,3}' /tmp/regex-practice.txt | head -20
# Match 2-4 digit numbers
grep -oE '\b[0-9]{2,4}\b' /tmp/regex-practice.txt
The -o flag shows ONLY the matched text, not the whole line.
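The same distinction exists in Python: filtering lines with re.search mirrors plain grep, while re.findall mirrors grep -o. A minimal sketch, using three lines copied from the test file:

```python
import re

lines = [
    "2026-03-15T10:30:46 INFO VLAN 100 configured on Gi1/0/24",
    "Gateway: 10.50.1.1",
    "AA:BB:CC:DD:EE:FF -> VLAN 100",
]
pattern = re.compile(r"VLAN [0-9]+")

# Like grep: keep whole lines that contain at least one match.
matching_lines = [line for line in lines if pattern.search(line)]

# Like grep -o: keep only the matched text, one item per match.
only_matches = [m for line in lines for m in pattern.findall(line)]

print(matching_lines)
print(only_matches)
```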
Quantifier Practice - Open-ended {n,}
# Match 3 or more digits
grep -oE '[0-9]{3,}' /tmp/regex-practice.txt
Greedy vs Lazy (CRITICAL CONCEPT)
The Problem: Quantifiers are GREEDY by default - they match as MUCH as possible.
# Create test
echo 'Log: "error: disk full" and "warning: low memory"' > /tmp/greedy-test.txt
# Greedy (the default in every flavor; -P is used here only to match the lazy example)
grep -oP '".*"' /tmp/greedy-test.txt
# Output: "error: disk full" and "warning: low memory" (one match)
# Lazy (PCRE only)
grep -oP '".*?"' /tmp/greedy-test.txt
# Output:
# "error: disk full"
# "warning: low memory"
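When grep -P is unavailable, a negated character class gives the same result as the lazy pattern in this case, because [^"]* cannot run past a closing quote. The sketch below checks the equivalence in Python on the same sample line; the negated-class form also works with grep -oE.

```python
import re

line = 'Log: "error: disk full" and "warning: low memory"'

greedy = re.findall(r'".*"', line)      # one match spanning both strings
lazy = re.findall(r'".*?"', line)       # two matches, shortest possible
negated = re.findall(r'"[^"]*"', line)  # two matches, no lazy syntax needed

print(greedy)
print(lazy)
print(negated)
```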
Lazy Quantifier Reference
Add ? after a quantifier to make it LAZY (match as LITTLE as possible):
| Greedy | Lazy | Behavior |
|---|---|---|
| `*` | `*?` | Match minimum (0 preferred) |
| `+` | `+?` | Match minimum (1 preferred) |
| `{n,m}` | `{n,m}?` | Match minimum (n preferred) |
Note: lazy quantifiers (*?, +?) are not available in BRE or ERE; use grep -P or rg.
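The preferred minimum for each lazy form can be observed directly. The sketch below uses Python's re module, which accepts the same lazy syntax as grep -P; the input string is an arbitrary example.

```python
import re

digits = "12345"

greedy_star = re.match(r"\d*", digits).group()       # takes all five digits
lazy_star = re.match(r"\d*?", digits).group()        # zero is allowed, so it takes none

greedy_plus = re.match(r"\d+", digits).group()
lazy_plus = re.match(r"\d+?", digits).group()        # stops after the one required digit

greedy_range = re.match(r"\d{2,4}", digits).group()
lazy_range = re.match(r"\d{2,4}?", digits).group()   # stops at the lower bound, n = 2

# A lazy quantifier still expands when the rest of the pattern needs it.
lazy_then_anchor = re.match(r"\d+?$", digits).group()
```

The last line shows that lazy means "as little as possible while still letting the overall pattern match", not "always the minimum".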
Shorthand Character Classes
These save typing:
| Shorthand | Equivalent | Meaning |
|---|---|---|
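| `\d` | `[0-9]` | Digit |
| `\D` | `[^0-9]` | NOT a digit |
| `\w` | `[A-Za-z0-9_]` | Word character |
| `\W` | `[^A-Za-z0-9_]` | NOT word character |
| `\s` | `[ \t\n\r\f\v]` | Whitespace |
| `\S` | `[^ \t\n\r\f\v]` | NOT whitespace |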
# Match IP with shorthand
grep -P '\d+\.\d+\.\d+\.\d+' /tmp/regex-practice.txt
# Match all words
grep -oP '\w+' /tmp/regex-practice.txt | head -20
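The shorthand classes and their bracket-expression equivalents can be verified in Python. One caveat worth knowing: in Python 3, \d, \w and \s are Unicode-aware for str patterns, so the ASCII bracket forms are exact equivalents only with the re.ASCII flag. A small sketch with invented sample text:

```python
import re

sample = "VLAN_100 up\t10.50.1.1\n"

# With re.ASCII each shorthand matches the same characters as its bracket form.
pairs = {
    r"\d": r"[0-9]",
    r"\D": r"[^0-9]",
    r"\w": r"[A-Za-z0-9_]",
    r"\W": r"[^A-Za-z0-9_]",
    r"\s": r"[ \t\n\r\f\v]",
    r"\S": r"[^ \t\n\r\f\v]",
}
same = {
    short: re.findall(short, sample, re.ASCII) == re.findall(bracket, sample)
    for short, bracket in pairs.items()
}
print(same)

# Without re.ASCII, \d also accepts non-ASCII digits (U+0663 is Arabic-Indic three).
unicode_digit = re.fullmatch(r"\d", "\u0663") is not None
ascii_digit = re.fullmatch(r"\d", "\u0663", re.ASCII) is not None
```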
Anchors (Position Matching)
| Anchor | Meaning |
|---|---|
| `^` | Start of line |
| `$` | End of line |
| `\b` | Word boundary |
# Match digits at START of line (\d is not valid in ERE, so use [0-9])
grep -E '^[0-9]+' /tmp/regex-practice.txt
# Match digits at END of line
grep -oE '[0-9]+$' /tmp/regex-practice.txt
# Match VLAN as whole word
grep -E '\bVLAN\b' /tmp/regex-practice.txt
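Anchors are one place where the flavors differ in practice: grep works line by line, so ^ and $ always mean line start and line end, whereas Python's re anchors to the whole string unless re.MULTILINE is set. A short sketch using lines from the test file (the VLAN sample string is invented):

```python
import re

text = (
    "Gateway: 10.50.1.1\n"
    "2026-03-15T10:33:30 WARN Certificate expires in 30 days\n"
    "Ports: 22, 80, 443"
)

# Default: ^ only matches at the very start of the string.
start_default = re.findall(r"^\d+", text)

# re.MULTILINE: ^ and $ match at every line boundary, like grep.
start_multiline = re.findall(r"^\d+", text, re.MULTILINE)
end_multiline = re.findall(r"\d+$", text, re.MULTILINE)

# \b keeps VLAN from matching inside a longer word.
whole_word = re.findall(r"\bVLAN\b", "VLAN 100, VLANs, XVLAN")
```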
Python REPL Practice
For interactive exploration with the PCRE-like re module:
python3
import re
text = """
2026-03-15T10:30:45 INFO Server started on 192.168.1.100:443
MAC: AA:BB:CC:DD:EE:FF assigned to VLAN 100
"""
# Find all IPs
re.findall(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', text)
# ['192.168.1.100']
# Find all MACs (non-capturing group)
re.findall(r'(?:[A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2}', text)
# ['AA:BB:CC:DD:EE:FF']
# Find timestamps
re.findall(r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}', text)
# ['2026-03-15T10:30:45']
# Greedy vs lazy
text2 = '"first" and "second"'
re.findall(r'".*"', text2) # Greedy: ['"first" and "second"']
re.findall(r'".*?"', text2) # Lazy: ['"first"', '"second"']
JavaScript Practice (Browser Console)
Open browser DevTools (F12) → Console:
const text = `
2026-03-15T10:30:45 INFO Server started on 192.168.1.100:443
MAC: AA:BB:CC:DD:EE:FF assigned to VLAN 100
`;
// Find all IPs
text.match(/\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/g)
// ['192.168.1.100']
// Find all MACs
text.match(/([A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2}/g)
// ['AA:BB:CC:DD:EE:FF']
// Greedy vs lazy
const text2 = '"first" and "second"';
text2.match(/".*"/g) // Greedy: ['"first" and "second"']
text2.match(/".*?"/g) // Lazy: ['"first"', '"second"']
Flavor Compatibility Cheat Sheet
| Feature | BRE | ERE | PCRE/JS/Python |
|---|---|---|---|
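| `+` | `\+` | `+` | `+` |
| `?` | `\?` | `?` | `?` |
| `{n,m}` | `\{n,m\}` | `{n,m}` | `{n,m}` |
| `()` | `\(\)` | `()` | `()` |
| `\d` | NO | NO | YES |
| `*?` (lazy) | NO | NO | YES |
| `(?:...)` | NO | NO | YES |
| `(?=...)` | NO | NO | YES |
| `(?<=...)` | NO | NO | YES |
| Named groups | NO | NO | YES |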
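To confirm that a given engine supports the PCRE-style features, one option is to try each construct and see whether the pattern compiles. The Python sketch below does this for the re module; the feature labels and probe patterns are illustrative choices, not an exhaustive test.

```python
import re

probes = {
    "shorthand \\d": r"\d+",
    "lazy quantifier": r"a+?",
    "non-capturing group": r"(?:ab)+",
    "lookahead": r"\d+(?=ms)",
    "lookbehind": r"(?<=user )\w+",
    "named group": r"(?P<vlan>\d+)",
}

def supported(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

support = {name: supported(p) for name, p in probes.items()}
print(support)

# Variable-width lookbehind is one construct that re rejects.
variable_lookbehind_ok = supported(r"(?<=users?: )\w+")
```

Even within the PCRE-like family the details differ, as the last probe suggests, so it is worth testing a pattern in the tool that will actually run it.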
Comprehensive Exercise Set
Run these on your system:
# 1. Find all ERROR lines
grep -E 'ERROR' /tmp/regex-practice.txt
# 2. Find all IP addresses (extract only)
grep -oE '\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\b' /tmp/regex-practice.txt
# 3. Find all MAC addresses
grep -oE '([A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2}' /tmp/regex-practice.txt
# 4. Find all VLAN numbers
grep -oE 'VLAN [0-9]+' /tmp/regex-practice.txt
# 5. Find timestamps
grep -oP '\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}' /tmp/regex-practice.txt
# 6. Find port-like numbers after a colon (also matches :MM/:SS in timestamps and all-digit MAC octets)
grep -oP ':\d{2,5}\b' /tmp/regex-practice.txt
# 7. Find all usernames (word after "user" or "User")
grep -oiP '(?<=user )\w+' /tmp/regex-practice.txt
# 8. Find lines with warnings or errors (case insensitive)
grep -iE 'warn|error' /tmp/regex-practice.txt
# 9. Find durations in milliseconds
grep -oP '\d+ms' /tmp/regex-practice.txt
# 10. Find data sizes (like 1.2GB)
grep -oP '\d+\.?\d*[KMGT]B' /tmp/regex-practice.txt
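For comparison, a few of the exercises can be reproduced in Python without touching the file system. This sketch embeds five lines from the test file; re.IGNORECASE stands in for grep's -i, and the variable names are arbitrary.

```python
import re

log = """\
2026-03-15T10:31:15 ERROR Connection refused from 10.50.1.50:8080
2026-03-15T10:32:00 INFO User evanusmodestus authenticated via 802.1X
2026-03-15T10:33:00 ERROR Authentication failed for user admin
2026-03-15T10:34:00 INFO Backup completed: 1.2GB transferred
2026-03-15T10:35:00 DEBUG Query took 145ms for endpoint /api/v1/users
"""

# Exercise 1: whole lines containing ERROR (grep without -o).
error_lines = [line for line in log.splitlines() if re.search(r"ERROR", line)]

# Exercise 7: usernames via lookbehind, case-insensitive like grep -i.
usernames = re.findall(r"(?<=user )\w+", log, re.IGNORECASE)

# Exercise 9: durations in milliseconds.
durations = re.findall(r"\d+ms", log)

# Exercise 10: data sizes such as 1.2GB.
sizes = re.findall(r"\d+\.?\d*[KMGT]B", log)

print(len(error_lines), usernames, durations, sizes)
```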
Infrastructure Power Patterns
| Pattern | Use Case |
|---|---|
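| `\b\d{1,5}\b` | Port number shape, 1-5 digits (the 1-65535 range must be checked separately) |
| `\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}` | ISO timestamp |
| `VLAN [0-9]+` | VLAN with number |
| `([A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2}` | MAC address |
| `\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}` | IPv4 address |
| `(?<=user )\w+` | Username after "user " (lookbehind) |
| `".*?"` | Quoted strings (lazy) |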
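A numeric range such as 1-65535 is awkward to express in a regex alone. A common alternative, sketched here in Python, is to match the digits with a simple pattern and check the range in code. The function name is hypothetical.

```python
import re

PORT_DIGITS = re.compile(r"\d{1,5}")

def is_valid_port(value):
    """Return True if value is a decimal string in the range 1-65535."""
    if not PORT_DIGITS.fullmatch(value):
        return False  # wrong shape: empty, too long, or non-digits
    return 1 <= int(value) <= 65535

for candidate in ["22", "8443", "65535", "65536", "0"]:
    print(candidate, is_valid_port(candidate))
```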
Next Concepts
Once quantifiers feel solid:
- Groups & Capturing - () to extract parts of matches
- Non-capturing groups - (?:) for grouping without capturing
- Alternation - | for OR logic (\| in GNU BRE)
- Lookahead - (?=) and (?!) for conditional matching
- Lookbehind - (?<=) and (?<!) for matching based on what precedes
- Named groups - (?P<name>) in Python for readable extractions
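As a preview of where groups lead, here is a hedged sketch that parses one line of the test file with Python's (?P<name>...) named groups. The group names ts, level and msg are chosen for this example only.

```python
import re

LINE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) "  # ISO timestamp
    r"(?P<level>[A-Z]+) "                            # INFO, WARN, ERROR, DEBUG
    r"(?P<msg>.*)"                                   # rest of the line
)

line = "2026-03-15T10:33:00 ERROR Authentication failed for user admin"
match = LINE.match(line)
fields = match.groupdict() if match else {}
print(fields)
```

The quantifiers from this session do most of the work; the groups only label which part of the match is which.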
Session Reflection
What clicked:
- <Write what made sense>
What’s still fuzzy:
- <Write what needs more practice>
Connection to work:
- <How will you use this?>
Session Log
| Timestamp | Notes |
|---|---|
| 2026-03-15 | Completed quantifiers deep dive, flavor comparison, system practice |