Drill 06: Alternation
The alternation operator | provides OR logic in regex. Combined with grouping, it allows matching multiple alternative patterns. Understanding precedence and efficiency is key to using it correctly.
Core Concepts
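| Syntax | Meaning | Example |
|---|---|---|
| `a\|b` | Match a OR b | `cat\|dog` |
| `(a\|b)c` | Grouped alternation | `(Mon\|Tues)day` |
| `(?:a\|b)c` | Non-capturing alternation (PCRE/Python syntax) | `(?:http\|ftp)://` |
| `[ab]` | Single character OR | `gr[ae]y` |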
Alternation vs Character Class
| Use Case | Alternation | Character Class |
|---|---|---|
| Single characters | `a\|b\|c` | `[abc]` |
| Multiple characters | `cat\|dog` | N/A |
| Ranges | `0\|1\|2\|3` | `[0-3]` |
| Negation | Complex | `[^abc]` |
Rule: Use character classes for single-character alternatives. Use alternation for multi-character patterns.
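A quick way to check the rule is to run both forms side by side. The sketch below uses Python's re module purely as a test bench; the sample strings are made up for illustration and are not part of the drill files.

```python
import re

# Single-character alternatives: both forms accept exactly the same characters.
alternation = re.compile(r"a|e|i|o|u")
char_class = re.compile(r"[aeiou]")

sample = "alternation"  # hypothetical sample text
vowels_alt = alternation.findall(sample)
vowels_cls = char_class.findall(sample)

# Multi-character alternatives have no character-class equivalent.
pets = re.compile(r"cat|dog")
pet_hits = pets.findall("cat, dog, cot")

print(vowels_alt)                # ['a', 'e', 'a', 'i', 'o']
print(vowels_alt == vowels_cls)  # True
print(pet_hits)                  # ['cat', 'dog']
```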
Interactive CLI Drill
bash ~/atelier/_bibliotheca/domus-captures/docs/modules/ROOT/examples/regex-drills/06-alternation.sh
Exercise Set 1: Basic Alternation
cat << 'EOF' > /tmp/ex-alt.txt
ERROR: Connection failed
Warning: Low disk space
INFO: Process started
error: lowercase error
WARNING: Memory low
WARN: Config outdated
success: Operation completed
FATAL: System crash
EOF
Ex 1.1: Match ERROR or FATAL lines
Solution
grep -E 'ERROR|FATAL' /tmp/ex-alt.txt
Output: Lines with ERROR or FATAL
Ex 1.2: Match any log level (case-sensitive)
Solution
grep -E 'ERROR|WARNING|WARN|INFO|FATAL' /tmp/ex-alt.txt
Ex 1.3: Match log levels case-insensitively
Solution
grep -Ei 'error|warning|warn|info|fatal|success' /tmp/ex-alt.txt
# Or more compactly:
grep -Ei '(error|warn(ing)?|info|fatal|success)' /tmp/ex-alt.txt
Ex 1.4: Match ERROR but not error
Solution
grep -E '^ERROR:' /tmp/ex-alt.txt
# Combines anchor with literal match
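The same filters can be reproduced outside the shell. The following is a minimal Python sketch, assuming the sample data from this exercise set is held in a string; matching_lines is a hypothetical helper that mimics grep -E by keeping every line in which the pattern is found anywhere.

```python
import re

LOG = """ERROR: Connection failed
Warning: Low disk space
INFO: Process started
error: lowercase error
WARNING: Memory low
WARN: Config outdated
success: Operation completed
FATAL: System crash"""

def matching_lines(pattern, text, flags=0):
    """Return the lines that contain a match (hypothetical grep -E stand-in)."""
    rx = re.compile(pattern, flags)
    return [line for line in text.splitlines() if rx.search(line)]

# Ex 1.2: case-sensitive log levels
case_sensitive = matching_lines(r"ERROR|WARNING|WARN|INFO|FATAL", LOG)

# Ex 1.3: the flat list and the compact warn(ing)? form select the same lines
flat = matching_lines(r"error|warning|warn|info|fatal|success", LOG, re.IGNORECASE)
compact = matching_lines(r"error|warn(ing)?|info|fatal|success", LOG, re.IGNORECASE)

print(len(case_sensitive))  # 5
print(flat == compact)      # True
```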
Exercise Set 2: Grouped Alternation
cat << 'EOF' > /tmp/ex-grouped.txt
Monday meeting
Tuesday review
Wednesday standup
Thursday demo
Friday retrospective
Saturday off
Sunday off
Mondays are tough
The Monday blues
EOF
Ex 2.1: Match weekdays only (Mon-Fri)
Solution
grep -E '(Mon|Tues|Wednes|Thurs|Fri)day' /tmp/ex-grouped.txt
Grouping (Mon|Tues|Wednes|Thurs|Fri) followed by literal day.
Ex 2.2: Match weekend days
Solution
grep -E '(Satur|Sun)day' /tmp/ex-grouped.txt
Ex 2.3: Match any day with optional 's' (Mondays)
Solution
grep -E '(Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)days?' /tmp/ex-grouped.txt
The s? makes the trailing 's' optional.
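When the pattern is used programmatically, the group captures only the alternative that matched, not the shared suffix. A small Python sketch of this, where classify is a hypothetical helper name:

```python
import re

WEEKDAY = re.compile(r"(Mon|Tues|Wednes|Thurs|Fri)days?")
WEEKEND = re.compile(r"(Satur|Sun)days?")

def classify(line):
    """Label a line as 'weekday', 'weekend', or 'none' (hypothetical helper)."""
    if WEEKDAY.search(line):
        return "weekday"
    if WEEKEND.search(line):
        return "weekend"
    return "none"

m = WEEKDAY.search("The Monday blues")
print(m.group(0))  # Monday  (the whole match)
print(m.group(1))  # Mon     (only the alternative inside the group)
print(classify("Mondays are tough"))  # weekday
print(classify("Sunday off"))         # weekend
```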
Exercise Set 3: Protocol and Format Patterns
cat << 'EOF' > /tmp/ex-protocols.txt
http://example.com
https://secure.example.com
ftp://files.example.com
ssh://server.example.com
file:///local/path
http://10.50.1.100/api
https://api.example.com:443/v1
EOF
Ex 3.1: Match HTTP or HTTPS URLs
Solution
grep -E 'https?://' /tmp/ex-protocols.txt
# Or explicitly:
grep -E '(http|https)://' /tmp/ex-protocols.txt
The s? makes 's' optional - more concise than alternation for single char.
Ex 3.2: Match common protocols (http, https, ftp, ssh)
Solution
grep -E '(https?|ftp|ssh)://' /tmp/ex-protocols.txt
Ex 3.3: Extract domain from URL
Solution
grep -oP '(https?|ftp|ssh)://\K[^/:]+' /tmp/ex-protocols.txt
\K discards the text matched so far, so -o prints only the domain (-P needs GNU grep built with PCRE support).
Output: example.com, secure.example.com, etc.
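Python's re module has no \K, so the usual substitute is a capturing group around the part to keep. The sketch below assumes the sample URLs from this exercise set and uses a non-capturing group (?:...) for the protocol alternation so that group 1 is the host.

```python
import re

URLS = [
    "http://example.com",
    "https://secure.example.com",
    "ftp://files.example.com",
    "ssh://server.example.com",
    "file:///local/path",
    "http://10.50.1.100/api",
    "https://api.example.com:443/v1",
]

# (?:...) groups the protocol alternatives without capturing them;
# the capturing group keeps everything up to the next "/" or ":".
HOST = re.compile(r"(?:https?|ftp|ssh)://([^/:]+)")

hosts = []
for url in URLS:
    m = HOST.search(url)
    if m:
        hosts.append(m.group(1))

print(hosts)
```

The file:/// entry is skipped because "file" is not among the listed protocols.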
Exercise Set 4: File Extensions
cat << 'EOF' > /tmp/ex-files.txt
document.pdf
report.docx
data.xlsx
image.png
photo.jpg
script.sh
config.yaml
settings.json
archive.tar.gz
backup.zip
notes.txt
code.py
EOF
Ex 4.1: Match document files (pdf, docx, xlsx)
Solution
grep -E '\.(pdf|docx?|xlsx?)$' /tmp/ex-files.txt
docx? matches doc or docx, xlsx? matches xls or xlsx.
Ex 4.2: Match image files
Solution
grep -Ei '\.(png|jpe?g|gif|bmp|svg)$' /tmp/ex-files.txt
jpe?g matches jpg or jpeg.
Ex 4.3: Match config files
Solution
grep -Ei '\.(ya?ml|json|ini|conf|cfg|toml)$' /tmp/ex-files.txt
ya?ml matches yml or yaml.
Ex 4.4: Match scripts
Solution
grep -Ei '\.(sh|bash|py|rb|pl|js|ts)$' /tmp/ex-files.txt
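The four extension patterns can be combined into a small classifier. This is only a sketch: categorize and the category names are hypothetical, and all four patterns are compiled case-insensitively here, whereas Ex 4.1 above is case-sensitive.

```python
import re

# One anchored, grouped alternation per category, copied from the exercises.
CATEGORIES = {
    "document": re.compile(r"\.(pdf|docx?|xlsx?)$", re.IGNORECASE),
    "image": re.compile(r"\.(png|jpe?g|gif|bmp|svg)$", re.IGNORECASE),
    "config": re.compile(r"\.(ya?ml|json|ini|conf|cfg|toml)$", re.IGNORECASE),
    "script": re.compile(r"\.(sh|bash|py|rb|pl|js|ts)$", re.IGNORECASE),
}

def categorize(filename):
    """Return the first category whose pattern matches, else 'other'."""
    for name, rx in CATEGORIES.items():
        if rx.search(filename):
            return name
    return "other"

for f in ["report.docx", "photo.jpg", "config.yaml", "code.py", "archive.tar.gz"]:
    print(f, "->", categorize(f))
```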
Exercise Set 5: Network Patterns
cat << 'EOF' > /tmp/ex-network.txt
interface GigabitEthernet0/1
interface FastEthernet0/24
interface TenGigabitEthernet1/0/1
interface Ethernet1
VLAN 10 - Data
VLAN 20 - Voice
VLAN 100 - Management
permit tcp any host 10.50.1.50 eq 443
permit udp any host 10.50.1.50 eq 53
deny tcp any any eq 23
EOF
Ex 5.1: Match interface types
Solution
grep -E '(Gigabit|Fast|TenGigabit)?Ethernet' /tmp/ex-network.txt
Ex 5.2: Match permit or deny lines
Solution
grep -E '^(permit|deny)' /tmp/ex-network.txt
Ex 5.3: Match TCP or UDP rules
Solution
grep -E '(permit|deny) (tcp|udp)' /tmp/ex-network.txt
Ex 5.4: Match common ports
Solution
grep -E 'eq (22|23|53|80|443|3389)' /tmp/ex-network.txt
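Several grouped alternations can be combined in one pattern to pull structured fields out of each rule. A minimal Python sketch, assuming ACL lines shaped like the sample file; parse_rule and the group names are illustrative choices, not part of any device tooling.

```python
import re

# Action, protocol, and port are each a named, grouped alternation.
ACL_RULE = re.compile(
    r"^(?P<action>permit|deny) (?P<proto>tcp|udp) .*eq (?P<port>22|23|53|80|443|3389)$"
)

def parse_rule(line):
    """Return the named fields of an ACL line, or None if it is not a rule."""
    m = ACL_RULE.match(line)
    return m.groupdict() if m else None

print(parse_rule("permit tcp any host 10.50.1.50 eq 443"))
# {'action': 'permit', 'proto': 'tcp', 'port': '443'}
print(parse_rule("VLAN 10 - Data"))
# None
```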
Real-World Applications
Professional: ISE Log Analysis
# Match authentication outcomes
grep -E '(Passed|Failed)-(Authentication|Attempt)' /var/log/ise-psc.log
# Match ISE event types
grep -Ei '(authentication|authorization|accounting|profiling)' /var/log/ise-psc.log
# Match error codes
grep -E 'Error (11|12|13|24|27)' /var/log/ise-psc.log
Professional: Network Device Logs
# Match interface status messages
grep -E '(up|down|changed state to)' /var/log/switch.log
# Match Cisco severity levels
grep -E '%[A-Z]+-([0-3])-' /var/log/cisco.log # Severity 0-3 (emergency, alert, critical, error)
# Match spanning tree events
grep -Ei '(STP|RSTP|MSTP|BPDU)' /var/log/switch.log
Professional: Service Status
# Match systemd unit states
systemctl list-units --all | grep -E '(failed|inactive|activating)' # --all includes inactive units
# Match running services
systemctl --type=service | grep -E '(running|exited)'
# Match critical services
systemctl status | grep -E '(sshd|nginx|docker|kubelet)'
Personal: Note Organization
# Find TODO markers
grep -rEi '(TODO|FIXME|HACK|XXX):' ~/notes/
# Find priority tags
grep -Ei '#(urgent|important|priority|deadline)' ~/notes/*.md
# Find date formats
grep -rE '(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [0-9]{1,2}' ~/journal/
Personal: Financial Tracking
# Match expense categories
grep -Ei '(groceries|utilities|rent|insurance|entertainment)' ~/budget/*.csv
# Match transaction types
grep -Ei '(debit|credit|transfer|payment|deposit)' ~/bank/*.txt
# Match currencies
grep -rE '\$(USD|EUR|GBP|CAD|AUD)' ~/receipts/
Personal: Calendar/Schedule
# Match days of week
grep -Ei '(monday|tuesday|wednesday|thursday|friday|saturday|sunday)' ~/calendar.txt
# Match time periods
grep -Ei '(morning|afternoon|evening|night)' ~/schedule.txt
# Match meeting types
grep -Ei '(meeting|call|standup|review|demo)' ~/calendar/*.ics
Tool Variants
grep: Alternation Patterns
# Basic alternation
grep -E 'cat|dog' file.txt
# With word boundaries
grep -wE 'cat|dog' file.txt
# Count matching lines per alternative (grep -c counts lines, not matches; use multiple greps)
echo "Cats: $(grep -c 'cat' file.txt), Dogs: $(grep -c 'dog' file.txt)"
# Case-insensitive
grep -Ei 'error|warning|critical' logs.txt
sed: Alternation in Substitution
# Replace multiple patterns with same replacement
sed -E 's/(ERROR|FATAL|CRITICAL)/[ALERT]/g' file.txt
# Add prefix to alternatives
sed -E 's/(Mon|Tues|Wednes|Thurs|Fri)day/Weekday: &/g' file.txt
# Remove any of several patterns
sed -E 's/(DEBUG|TRACE):.*//' file.txt
# Convert alternatives
sed -E 's/(http|https)/SECURE/g' urls.txt
awk: Alternation Matching
# Match lines with alternatives
awk '/(ERROR|WARN|FATAL)/' file.txt
# Count alternatives separately
awk '/ERROR/{e++} /WARN/{w++} /FATAL/{f++} END{print "E:",e,"W:",w,"F:",f}' file.txt
# Process based on match
awk '/(ERROR|FATAL)/{print "CRITICAL:", $0} /(WARN|INFO)/{print "NORMAL:", $0}' file.txt
# Field-specific matching
awk '$1 ~ /(permit|deny)/ {print}' acl.txt
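For comparison, a single alternation plus a counter gives per-alternative totals in one pass. This is a hedged Python sketch rather than an awk equivalent: it counts occurrences, not lines, and count_levels and the sample text are hypothetical.

```python
import re
from collections import Counter

LEVEL = re.compile(r"ERROR|WARN|FATAL")

def count_levels(text):
    """Count how often each alternative occurs in the text."""
    return Counter(LEVEL.findall(text))

sample = "ERROR: disk\nWARN: cpu\nERROR: net\nFATAL: crash\nINFO: ok"
counts = count_levels(sample)
print(counts["ERROR"], counts["WARN"], counts["FATAL"])  # 2 1 1
```

Because the pattern is unanchored, a WARNING entry would be counted under WARN, just as /WARN/ does in the awk version.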
vim: Alternation Patterns
" Find ERROR or FATAL
/\v(ERROR|FATAL)
" Replace log levels
:%s/\v(DEBUG|TRACE)/VERBOSE/g
" Find function or class definitions
/\v(function|class|def)\s+\w+
" Find common typos
:%s/\v(teh|hte)/the/g
" Highlight alternatives (very magic mode)
/\v(Monday|Tuesday|Wednesday|Thursday|Friday)
Python: Alternation
import re
text = """ERROR: Connection failed
WARNING: Low memory
INFO: Process started
FATAL: System crash"""
# Basic alternation
pattern = re.compile(r'(ERROR|FATAL)')
matches = pattern.findall(text)
print(matches) # ['ERROR', 'FATAL']
# Named groups with alternation
pattern = re.compile(r'(?P<level>ERROR|WARNING|INFO|FATAL): (?P<msg>.+)')
for match in pattern.finditer(text):
    print(f"{match.group('level')}: {match.group('msg')}")
# Case-insensitive alternation
pattern = re.compile(r'error|warning|fatal', re.IGNORECASE)
# Alternation with grouping
days = re.compile(r'(Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day')
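When the alternatives come from a list rather than being typed by hand, the pattern can be assembled in code. The sketch below is one possible approach, not a standard recipe: build_alternation is a hypothetical helper that escapes regex metacharacters with re.escape and sorts longer words first, since Python's re tries alternatives left to right.

```python
import re

def build_alternation(words):
    """Compile literal words into one alternation, longest first."""
    ordered = sorted(words, key=len, reverse=True)
    return re.compile("|".join(re.escape(w) for w in ordered))

# "C++" contains metacharacters; "WARN" is a prefix of "WARNING".
rx = build_alternation(["WARN", "WARNING", "C++", "ERROR"])
found = rx.findall("WARNING: C++ build, WARN later")
print(found)  # ['WARNING', 'C++', 'WARN']
```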
Precedence and Grouping
Alternation Has Lowest Precedence
# Whole-word alternatives: matches "gray" or "grey", NOT "gr" + "a|e" + "y"
echo "gray grey" | grep -oE 'gray|grey'
# Understanding precedence:
# 'ab|cd' means 'ab' OR 'cd', NOT 'a' + 'b|c' + 'd'
# Use grouping for partial alternation:
echo "gray grey" | grep -oE 'gr(a|e)y'
# Output: gray, grey
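The low precedence of | is most visible when anchors are involved, because each anchor binds to only one alternative unless the alternation is grouped. A short Python sketch with made-up test words:

```python
import re

loose = re.compile(r"^gray|grey$")     # parsed as (^gray) OR (grey$)
strict = re.compile(r"^(gray|grey)$")  # anchors apply to both alternatives

words = ["gray", "grey", "grayish", "bluegrey"]
loose_hits = [w for w in words if loose.search(w)]
strict_hits = [w for w in words if strict.search(w)]

print(loose_hits)   # ['gray', 'grey', 'grayish', 'bluegrey']
print(strict_hits)  # ['gray', 'grey']
```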
Common Grouping Patterns
# Optional prefix
grep -E '(un)?happy' file.txt # "happy" or "unhappy"
# Optional suffix
grep -E 'run(ning|s)?' file.txt # "run", "running", or "runs"
# Multiple alternatives in sequence
grep -E '(Mon|Tues|Wednes)day (morning|afternoon)' file.txt
# Nested grouping
grep -E '((http|https)://)?www\.' file.txt
Ordering Alternatives
Most Specific First
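# Order matters in backtracking engines (PCRE via grep -P, Perl, Python):
# the first alternative that matches wins, so "light" is never tried.
echo "light flight" | grep -oP 'li|light'
# Output: li, li
# CORRECT: Longer/specific patterns first
echo "light flight" | grep -oP 'light|li'
# Output: light, light
# POSIX tools (grep -E, sed -E, awk) take the longest match at the leftmost
# position, so either order prints: light, light
echo "light flight" | grep -oE 'li|light'
# For file extensions (again with grep -P):
# WRONG: .doc matches first in .docx, so -o prints ".doc"
grep -oP '\.(doc|docx)' files.txt
# CORRECT: More specific first prints ".docx"
grep -oP '\.(docx|doc)' files.txt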
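Python's re module is a backtracking engine that tries alternatives left to right, so it makes the effect of ordering easy to observe. A minimal sketch using the same sample text:

```python
import re

text = "light flight"

# Shorter alternative first: "li" wins at each position, "light" is never reached.
short_first = re.findall(r"li|light", text)

# Longer alternative first: the full word is matched.
long_first = re.findall(r"light|li", text)

print(short_first)  # ['li', 'li']
print(long_first)   # ['light', 'light']
```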
Most Common First (Performance)
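In backtracking engines (PCRE via grep -P, Perl, Python), alternatives are tried left to right, so putting the most likely match first can save work. grep -E typically compiles the whole alternation into an automaton, so ordering has little effect on its speed:
# If INFO is most common in logs:
grep -P '(INFO|WARN|ERROR)' huge.log # INFO tried first
# vs
grep -P '(ERROR|WARN|INFO)' huge.log # ERROR tried first (rarely matches)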
Gotchas
Forgetting to Group
# LOOSE: matches "grey" or "gray" anywhere, including inside longer words
echo "The grey cat" | grep -oE 'grey|gray'
# Intended: Just the standalone color word
# CORRECT: group the alternatives so both word boundaries apply to each one
echo "The grey cat" | grep -oE '\b(grey|gray)\b'
Alternation vs Character Class
# INEFFICIENT: Alternation for single chars
grep -E 'a|e|i|o|u' file.txt
# EFFICIENT: Character class
grep -E '[aeiou]' file.txt
# A character class is shorter and typically faster, most noticeably in backtracking engines
Escaping the Pipe in Different Contexts
# Shell quoting required
grep -E 'cat|dog' file.txt # Works (single quotes)
grep -E "cat|dog" file.txt # Works (double quotes, no $ in pattern)
grep -E cat\|dog file.txt # Works (escaped)
grep -E cat|dog file.txt # FAILS (pipe goes to shell)
# GNU BRE spells alternation \| (an extension; POSIX BRE has no alternation)
grep 'cat\|dog' file.txt # BRE (no -E)
grep -E 'cat|dog' file.txt # ERE (with -E)
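One more context worth knowing: in Python's re module (as in ERE and PCRE), a bare | is the alternation operator and \| is a literal pipe character, which is the reverse of GNU BRE. A short sketch with a made-up pipe-delimited row:

```python
import re

row = "name|role|team"  # hypothetical pipe-delimited record

# Escaped pipe: a literal "|" character, so this splits the record into fields.
fields = re.split(r"\|", row)

# A one-character class is an equivalent way to write the literal.
fields_class = re.split(r"[|]", row)

# re.escape produces the escaped form when the delimiter comes from a variable.
escaped = re.escape("|")

print(fields)   # ['name', 'role', 'team']
print(escaped)  # \|
```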
Empty Alternatives
# WRONG: an empty alternative matches the empty string, so every line matches
echo "test" | grep -E '|test'
# Matches: "" (empty), "test" - every position matches, so every line is printed!
# CORRECT: Make the intent explicit
echo "test" | grep -E 'test|$' # Match "test" or end of line (still true for every line)
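Empty alternatives often slip in when a pattern is assembled from a list that contains an empty string. The sketch below illustrates the problem and one possible guard in Python; the word list and variable names are hypothetical.

```python
import re

words = ["cat", "dog", ""]  # hypothetical input with a stray empty string

# Joining blindly yields "cat|dog|", whose last alternative is empty.
careless = re.compile("|".join(words))

# Dropping empty strings first keeps every alternative meaningful.
careful = re.compile("|".join(re.escape(w) for w in words if w))

line = "nothing relevant here"
print(careless.search(line) is not None)  # True: the empty alternative matches
print(careful.search(line) is not None)   # False
```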
Key Takeaways
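| Concept | Remember |
|---|---|
| `a\|b` | Basic OR - match a or b |
| `(a\|b)c` | Grouped alternation - match "ac" or "bc" |
| `[ab]` | Character class - faster than `a\|b` |
| Order matters | More specific/longer patterns first (backtracking engines; POSIX tools pick the longest match) |
| Precedence | Alternation is lowest - use `(...)` to group |
| Performance | Most common match first in long lists |
| BRE vs ERE | BRE: backslash-pipe (GNU extension); ERE: plain pipe |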
Self-Test
- What’s the difference between cat|dog and [cd][ao][tg]?
- Why does gray|grey work but you might prefer gr(a|e)y?
- What does (Mon|Tues)day? match?
- When should you use [abc] instead of a|b|c?
- Why put longer patterns first in alternation?
Answers
- cat|dog matches "cat" OR "dog"; [cd][ao][tg] matches any combination like "cat", "dog", "cot", "dag", etc.
- gray|grey works, but gr(a|e)y is more explicit about the structure and slightly more efficient; for a single character, gr[ae]y is simpler still.
- "Monday", "Tuesday", "Monda", "Tuesda" (the ? makes the final y optional - probably not intended!)
- For single-character alternatives - a character class is shorter and typically faster.
- Because backtracking engines (PCRE, Perl, Python) try alternatives left to right: with li|light, "li" matches first and "light" is never reached. POSIX tools such as grep -E return the longest match regardless of order.
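The first answer can be verified by enumerating every string the character-class form accepts. A small Python sketch, using itertools.product to generate the candidates:

```python
import re
from itertools import product

class_form = re.compile(r"[cd][ao][tg]")
alt_form = re.compile(r"cat|dog")

# Every three-letter string the three character classes can produce.
candidates = ["".join(p) for p in product("cd", "ao", "tg")]

class_hits = [w for w in candidates if class_form.fullmatch(w)]
alt_hits = [w for w in candidates if alt_form.fullmatch(w)]

print(class_hits)  # ['cat', 'cag', 'cot', 'cog', 'dat', 'dag', 'dot', 'dog']
print(alt_hits)    # ['cat', 'dog']
```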
Next Drill
Drill 07: Lookahead - Master (?=…) positive and (?!…) negative lookahead assertions.