Drill 01: Fundamentals
Before anything else, master these basics: what matches literally, what’s special, and how to escape.
Core Concepts
| Concept | What You Need to Know |
|---|---|
| Literal matching | Most characters match themselves: `server` matches the text "server" |
| Metacharacters | Special characters with regex meaning: `. * + ? ^ $ \| ( ) [ ] { } \` |
| Escaping | Use backslash to match metacharacters literally: `\.` matches a period |
| Case sensitivity | Regex is case-sensitive by default; use flags to change this |
Metacharacter Reference
| Char | Meaning | To Match Literally |
|---|---|---|
| `.` | Any single character (except newline) | `\.` |
| `*` | Zero or more of previous | `\*` |
| `+` | One or more of previous | `\+` |
| `?` | Zero or one of previous | `\?` |
| `^` | Start of line/string | `\^` |
| `$` | End of line/string | `\$` |
| `\|` | Alternation (OR) | `\\|` |
| `(` | Group start | `\(` |
| `)` | Group end | `\)` |
| `[` | Character class start | `\[` |
| `]` | Character class end | `\]` |
| `{` | Quantifier start | `\{` |
| `}` | Quantifier end | `\}` |
| `\` | Escape character | `\\` |
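One way to double-check the "To Match Literally" column is to let a library do the escaping. The sketch below assumes Python's `re` module rather than grep, and the names `METACHARS` and `escaped_forms` are made up for illustration. It asks `re.escape` for the escaped form of each metacharacter and confirms that the result matches only the character itself.

```python
import re

# The metacharacters from the reference table, in the same order.
METACHARS = ['.', '*', '+', '?', '^', '$', '|', '(', ')', '[', ']', '{', '}', '\\']

def escaped_forms(chars):
    """Map each character to the pattern re.escape() builds for it."""
    return {ch: re.escape(ch) for ch in chars}

forms = escaped_forms(METACHARS)
for ch, pattern in forms.items():
    # The escaped pattern should match the character itself and nothing else.
    assert re.fullmatch(pattern, ch)
    assert not re.fullmatch(pattern, 'x')
    print(f'{ch!r:6} -> {pattern!r}')
```

Each printed pattern is simply a backslash followed by the character, which is the rule the table encodes.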
Interactive CLI Drill
Run the practice script:
bash ~/atelier/_bibliotheca/domus-captures/docs/modules/ROOT/examples/regex-drills/01-fundamentals.sh
Exercise Set 1: Literal Matching
cat << 'EOF' > /tmp/ex01.txt
The server responded with 200 OK
Error: Connection failed
Warning: Low disk space
server.example.com
10.50.1.20
EOF
Ex 1.1: Match "server"
Solution
grep 'server' /tmp/ex01.txt
Output:
The server responded with 200 OK
server.example.com
Ex 1.2: Case-insensitive "error"
Solution
grep -i 'error' /tmp/ex01.txt
Output:
Error: Connection failed
Ex 1.3: Match exact IP
Solution
grep '10\.50\.1\.20' /tmp/ex01.txt
Note: Escape each period so it matches a literal dot instead of "any character".
Exercise Set 2: The Dot Trap
cat << 'EOF' > /tmp/ex02.txt
192.168.1.100
192X168Y1Z100
10.50.1.20
version1.2.3
versionABC
EOF
Ex 2.1: Why does this match both?
grep -E '192.168' /tmp/ex02.txt
Explanation
The unescaped . matches ANY character:
- 192.168 matches "192.168" (dot is any char, happens to be period)
- 192.168 also matches "192X168" (dot matches X)
This is one of the most common regex bugs. Always escape dots in IPs, domains, and version strings.
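The same trap can be reproduced outside grep. This is a minimal sketch using Python's `re` module on the lines from /tmp/ex02.txt. The helper name `matching_lines` is hypothetical and only mimics how grep prints every line that contains a match.

```python
import re

# Same content as /tmp/ex02.txt.
lines = ['192.168.1.100', '192X168Y1Z100', '10.50.1.20', 'version1.2.3', 'versionABC']

def matching_lines(pattern, lines):
    """Return the lines that contain a match, like grep does."""
    return [line for line in lines if re.search(pattern, line)]

unescaped = matching_lines(r'192.168', lines)   # dot means "any character"
escaped = matching_lines(r'192\.168', lines)    # dot means a literal period

print(unescaped)  # ['192.168.1.100', '192X168Y1Z100']
print(escaped)    # ['192.168.1.100']
```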
Ex 2.2: Match only valid IP format
Solution
grep -E '192\.168\.1\.100' /tmp/ex02.txt
Exercise Set 3: Escaping Special Characters
cat << 'EOF' > /tmp/ex03.txt
[ERROR] Database offline
[WARN] Low memory
Price: $99.99
Path: C:\Users\Admin
What? Really? Yes!
2+2=4
5*5=25
EOF
Ex 3.1: Match [ERROR] including brackets
Solution
grep -E '\[ERROR\]' /tmp/ex03.txt
Ex 3.2: Match the $99.99 price
Solution
grep -E '\$99\.99' /tmp/ex03.txt
Ex 3.3: Match Windows path
Solution
grep -E 'C:\\Users' /tmp/ex03.txt
Note: Inside single quotes the shell passes both backslashes to grep unchanged; the regex engine then reads \\ as one literal backslash.
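The doubling is done for the regex engine, not for the quoting layer. A small sketch in Python (an assumption, since the exercise itself uses grep) shows the same behaviour: the pattern must contain two backslashes to match one, and a single backslash before "Users" is rejected as an invalid escape.

```python
import re

# Normal string literal: the text itself holds single backslashes (C:\Users\Admin).
text = 'Path: C:\\Users\\Admin'

# Raw string: the regex engine receives C:\\Users, i.e. "C:", one literal backslash, "Users".
pattern = r'C:\\Users'

match = re.search(pattern, text)
found = match.group(0)
print(found)  # C:\Users

# With a single backslash the engine sees the escape \U, which is not valid here.
try:
    re.compile(r'C:\Users')
    single_backslash_ok = True
except re.error:
    single_backslash_ok = False
print(single_backslash_ok)  # False
```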
Ex 3.4: Match "2+2=4"
Solution
grep -E '2\+2=4' /tmp/ex03.txt
Real-World Applications
Professional: Finding Log Levels
# Match [INFO], [WARN], [ERROR] in logs
grep -E '\[(INFO|WARN|ERROR)\]' /var/log/app.log
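The pattern combines three ideas from this drill: escaped brackets, a group, and alternation. As a rough sketch, here it is applied with Python's `re` module to a few made-up log lines (the `sample` list is hypothetical, not taken from a real /var/log/app.log), pulling out the level captured by the group.

```python
import re

# Escaped brackets match literally; the group captures one of the three levels.
LEVEL = re.compile(r'\[(INFO|WARN|ERROR)\]')

# Hypothetical log lines for illustration only.
sample = [
    '[INFO] Service started',
    '[DEBUG] Cache miss',
    '[ERROR] Database offline',
    'WARN without brackets',
]

levels = []
for line in sample:
    m = LEVEL.search(line)
    if m:
        levels.append(m.group(1))  # text inside the brackets

print(levels)  # ['INFO', 'ERROR']
```

`[DEBUG]` is skipped because it is not one of the alternatives, and the bare `WARN` is skipped because the literal brackets are required.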
Professional: ISE Authentication Logs
# Match Passed-Authentication or Failed-Attempt
grep -E 'Passed-Authentication|Failed-Attempt' /var/log/ise-psc.log
Personal: Find Prices in Documents
# Find any dollar amounts
grep -Eo '\$[0-9,]+(\.[0-9]{2})?' ~/Documents/receipts/*.txt
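To see what `-o` would extract, the same pattern can be run in Python. This is a sketch under assumptions: the receipt text is invented, and `re.finditer` with `group(0)` stands in for grep's "print only the match" behaviour. `finditer` is used instead of `findall` because `findall` would return only the contents of the parenthesised group.

```python
import re

# Same pattern as the grep command: literal $, digits/commas, optional cents.
PRICE = re.compile(r'\$[0-9,]+(\.[0-9]{2})?')

# Hypothetical receipt text for illustration only.
text = 'Coffee $4.50, laptop $1,299.00, parking $12'

prices = [m.group(0) for m in PRICE.finditer(text)]
print(prices)  # ['$4.50', '$1,299.00', '$12']
```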
Personal: Search for URLs
# Find URLs (inside [...] the dot is already literal; put the hyphen last)
grep -Eo 'https?://[a-zA-Z0-9./-]+' ~/notes/*.md
Tool Variants
sed: Escape for Substitution
# Replace [ERROR] with [CRITICAL]
echo '[ERROR] System failure' | sed 's/\[ERROR\]/[CRITICAL]/'
awk: Double Escaping
# In a /regex/ literal one backslash is enough; a regex given as a string needs two (\\.)
echo '192.168.1.100' | awk '/192\.168\.1\.100/ {print "Found IP"}'
vim: Same Escaping Rules
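# Search for [ERROR]
/\[ERROR\]
# Substitute $99 with $49
:%s/\$99/\$49/g
Note: In vim's default "magic" mode, . * [ ] ^ $ are escaped as shown, but + ? | ( ) { are literal unless preceded by a backslash.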
Python: Raw Strings Help
import re
# Use r'' raw strings to avoid double-escaping
pattern = r'\[ERROR\]'
text = '[ERROR] Connection failed'
if re.search(pattern, text):
    print('Found error')
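To make the raw-string point concrete, the following sketch compares the raw pattern with its non-raw equivalent and shows what happens when the brackets are left unescaped. The second sample line is taken from the exercise file above; the variable names are illustrative.

```python
import re

raw_pattern = r'\[ERROR\]'        # raw string: backslashes reach the regex engine as typed
plain_pattern = '\\[ERROR\\]'     # normal string: every backslash must be doubled

same = raw_pattern == plain_pattern
print(same)               # True
print(len(raw_pattern))   # 9 characters: \ [ E R R O R \ ]

error_line = '[ERROR] Connection failed'
warn_line = '[WARN] Low memory'

# Escaped: matches only the literal text [ERROR].
escaped_hits = [bool(re.search(raw_pattern, s)) for s in (error_line, warn_line)]
# Unescaped: [ERROR] is a character class meaning "one of E, R, O".
unescaped_hits = [bool(re.search(r'[ERROR]', s)) for s in (error_line, warn_line)]

print(escaped_hits)    # [True, False]
print(unescaped_hits)  # [True, True]  (the R in WARN matches)
```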
Key Takeaways
| Concept | Remember |
|---|---|
| Dot | Matches ANY character. Always escape in IPs, domains, filenames. |
| Brackets | Define character classes. Escape with `\[` and `\]` |
| Dollar | End anchor. Escape with `\$` |
| Question | Optional quantifier. Escape with `\?` |
| Backslash | Escape character. Use `\\` to match a literal backslash |
Common Mistakes
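| Mistake | Wrong | Correct |
|---|---|---|
| Unescaped dot in IP | `192.168.1.1` | `192\.168\.1\.1` |
| Unescaped brackets | `[ERROR]` | `\[ERROR\]` |
| Forgetting shell quoting | `grep -E \$99\.99 /tmp/ex03.txt` | `grep -E '\$99\.99' /tmp/ex03.txt` |
| Windows paths | `C:\Users` | `C:\\Users` |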
Self-Test
Answer these without looking:
- What does . match in regex?
- List 5 metacharacters that need escaping
- How do you match a literal backslash?
- What flag makes matching case-insensitive?
- Why is 192.168.1.1 a BAD pattern for matching IPs?
Answers
- Any single character except newline
- . * + ? ^ $ | ( ) [ ] { } \ (any 5)
- \\
- -i flag in grep, re.IGNORECASE in Python
- The dots match ANY character, so it would also match "192X168Y1Z1"
Next Drill
Drill 02: Character Classes - Learn [abc], [0-9], and PCRE shorthand.