Regex Gotchas: The Traps Everyone Falls Into

These are the mistakes EVERYONE makes. Learning to spot them instantly separates beginners from intermediate users.

You just hit one of these (escaping brackets instead of dots). This page drills these traps until they’re automatic.

Setup

cat << 'EOF' > /tmp/gotchas.txt
192.168.1.100
192X168Y1Z100
10.50.1.20
10X50X1X20
.hidden-file
file.txt
C:\Users\Admin
/home/evan
$99.99
$$VARIABLE$$
Price: $100
[ERROR] failed
[WARN] warning
{braces}
|pipe|
^caret
end$
question?
plus+
star*
(parens)
the the duplicate
word word here
color colour
gray grey
EOF

TRAP 1: Unescaped Dot

The . matches ANY character, not just a literal period.

The Bug

# WRONG - matches 192X168Y1Z100 too!
grep '192.168.1.100' /tmp/gotchas.txt

Try it:

Output
192.168.1.100
192X168Y1Z100

Both match because . matches X, Y, Z too!

The Fix

# CORRECT - escape the dots
grep '192\.168\.1\.100' /tmp/gotchas.txt
Output
192.168.1.100

Only the actual IP matches now.
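
To see the difference without touching the setup file, here is a small sketch that counts matches on two inline sample lines. The -F option (fixed strings) is not covered above; it is shown as one more way to get literal matching, and the variable names are only for illustration.

```shell
# Sample input: one real IP and one look-alike (same lines as in the setup file)
input='192.168.1.100
192X168Y1Z100'

# Unescaped dots: both lines match
unescaped=$(printf '%s\n' "$input" | grep -c '192.168.1.100')

# Escaped dots: only the real IP matches
escaped=$(printf '%s\n' "$input" | grep -c '192\.168\.1\.100')

# -F treats the whole pattern as a literal string, so no escaping is needed
fixed=$(printf '%s\n' "$input" | grep -cF '192.168.1.100')

echo "unescaped=$unescaped escaped=$escaped fixed=$fixed"
```

If the pattern contains no regex features at all, -F is often the simplest fix.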

Practice

Match 10.50.1.20 exactly (not 10X50X1X20):

Answer
grep '10\.50\.1\.20' /tmp/gotchas.txt

TRAP 2: Wrong Escape Target

You escaped [ when you meant to escape .

The Bug

# WRONG - escaping the bracket, not the dot
grep '[0-9]*.\[0-9]*' /tmp/gotchas.txt

The \[ escapes the bracket. The . is STILL matching any character.

The Fix

# CORRECT - escape the DOT
grep '[0-9]*\.[0-9]*' /tmp/gotchas.txt

Rule

When matching IP addresses:

- Escape: \. (the period)
- Don’t escape: [0-9] (the character class brackets)


TRAP 3: * Matches Zero (Empty String)

* means "zero or more" - it happily matches nothing.

The Bug

# WRONG - [0-9]* can match empty string
grep -E '^[0-9]*$' /tmp/gotchas.txt

This matches empty lines too because * allows zero digits.

The Fix

# CORRECT - use + for "one or more"
grep -E '^[0-9]+$' /tmp/gotchas.txt
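
The setup file has no empty lines, so the difference is easier to see on inline input. A minimal sketch, with a hypothetical lines helper that emits three lines (digits, an empty line, letters):

```shell
# Three input lines: digits, an empty line, letters
lines() { printf '123\n\nabc\n'; }

star_count=$(lines | grep -cE '^[0-9]*$')   # counts "123" and the empty line
plus_count=$(lines | grep -cE '^[0-9]+$')   # counts only "123"

echo "star=$star_count plus=$plus_count"
```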

Comparison

Quantifier   Meaning        Matches ""?
*            Zero or more   YES
+            One or more    NO
?            Zero or one    YES
{1,3}        One to three   NO


TRAP 4: Greedy Matching

Default quantifiers grab as MUCH as possible.

The Bug

# WRONG - greedy .* grabs everything
echo '<div>one</div><div>two</div>' | grep -oP '<div>.*</div>'
Output
<div>one</div><div>two</div>

One match containing BOTH divs!

The Fix

# CORRECT - lazy .*? matches minimum
echo '<div>one</div><div>two</div>' | grep -oP '<div>.*?</div>'
Output
<div>one</div>
<div>two</div>

Two separate matches.

Alternative Fix (No PCRE)

# Use negated character class instead
echo '<div>one</div><div>two</div>' | grep -oE '<div>[^<]*</div>'

[^<]* = any character EXCEPT <, so it stops at </div>
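
Counting the matches makes the greedy and negated-class behavior easy to compare. This sketch sticks to -E so it does not depend on PCRE support; the variable names are illustrative.

```shell
html='<div>one</div><div>two</div>'

# Greedy: .* runs to the last </div>, giving one big match
greedy=$(printf '%s\n' "$html" | grep -oE '<div>.*</div>' | wc -l | tr -d ' ')

# Negated class: [^<]* cannot cross a "<", giving one match per div
negated=$(printf '%s\n' "$html" | grep -oE '<div>[^<]*</div>' | wc -l | tr -d ' ')

echo "greedy=$greedy negated=$negated"
```

One caveat with the negated class: it will not match a div whose content contains another tag, because that content includes a <.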


TRAP 5: $ Means End of Line

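In regex, $ is the end-of-line anchor, not a literal dollar sign.

The Bug

# WRONG - $ is an anchor, so this cannot match the literal $99
grep -E '$99' /tmp/gotchas.txt

With -E, $ always means end of line, and nothing can follow the end of a line, so this matches nothing. Plain grep (BRE) treats a $ that is not at the end of the pattern as a literal, so the same pattern may appear to work there; do not rely on that.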

The Fix

# CORRECT - escape the dollar sign
grep '\$99' /tmp/gotchas.txt
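
How an unescaped $ behaves away from the end of a pattern depends on the flavor. A small check, assuming GNU grep, with a hypothetical prices helper supplying two sample lines:

```shell
prices() { printf '%s\n' '$99.99' 'room 99'; }

# "|| true" because grep exits non-zero when nothing matches
bre=$(prices | grep -c '$99' || true)      # BRE: $ mid-pattern is literal
ere=$(prices | grep -cE '$99' || true)     # ERE: $ is always an anchor
safe=$(prices | grep -cE '\$99' || true)   # escaped: literal in every flavor

echo "bre=$bre ere=$ere safe=$safe"
```

Escaping the dollar sign is the only form that means the same thing everywhere.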

Practice

Match $100 literally:

Answer
grep '\$100' /tmp/gotchas.txt

TRAP 6: ^ Inside vs Outside []

^ has two meanings depending on context.

Outside []: Start of Line

grep '^Price' /tmp/gotchas.txt  # Lines starting with "Price"

Inside [] at Start: Negation

grep '[^0-9]' /tmp/gotchas.txt  # Lines containing at least one non-digit character

Inside [] NOT at Start: Literal

grep '[a-z^]' /tmp/gotchas.txt  # Lowercase OR literal ^
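
The three meanings side by side, plus the escaped form \^ for a literal caret outside brackets. This is a sketch on three inline lines; rows is a made-up helper.

```shell
rows() { printf '%s\n' '^caret' 'Price: $100' '12345'; }

anchored=$(rows | grep -c '^Price')   # ^ outside []: start of line
negated=$(rows | grep -c '[^0-9]')    # ^ first inside []: lines with a non-digit
literal=$(rows | grep -c '[%^]')      # ^ later inside []: a literal % or ^
escaped=$(rows | grep -c '\^caret')   # \^ outside []: a literal ^

echo "anchored=$anchored negated=$negated literal=$literal escaped=$escaped"
```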

TRAP 7: BRE vs ERE Escaping

Some characters need escaping in BRE but not ERE.

Character   BRE (grep)          ERE (grep -E)
+           \+ (escaped)        + (plain)
?           \? (escaped)        ? (plain)
{n,m}       \{n,m\} (escaped)   {n,m} (plain)
()          \(\) (escaped)      () (plain)
|           \| (escaped)        | (plain)

In BRE, \+, \? and \| are GNU extensions; POSIX BRE only defines \{n,m\} and \(\).

The Bug

# WRONG - + needs escaping in BRE
grep '[0-9]+' /tmp/gotchas.txt

This looks for a digit followed by a literal + character, not "one or more digits".

The Fixes

# Option 1: BRE with escaped +
grep '[0-9]\+' /tmp/gotchas.txt

# Option 2: ERE (preferred)
grep -E '[0-9]+' /tmp/gotchas.txt

Prefer -E for everyday regex work. BRE escaping is easy to get wrong.
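
Here is a rough comparison of the same "one or more digits" idea written four ways. The \{1,\} interval is the POSIX BRE spelling and is included as a portable alternative; nums is a hypothetical helper providing three sample lines.

```shell
nums() { printf '%s\n' '42' '1+1' 'none'; }

bre_plain=$(nums | grep -c '[0-9]+')        # digit then literal +: only "1+1"
bre_gnu=$(nums | grep -c '[0-9]\+')         # GNU extension: one or more digits
bre_posix=$(nums | grep -c '[0-9]\{1,\}')   # POSIX BRE interval, same meaning
ere=$(nums | grep -cE '[0-9]+')             # ERE: + works unescaped

echo "bre_plain=$bre_plain bre_gnu=$bre_gnu bre_posix=$bre_posix ere=$ere"
```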

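TRAP 8: \b Is Not Portable

Word boundary \b is not part of POSIX BRE or ERE. GNU grep accepts it as an extension; other grep implementations may not.

The Bug

# RISKY - \b in ERE is a GNU extension
grep -E '\bword\b' /tmp/gotchas.txt

On GNU grep this matches "word" as a whole word. Elsewhere the behavior is undefined by POSIX, so it may silently fail to match.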

The Fix

# Use PCRE, where \b is part of the syntax (GNU grep only)
grep -P '\bword\b' /tmp/gotchas.txt

# Or use the widely supported -w flag
grep -w 'word' /tmp/gotchas.txt
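
To see what whole-word matching changes, compare a plain substring search with -w on three made-up lines. Only -w is exercised here, since it does not depend on regex flavor; words is a hypothetical helper.

```shell
words() { printf '%s\n' 'word word here' 'password' 'swordfish'; }

substring=$(words | grep -c 'word')   # all three lines contain "word"
whole=$(words | grep -cw 'word')      # only the first has it as a whole word

echo "substring=$substring whole=$whole"
```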

TRAP 9: Shell Expansion

Single quotes protect the regex from the shell.

The Bug

# WRONG - inside double quotes the shell expands $1
grep "$100" /tmp/gotchas.txt

The shell reads $100 as the positional parameter $1 followed by 00. With $1 unset, grep receives the pattern 00.

The Fix

# CORRECT - single quotes protect from shell
grep '\$100' /tmp/gotchas.txt

Rule

Always use single quotes for regex unless you need shell variable expansion.
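
You can watch the shell rewrite the pattern before grep ever sees it. This sketch sets $1 to a visible placeholder value (ARG, chosen only for the demo) so the expansion is obvious.

```shell
set -- 'ARG'          # give $1 a visible value for the demo

double="$100"         # the shell expands $1, leaving ARG00
single='\$100'        # reaches grep untouched

echo "double-quoted pattern: $double"
echo "single-quoted pattern: $single"

# "|| true" because grep exits non-zero when nothing matches
hits_double=$(printf '%s\n' 'Price: $100' | grep -c "$double" || true)
hits_single=$(printf '%s\n' 'Price: $100' | grep -c "$single" || true)

echo "double=$hits_double single=$hits_single"
```

Echoing the pattern first is a cheap habit whenever a quoted regex behaves strangely.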


TRAP 10: Forgetting -o for Extraction

Without -o, you get the whole line, not just the match.

The Bug

# Returns whole lines containing IPs
grep -E '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' /tmp/gotchas.txt

The Fix

# Returns ONLY the matched IP
grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' /tmp/gotchas.txt
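
A short sketch of the difference on a single made-up log line containing two addresses. The log text and variable names are invented for illustration.

```shell
log='login from 10.50.1.20 failed, retry from 192.168.1.100'
ip_re='[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+'

whole=$(printf '%s\n' "$log" | grep -E "$ip_re")    # the entire line
only=$(printf '%s\n' "$log" | grep -oE "$ip_re")    # each match on its own line

echo "without -o: $whole"
echo "with -o:"
echo "$only"
```

With -o, a line holding several matches produces several output lines, which is handy when piping into sort or uniq.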

Quick Diagnostic

When your regex doesn’t work, check these IN ORDER:

  1. Unescaped metacharacters? (. * + ? $ ^ [ ] ( ) { } | \)

  2. Wrong flavor? (BRE vs ERE vs PCRE)

  3. Greedy trap? (need .*? instead of .*)

  4. Shell expansion? (use single quotes)

  5. Need -o? (for extraction)

  6. Need -P? (for \d, lazy quantifiers, lookaround)
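
For the "wrong flavor" check, one option is to run the same pattern under each flavor and compare the counts. try_flavors is a hypothetical helper written for this page, not a standard tool, and the PCRE line assumes a grep built with -P support.

```shell
# try_flavors PATTERN TEXT
# Hypothetical helper: prints how many lines of TEXT match under each flavor.
try_flavors() {
  pattern=$1
  text=$2
  # "|| true" because grep exits non-zero when nothing matches
  bre=$(printf '%s\n' "$text" | grep -c -e "$pattern" 2>/dev/null || true)
  ere=$(printf '%s\n' "$text" | grep -cE -e "$pattern" 2>/dev/null || true)
  pcre=$(printf '%s\n' "$text" | grep -cP -e "$pattern" 2>/dev/null || true)
  echo "BRE: ${bre:-error}"
  echo "ERE: ${ere:-error}"
  echo "PCRE: ${pcre:-unavailable or error}"
}

try_flavors '[0-9]+' 'order 42'
```

If the counts disagree, the pattern is using syntax that means different things in different flavors.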


Drill: Find the Bug

For each broken command, identify the problem:

Bug 1

grep '192.168' file    # Why does this match "192X168"?
Answer

Unescaped dot. Fix: grep '192\.168' file

Bug 2

grep '[0-9]+' file     # Why doesn't this match digits?
Answer

BRE mode: + is literal. Fix: grep -E '[0-9]+' file or grep '[0-9]\+' file

Bug 3

grep '\berror\b' file  # Why might this fail on a non-GNU grep?
Answer

\b is a GNU extension in BRE and ERE, not POSIX. Fix: grep -w 'error' file, or grep -P '\berror\b' file where PCRE is available

Bug 4

grep "Price: $99" file  # Why doesn't this match?
Answer

Double quotes allow shell expansion. $99 becomes $9 followed by a literal 9, so with $9 unset grep sees "Price: 9". Fix: grep 'Price: \$99' file

Bug 5

grep -o '<.*>' file    # Why does this return too much?
Answer

Greedy matching. .* grabs everything up to the last >. Fix: grep -oP '<.*?>' file or grep -o '<[^>]*>' file


Your IP Mistake Dissected

What you wrote:

grep '[0-9]*.\[0-9]*.\[0-9]*.\[0-9]*' /tmp/fundamentals.txt

Problems:

Issue           What You Wrote   What It Does
Unescaped dot   .                Matches ANY character
Wrong escape    \[               Escapes the bracket, so it matches a literal [
* allows zero   [0-9]*           Matches zero or more (empty OK)

Fix progression:

# Level 1: Escape the dots
grep '[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*' /tmp/fundamentals.txt

# Level 2: Use + instead of *
grep '[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+' /tmp/fundamentals.txt

# Level 3: ERE syntax
grep -E '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' /tmp/fundamentals.txt

# Level 4: Limit octet length
grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' /tmp/fundamentals.txt

# Level 5: PCRE with \d shortcut
grep -oP '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}' /tmp/fundamentals.txt
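
Levels 4 and 5 still accept values like 999.999.999.999. If you ever need to reject those, one possible further step (not part of the progression above) is to spell out the 0-255 range per octet. This is a sketch; the octet and strict_ip variables and the sample addresses are assumptions for illustration.

```shell
# One octet: 250-255, 200-249, 100-199, or 0-99
octet='(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])'

# Four octets separated by literal dots, anchored to the whole line
strict_ip="^${octet}\.${octet}\.${octet}\.${octet}\$"

candidates() { printf '%s\n' '192.168.1.100' '999.1.1.1' '10.50.1.20' '256.0.0.1'; }

valid=$(candidates | grep -cE "$strict_ip")
echo "valid addresses: $valid"
```

The pattern is in double quotes here only because it is built from a variable, which is the exception the single-quote rule allows for.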

You’re on day 2. This mistake is NORMAL. The fact that you caught it means you’re learning.