Drill 04: Anchors & Boundaries

Anchors match positions, not characters. Use them to find patterns at line starts, line ends, or word boundaries without consuming any text.

Core Concepts

Anchor	Meaning	Tool Support
`^`	Start of line/string	All tools
`$`	End of line/string	All tools
`\b`	Word boundary	PCRE (grep -P), Python
`\B`	NOT a word boundary	PCRE, Python
`\<`	Start of word	BRE/ERE (grep, sed)
`\>`	End of word	BRE/ERE (grep, sed)
`\A`	Start of string (ignores multiline)	PCRE, Python
`\Z`	End of string (ignores multiline)	PCRE, Python
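The difference between the line anchors and \A / \Z only shows up once multiline matching is switched on. A minimal Python sketch, assuming a made-up three-line string (the variable names are illustrative and not part of the drill):

```python
import re

text = "first\nsecond\nthird"  # hypothetical sample input

# With MULTILINE, ^ matches at the start of every line...
caret_starts = [m.start() for m in re.finditer(r'^\w+', text, re.MULTILINE)]
# ...but \A still matches only at the very start of the string.
abs_starts = [m.start() for m in re.finditer(r'\A\w+', text, re.MULTILINE)]
print(caret_starts)  # [0, 6, 13]
print(abs_starts)    # [0]

# Same idea at the end: $ matches at each line end, \Z only at the string end.
dollar_ends = re.findall(r'\w+$', text, re.MULTILINE)
abs_ends = re.findall(r'\w+\Z', text, re.MULTILINE)
print(dollar_ends)   # ['first', 'second', 'third']
print(abs_ends)      # ['third']
```

In Python, \Z matches only at the very end of the string. PCRE's \Z also matches just before a final newline, so the two engines can disagree when the input ends with a newline.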

Zero-Width Assertion

Key insight: Anchors match POSITIONS, not characters. They have zero width.

echo "hello" | grep -o '^'
# Output: (nothing - the match is the empty position before 'h', and grep -o skips empty matches)

echo "hello" | grep -o '^.'
# Output: h (position before h, then one character)
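The same zero-width behaviour can be inspected more directly in Python, where a match object reports the span it covers. This is a small sketch using the same "hello" input; the variable names are only illustrative:

```python
import re

empty = re.search(r'^', 'hello')
# The match succeeds, but it covers no characters: start == end == 0.
print(empty.span(), repr(empty.group()))      # (0, 0) ''

one_char = re.search(r'^.', 'hello')
# The anchor fixes the position; only the dot consumes a character.
print(one_char.span(), repr(one_char.group()))  # (0, 1) 'h'

# \b is also a position: list every word boundary in a two-word string.
boundaries = [b.start() for b in re.finditer(r'\b', 'hello world')]
print(boundaries)  # [0, 5, 6, 11]
```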

Interactive CLI Drill

bash ~/atelier/_bibliotheca/domus-captures/docs/modules/ROOT/examples/regex-drills/04-anchors.sh

Exercise Set 1: Line Anchors

cat << 'EOF' > /tmp/ex-anchor.txt
ERROR: Connection failed
Warning: Low disk
INFO: Process started
  ERROR: indented error
error: lowercase error
Process completed with ERROR
EOF

Ex 1.1: Lines starting with ERROR

Solution

grep '^ERROR' /tmp/ex-anchor.txt

Output: ERROR: Connection failed (Not the indented or lowercase ones)

Ex 1.2: Lines ending with ERROR

Solution

grep 'ERROR$' /tmp/ex-anchor.txt

Output: Process completed with ERROR

Ex 1.3: Lines with ERROR anywhere (no anchor)

Solution

grep 'ERROR' /tmp/ex-anchor.txt

Output: All lines containing ERROR (3 lines)

Ex 1.4: Empty lines only

Solution

grep '^$' file.txt

^$ = start immediately followed by end = empty line.
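One consequence worth checking: ^$ matches only truly empty lines, not lines that contain just spaces or tabs. A short Python sketch with an invented four-line input illustrates the difference (the \s* variant is an addition for comparison, not part of the exercise):

```python
import re

# Hypothetical input: line 2 is empty, line 3 holds only spaces.
lines = "alpha\n\n   \nbeta\n".splitlines()

# 1-based line numbers of lines matching each pattern.
empty = [n for n, line in enumerate(lines, 1) if re.search(r'^$', line)]
blank = [n for n, line in enumerate(lines, 1) if re.search(r'^\s*$', line)]

print(empty)  # [2]     only the truly empty line
print(blank)  # [2, 3]  empty or whitespace-only lines
```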

Exercise Set 2: Word Boundaries

cat << 'EOF' > /tmp/ex-words.txt
port 80
port 443
port 8080
transport layer
export PATH
import os
reported error
portability issues
sportsman
EOF

Ex 2.1: Match "port" as whole word only

Solution

# PCRE with \b
grep -P '\bport\b' /tmp/ex-words.txt

# BRE/ERE with \< \>
grep '\<port\>' /tmp/ex-words.txt

Output: port 80, port 443, port 8080 (NOT transport, export, import, reported, portability, sportsman)

Ex 2.2: Words starting with "port"

Solution

grep -P '\bport' /tmp/ex-words.txt
# Or: grep '\<port' /tmp/ex-words.txt

Output: port 80, port 443, port 8080, portability

Ex 2.3: Words ending with "port"

Solution

grep -P 'port\b' /tmp/ex-words.txt
# Or: grep 'port\>' /tmp/ex-words.txt

Output: port 80, port 443, port 8080, transport, export, import

Ex 2.4: "port" NOT as whole word

Solution

# Using \B (NOT word boundary)
grep -P '\Bport\B' /tmp/ex-words.txt

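Output: Lines where "port" has word characters on both sides (reported, sportsman). transport, export, import, and portability do not match, because "port" touches a word boundary on one side.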
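The four boundary patterns in this exercise set can be cross-checked in Python, whose \b and \B behave like the PCRE versions for this ASCII data. The sketch below copies the lines of /tmp/ex-words.txt into a list; the helper name matching is hypothetical:

```python
import re

lines = [
    "port 80", "port 443", "port 8080", "transport layer", "export PATH",
    "import os", "reported error", "portability issues", "sportsman",
]

def matching(pattern):
    """Return the lines that contain a match for pattern (like grep -P)."""
    return [line for line in lines if re.search(pattern, line)]

whole = matching(r'\bport\b')   # boundary on both sides
starts = matching(r'\bport')    # boundary before only
ends = matching(r'port\b')      # boundary after only
inside = matching(r'\Bport\B')  # no boundary on either side

print(whole)   # ['port 80', 'port 443', 'port 8080']
print(starts)  # the three port lines plus 'portability issues'
print(ends)    # the three port lines plus 'transport layer', 'export PATH', 'import os'
print(inside)  # ['reported error', 'sportsman']
```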

Exercise Set 3: Combined Patterns

cat << 'EOF' > /tmp/ex-combined.txt
192.168.1.100
host: server-01
HOST: server-02
Server: 10.50.1.50
# This is a comment
  # Indented comment
key=value
key = value
EOF

Ex 3.1: Lines starting with IP address

Solution

# Four dot-separated groups of 1-3 digits (does not check the 0-255 range)
grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}' /tmp/ex-combined.txt

Output: 192.168.1.100

Ex 3.2: Comment lines (# at start, with optional spaces)

Solution

grep -E '^ *#' /tmp/ex-combined.txt

^ *# = start, zero or more spaces, then #

Ex 3.3: Key-value pairs (key at line start)

Solution

grep -E '^[a-z]+\s*=' /tmp/ex-combined.txt

Output: key=value, key = value

Ex 3.4: Lines NOT starting with #

Solution

grep -v '^#' /tmp/ex-combined.txt
# Similar, but also skips empty lines (needs one non-# character): grep -E '^[^#]' /tmp/ex-combined.txt
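Inverting a match and negating a character class are close but not identical: ^[^#] needs one character to consume, so it cannot match an empty line. A small Python sketch with an invented four-line sample shows where the two diverge:

```python
import re

# Hypothetical sample: a comment, an empty line, an indented comment, a setting.
lines = ["# comment", "", "  # indented", "key=value"]

# Like grep -v '^#': keep every line that does not start with #.
inverted = [line for line in lines if not re.search(r'^#', line)]
# Like grep '^[^#]': keep lines whose first character exists and is not #.
negated = [line for line in lines if re.search(r'^[^#]', line)]

print(inverted)  # ['', '  # indented', 'key=value']
print(negated)   # ['  # indented', 'key=value']
```

Neither form drops the indented comment; that needs the ^ *# pattern from Ex 3.2.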

Exercise Set 4: Multiline Context

Ex 4.1: Whole line match

Solution

# Match exact line "key=value"
grep -x 'key=value' /tmp/ex-combined.txt
# -x is equivalent to: grep '^key=value$' /tmp/ex-combined.txt
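Python has a comparable shortcut: re.fullmatch behaves like wrapping the pattern in anchors for the whole string. A minimal sketch, assuming each line is tested on its own (the third sample line is invented to show a partial match):

```python
import re

lines = ["key=value", "key = value", "mykey=value2"]  # last entry is hypothetical

# fullmatch: the pattern must cover the entire line, like grep -x.
exact = [line for line in lines if re.fullmatch(r'key=value', line)]
# search: the pattern may match anywhere in the line, like plain grep.
partial = [line for line in lines if re.search(r'key=value', line)]

print(exact)    # ['key=value']
print(partial)  # ['key=value', 'mykey=value2']
```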

Ex 4.2: Python multiline mode

Solution

import re

text = """Line one
Line two
Line three"""

# Without MULTILINE: ^ and $ match string start/end
pattern = re.compile(r'^Line')
print(pattern.findall(text))  # ['Line'] - only first

# With MULTILINE: ^ and $ match line start/end
pattern = re.compile(r'^Line', re.MULTILINE)
print(pattern.findall(text))  # ['Line', 'Line', 'Line']

Real-World Applications

Professional: Find Config Directives

# Apache/Nginx directives at line start
grep -E '^(Listen|ServerName|root|server_name)' /etc/nginx/nginx.conf

# SSH config options
grep -E '^\s*(PermitRootLogin|PasswordAuthentication)' /etc/ssh/sshd_config

Professional: Log Analysis

# Lines starting with timestamp
grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}' /var/log/app.log

# Error lines at start of entry
grep -E '^\[ERROR\]' /var/log/app.log

# Lines ending with status codes
grep -E '(200|404|500)$' /var/log/nginx/access.log
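When the log is already in memory as one string, the same anchored patterns work in Python with re.MULTILINE. The sketch below uses three invented log lines, not a real log format:

```python
import re

# Hypothetical log excerpt; the second line is an indented continuation.
log = (
    "2024-01-15 10:00:01 [ERROR] disk full\n"
    "  retry scheduled for 2024-01-16\n"
    "2024-01-15 10:00:05 GET /index 200"
)

# ^ with MULTILINE: only dates that sit at the start of a line are found.
dated = re.findall(r'^\d{4}-\d{2}-\d{2}', log, re.MULTILINE)
# $ with MULTILINE: only status codes that sit at the end of a line are found.
status = re.findall(r'(?:200|404|500)$', log, re.MULTILINE)

print(dated)   # ['2024-01-15', '2024-01-15']
print(status)  # ['200']
```

The date inside the continuation line is skipped because it is not at a line start.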

Professional: ISE Patterns

# Lines starting with MAC address
grep -Ei '^[0-9a-f]{2}:' /var/log/ise-psc.log

# Match "Passed" as word (not PassedAuthentication)
grep -P '\bPassed\b' /var/log/ise-psc.log

Personal: Note Searching

# Find TODO items at line start
grep -ri '^TODO:' ~/notes/

# Find headings (markdown)
grep -E '^#{1,3} ' ~/notes/*.md

# Find AsciiDoc section titles
grep -E '^=+ ' ~/docs/*.adoc

Personal: List Items

# Bullet points
grep -E '^[*-] ' ~/notes/*.md

# Numbered items
grep -E '^[0-9]+\. ' ~/notes/*.md

# Unchecked checkbox items (e.g. "- [ ] task")
grep -E '^\s*[*-] \[ \]' ~/notes/*.md

Personal: Journal Entries

# Date headers
grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}' ~/journal/*.md

# Time entries
grep -E '^[0-9]{2}:[0-9]{2}' ~/journal/*.md

Tool Variants

grep: Anchor Usage

# Case-insensitive whole word
grep -wi 'error' file.txt  # -w is like \b...\b

# Invert match: lines NOT starting with #
grep -v '^#' config.txt

# Count lines starting with pattern
grep -c '^ERROR' file.txt

sed: Anchored Substitution

# Remove leading whitespace
sed 's/^[[:space:]]*//' file.txt

# Remove trailing whitespace
sed 's/[[:space:]]*$//' file.txt

# Add prefix to each line
sed 's/^/PREFIX: /' file.txt

# Add suffix to each line
sed 's/$/ # comment/' file.txt

# Comment out lines starting with keyword
sed 's/^DEBUG/# DEBUG/' file.txt
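For comparison, the sed substitutions above map onto re.sub with re.MULTILINE when the whole file is held in one string. This is a sketch on an invented two-line input, not a drop-in replacement for sed:

```python
import re

text = "  alpha  \nDEBUG beta"  # hypothetical two-line input

# [ \t] is used instead of \s so the pattern never swallows a newline.
no_leading = re.sub(r'^[ \t]+', '', text, flags=re.MULTILINE)
no_trailing = re.sub(r'[ \t]+$', '', text, flags=re.MULTILINE)
# Replacing the zero-width ^ inserts text at each line start.
prefixed = re.sub(r'^', 'PREFIX: ', text, flags=re.MULTILINE)
commented = re.sub(r'^DEBUG', '# DEBUG', text, flags=re.MULTILINE)

print(repr(no_leading))   # 'alpha  \nDEBUG beta'
print(repr(no_trailing))  # '  alpha\nDEBUG beta'
print(repr(prefixed))     # 'PREFIX:   alpha  \nPREFIX: DEBUG beta'
print(repr(commented))    # '  alpha  \n# DEBUG beta'
```

sed works line by line, so it never sees the newline. In Python the newline is part of the string, which is why the character class is restricted to spaces and tabs here.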

awk: Line Position

# Lines starting with pattern
awk '/^ERROR/' file.txt

# Lines ending with pattern
awk '/failed$/' file.txt

# Word boundaries (GNU awk)
awk '/\<port\>/' file.txt

# Anchor with field check
awk '$1 ~ /^[0-9]/' file.txt  # Field 1 starts with digit

vim: Anchor Patterns

" Find lines starting with ERROR
/^ERROR

" Find lines ending with semicolon
/;$

" Find word "port" (not transport)
/\<port\>

" Delete empty lines
:g/^$/d

" Delete lines starting with #
:g/^#/d

" Add text at line end
:%s/$/ # end/

Python: Anchors and Flags

import re

text = """ERROR: First line
INFO: Second line
ERROR: Third line"""

# Default: ^ matches string start only
pattern = re.compile(r'^ERROR')
matches = pattern.findall(text)
print(matches)  # ['ERROR'] - only first

# MULTILINE: ^ matches each line start
pattern = re.compile(r'^ERROR', re.MULTILINE)
matches = pattern.findall(text)
print(matches)  # ['ERROR', 'ERROR']

# Word boundaries
text = "port export transport"
pattern = re.compile(r'\bport\b')
matches = pattern.findall(text)
print(matches)  # ['port'] - only whole word

Gotchas

^ Inside Character Class

# ^ at START of class = negation
echo "abc123" | grep -oE '[^0-9]+'  # -E so that + is a quantifier, not a literal plus
# Output: abc (NOT digits)

# ^ elsewhere = literal caret
echo "a^b" | grep -o '[a^b]'
# Output: a, ^, b (matches caret literally)
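The three roles of the caret can be put side by side in Python, where + is always a quantifier. A brief sketch with illustrative variable names:

```python
import re

# ^ first inside the class: negation (anything except digits).
negated = re.findall(r'[^0-9]+', 'abc123')
# ^ elsewhere inside the class: a literal caret.
literal = re.findall(r'[a^b]', 'a^b')
# ^ outside the class: an anchor, here combined with a negated class.
anchored = re.findall(r'^[^0-9]+', 'abc123def')

print(negated)   # ['abc']
print(literal)   # ['a', '^', 'b']
print(anchored)  # ['abc']  ('def' is not at the start)
```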

$ in Shell Strings

# RISKY: inside double quotes the shell expands $name, $(...), and similar
grep "pattern$" file.txt  # a trailing $ survives, but "$HOME" or "a$b" would be expanded

# CORRECT: Use single quotes
grep 'pattern$' file.txt

# Or escape it
grep "pattern\$" file.txt

Word Boundary Definition

# Word = [A-Za-z0-9_] sequence
# Boundary = transition between word and non-word

echo "user_name" | grep -Po '\buser\b'
# No match! Underscore is a word character

echo "user-name" | grep -Po '\buser\b'
# Match! Hyphen is NOT a word character
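Python's \b follows the same definition, so the underscore case can be reproduced there. When underscores should count as separators, one possible workaround (an assumption of this sketch, not something the drill prescribes) is to replace \b with lookarounds over letters and digits only:

```python
import re

# Underscore is a word character, so there is no boundary after "user".
with_underscore = re.search(r'\buser\b', 'user_name')
# Hyphen is not a word character, so the boundary exists.
with_hyphen = re.search(r'\buser\b', 'user-name')

print(with_underscore)      # None
print(with_hyphen.group())  # user

# Hypothetical alternative: treat only letters and digits as word characters.
custom = re.search(r'(?<![A-Za-z0-9])user(?![A-Za-z0-9])', 'user_name')
print(custom.group())       # user
```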

BRE vs PCRE Word Boundaries

# BRE/ERE: Use \< and \>
grep '\<word\>' file.txt

# PCRE: Use \b
grep -P '\bword\b' file.txt

# Both work in grep; \b carries over to Python and JavaScript, while \< and \> do not
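A pattern copied from grep into Python can fail silently here: Python's re module treats \< and \> as escaped literal angle brackets, not as word anchors. A short sketch of the effect (variable names are illustrative):

```python
import re

# In Python, \< and \> mean the literal characters < and >.
bre_style = re.search(r'\<port\>', 'port 80')
literal_hit = re.search(r'\<port\>', 'tag <port> here')
# \b is the form that works as a word boundary in Python.
pcre_style = re.search(r'\bport\b', 'port 80')

print(bre_style)            # None: no angle brackets in the input
print(literal_hit.group())  # <port>
print(pcre_style.group())   # port
```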

Key Takeaways

Anchor	Use Case
`^`	Match at line/string start
`$`	Match at line/string end
`^…$`	Match entire line exactly
`\b`	Word boundary (PCRE)
`\<` `\>`	Word boundaries (BRE/ERE)
`\B`	NOT a word boundary
`-w` (grep)	Shortcut for `\b…\b`
`re.MULTILINE`	Make ^ and $ match line boundaries

Self-Test

What does ^ERROR$ match?
What’s the difference between ^[^#] and ^#?
How do you match "port" but not "export" or "transport"?
What grep flag is equivalent to \b…\b?
In Python, what flag makes ^ match line starts?

Answers

A line containing ONLY "ERROR" (nothing else)
^[^#] = lines whose first character is anything other than # (empty lines do not match); ^# = lines starting with #
\bport\b (PCRE) or \<port\> (BRE/ERE) or grep -w port
-w (word match)
re.MULTILINE or re.M

Next Drill

Drill 05: Groups & Backreferences - Master (), \1, (?:), and named captures.