Text Processing

Body of Knowledge

Topic | Description | Relevance | Career Tracks

grep Fundamentals

Pattern matching, basic vs extended regex (-E), fixed strings (-F), recursive (-r), context lines (-A/-B/-C).

Critical

All Engineering
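
The flags listed for grep Fundamentals can be tried on a throwaway file. This is a minimal sketch, assuming GNU grep; the log contents and the `workdir` scratch directory are made up for illustration.

```shell
# Scratch directory with a small sample log (hypothetical data).
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
INFO  start
WARN  disk 91%
ERROR a.b failed
INFO  axb ok
ERROR timeout
EOF

# Basic regex: the dot matches any character, so "a.b" and "axb" both match.
bre_count=$(grep -c 'a.b' "$workdir/app.log")

# Fixed string (-F): the dot is literal, so only "a.b" matches.
fixed_count=$(grep -cF 'a.b' "$workdir/app.log")

# Extended regex (-E): alternation and grouping without backslashes.
ere_count=$(grep -cE '^(WARN|ERROR)' "$workdir/app.log")

# Recursive search (-r), printing only the names of matching files (-l).
hits=$(grep -rl 'timeout' "$workdir")

# Context: one line before (-B1) each match; -A and -C work the same way.
context=$(grep -B1 'timeout' "$workdir/app.log")

echo "$bre_count $fixed_count $ere_count"
```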

grep PCRE

Perl-compatible regex (-P), lookaheads/lookbehinds, non-greedy quantifiers, named groups, advanced patterns.

High

SRE, DevOps, Data Engineer
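
A short sketch of the PCRE features named above, assuming a GNU grep build with `-P` support (BSD and BusyBox grep generally lack it). The sample line is invented for the demo.

```shell
# Hypothetical sample line.
line='user=alice id=42 tag=<b>bold</b><b>more</b>'

# Lookbehind: match digits only when they are preceded by "id=".
# -o prints just the matched text, so the "id=" prefix is not included.
id=$(printf '%s\n' "$line" | grep -oP '(?<=id=)\d+')

# Lookahead: match word characters only when " id=" follows them.
user=$(printf '%s\n' "$line" | grep -oP '\w+(?= id=)')

# Non-greedy quantifier: .*? stops at the first closing tag,
# where a greedy .* would run on to the last one.
first_tag=$(printf '%s\n' "$line" | grep -oP '<b>.*?</b>' | head -n 1)
greedy_tag=$(printf '%s\n' "$line" | grep -oP '<b>.*</b>')

echo "$id $user $first_tag"
```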

ripgrep (rg)

Modern grep replacement, automatic gitignore respect, parallel search, type filtering, PCRE2 support.

High

Developer, DevOps, SRE
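
The ripgrep features above might look like this in practice. This is a sketch under assumptions: `rg` must be installed (the block skips itself otherwise), the PCRE2 line only works on builds compiled with PCRE2, and the two sample files are hypothetical.

```shell
# Skip quietly when ripgrep is not installed.
if command -v rg >/dev/null 2>&1; then
  workdir=$(mktemp -d)
  printf 'import os  # TODO: remove\n' > "$workdir/main.py"
  printf 'TODO: write docs\n' > "$workdir/notes.md"

  # Type filtering: search only files ripgrep classifies as Python.
  py_hits=$(rg --files-with-matches --type py 'TODO' "$workdir")

  # PCRE2 (-P) enables lookarounds, which the default regex engine rejects.
  pcre_hit=$(rg --only-matching --no-filename -P '(?<=TODO: )\w+' "$workdir/notes.md") \
    || pcre_hit='(no match, or this build lacks PCRE2)'

  # --no-ignore switches off .gitignore/.ignore filtering for a full search.
  all_hits=$(rg --no-ignore --files-with-matches 'TODO' "$workdir" | sort)

  echo "$py_hits"
fi
```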

awk Fundamentals

Field-based processing, $0/$1/$NF, FS/OFS, pattern-action pairs, print vs printf, numeric operations.

High

SRE, DevOps, Data Engineer
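
The awk building blocks above fit into a few one-liners. A minimal sketch using only POSIX awk features; the colon-separated records (name:department:salary) are invented sample data.

```shell
# Hypothetical colon-separated records: name:department:salary
data='alice:eng:120
bob:ops:95
carol:eng:130'

# -F sets the input field separator (FS); $1 is the first field, $NF the last.
names_and_last=$(printf '%s\n' "$data" | awk -F: '{ print $1, $NF }')

# Pattern-action pair: the action runs only on lines where the pattern is true.
eng_only=$(printf '%s\n' "$data" | awk -F: '$2 == "eng" { print $1 }')

# OFS is the separator print puts between comma-separated arguments.
tsv=$(printf '%s\n' "$data" | awk 'BEGIN { FS = ":"; OFS = "\t" } { print $1, $3 }')

# printf gives explicit formatting; fields are usable directly in arithmetic.
raise=$(printf '%s\n' "$data" | awk -F: '{ printf "%-6s %.1f\n", $1, $3 * 1.1 }')

echo "$names_and_last"
```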

awk Advanced

BEGIN/END blocks, associative arrays, multi-file processing (FNR/NR), state machines, custom functions.

High

SRE, Data Engineer
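
Several of the advanced features combine naturally in a two-file lookup. The sketch below assumes two hypothetical input files (`users.txt` as a lookup table, `access.txt` as a request log) and a made-up helper function `fmt`.

```shell
workdir=$(mktemp -d)
printf '1 alice\n2 bob\n' > "$workdir/users.txt"
printf '1 GET\n2 POST\n1 GET\n1 PUT\n' > "$workdir/access.txt"

report=$(awk '
  # Custom function: format one report line.
  function fmt(label, n) { return sprintf("%s=%d", label, n) }

  BEGIN { total = 0 }

  # FNR resets per file while NR keeps counting, so FNR == NR is true
  # only for the first file: use it to build the lookup table.
  FNR == NR { name[$1] = $2; next }

  # Second file: count requests per user id in an associative array.
  { hits[$1]++; total++ }

  END {
    # Walk known ids in a fixed order; "for (k in arr)" order is unspecified.
    for (id = 1; id <= 2; id++) print fmt(name[id], hits[id])
    print "total=" total
  }
' "$workdir/users.txt" "$workdir/access.txt")

echo "$report"
```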

sed Fundamentals

Stream editing, substitution (s///), addresses, in-place editing (-i), delete/print commands.

High

SRE, DevOps, Automation
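
A minimal sketch of the basic sed commands on a scratch file. The config contents are hypothetical; `-i.bak` (in-place edit with a backup suffix attached to the flag) is used because that spelling is accepted by both GNU and BSD sed.

```shell
workdir=$(mktemp -d)
cat > "$workdir/app.conf" <<'EOF'
# sample config
host=localhost
port=8080
debug=true
EOF

# Substitution: replace the first match on each line; result goes to stdout.
preview=$(sed 's/localhost/example.internal/' "$workdir/app.conf")

# Regex address + delete command: drop comment lines.
no_comments=$(sed '/^#/d' "$workdir/app.conf")

# -n suppresses default output; p prints only the addressed lines (2 through 3).
middle=$(sed -n '2,3p' "$workdir/app.conf")

# In-place edit; the original is preserved as app.conf.bak.
sed -i.bak 's/^port=.*/port=9090/' "$workdir/app.conf"

echo "$middle"
```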

sed Advanced

Hold space/pattern space, multi-line editing, address ranges, branching, transliteration (y///).

Medium

SRE, Advanced Automation
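
The advanced sed features are easier to follow on a three-line input. These are well-known idioms shown as a sketch, not the only way to write them; the label loop is split across `-e` options so it does not depend on GNU-specific label parsing.

```shell
text='one
two
three'

# Hold space: on every line except the first, G appends the hold space to the
# pattern space; h then saves the result. On the last line ($) the pattern
# space holds all lines in reverse order and p prints it.
reversed=$(printf '%s\n' "$text" | sed -n '1!G;h;$p')

# Multi-line editing with branching: ":a" defines a label, N appends the next
# line, "$!ba" branches back to the label until the last line is reached, then
# every embedded newline is replaced with a comma.
joined=$(printf '%s\n' "$text" | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g')

# Transliteration: y/// maps characters one-to-one, with no regex involved.
upper_vowels=$(printf '%s\n' "$text" | sed 'y/aeiou/AEIOU/')

# Address range: from the first line matching /two/ through the last line.
tail_part=$(printf '%s\n' "$text" | sed -n '/two/,$p')

echo "$joined"
```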

cut/paste/join

Column extraction (cut), horizontal concatenation (paste), relational join by key, delimiter handling.

Medium

Data Processing, SRE
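
A small sketch of the three tools on hypothetical CSV files. Note that `join` expects both inputs to be sorted on the join field, which the sample data already is.

```shell
workdir=$(mktemp -d)
# Hypothetical inputs, both sorted on the key in field 1.
printf '1,alice\n2,bob\n3,carol\n' > "$workdir/users.csv"
printf '1,eng\n3,ops\n' > "$workdir/teams.csv"

# cut: select field 2 using a comma as the delimiter.
names=$(cut -d, -f2 "$workdir/users.csv")

# paste: glue corresponding lines of two files side by side.
side_by_side=$(paste -d'|' "$workdir/users.csv" "$workdir/teams.csv")

# paste -s: serialize all lines of one input into a single line.
one_line=$(cut -d, -f2 "$workdir/users.csv" | paste -sd, -)

# join: relational inner join on field 1; the unmatched key (2) is dropped.
joined=$(join -t, "$workdir/users.csv" "$workdir/teams.csv")

echo "$joined"
```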

sort/uniq

Sorting (numeric, key, reverse), deduplication, counting (-c), set operations with sort | uniq.

High

All Engineering
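
The sort | uniq combinations above, including the set operations, can be sketched as follows. The two input files are hypothetical; the set recipes assume each side is deduplicated first with `sort -u`.

```shell
workdir=$(mktemp -d)
printf 'b\na\nc\na\n' > "$workdir/left.txt"
printf 'c\nd\na\n' > "$workdir/right.txt"

# Frequency table: uniq only collapses adjacent duplicates, so sort first,
# then sort the counts numerically (-n) in reverse (-r).
top=$(sort "$workdir/left.txt" | uniq -c | sort -rn | head -n 1 | awk '{ print $2 ":" $1 }')

# Key sort: order by the second comma-separated field, numerically.
by_size=$(printf 'x,10\ny,9\nz,100\n' | sort -t, -k2,2n)

# Set operations on deduplicated inputs:
#   union        = every distinct line
#   intersection = lines present on both sides (uniq -d keeps duplicates)
#   difference   = list the right side twice, keep lines seen once (uniq -u)
union=$(sort -u "$workdir/left.txt" "$workdir/right.txt")
intersection=$( { sort -u "$workdir/left.txt"; sort -u "$workdir/right.txt"; } | sort | uniq -d)
left_only=$( { sort -u "$workdir/left.txt"; sort -u "$workdir/right.txt"; sort -u "$workdir/right.txt"; } | sort | uniq -u)

echo "$top"
```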

tr

Character translation, character class deletion, squeeze repeats, case conversion, newline handling.

Medium

All Engineering
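
Each tr capability above fits in one line. A minimal sketch with invented inputs; remember that tr reads only standard input and works on single characters, not patterns.

```shell
# Case conversion with POSIX character classes.
upper=$(printf 'hello world\n' | tr '[:lower:]' '[:upper:]')

# Delete (-d) every character in a class: strip the digits.
no_digits=$(printf 'a1b22c333\n' | tr -d '[:digit:]')

# Squeeze (-s): collapse runs of a repeated character into one.
squeezed=$(printf 'a   b     c\n' | tr -s ' ')

# Newline handling: turn newlines into spaces, or strip carriage returns
# from Windows-style line endings.
one_line=$(printf 'x\ny\nz\n' | tr '\n' ' ')
unix_text=$(printf 'line1\r\nline2\r\n' | tr -d '\r')

# Complement (-c) plus delete: keep only letters and newlines.
letters=$(printf 'a-b_c!\n' | tr -cd '[:alpha:]\n')

echo "$upper"
```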

diff/patch

File comparison, unified diff format, patch application, three-way merge, directory comparison.

High

Developer, DevOps
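
A round trip through diff and patch, sketched on hypothetical files. The patch step is guarded because `patch` is not installed on every minimal system; three-way merge is left out of this sketch.

```shell
workdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$workdir/old.txt"
printf 'alpha\nBETA\ngamma\ndelta\n' > "$workdir/new.txt"

# Unified diff (-u). diff exits 1 when the files differ, which is not an
# error here, so the status is deliberately ignored.
diff -u "$workdir/old.txt" "$workdir/new.txt" > "$workdir/change.patch" || true

status=skipped
if command -v patch >/dev/null 2>&1; then
  # Apply the patch to a copy of the old file. Naming the target explicitly
  # avoids relying on the paths recorded in the patch header; -s keeps it quiet.
  cp "$workdir/old.txt" "$workdir/target.txt"
  patch -s "$workdir/target.txt" < "$workdir/change.patch"

  # cmp -s prints nothing and exits 0 when the files are byte-identical.
  if cmp -s "$workdir/target.txt" "$workdir/new.txt"; then status=applied; else status=mismatch; fi
fi

# Directory comparison: -r recurses, -q only reports which files differ.
mkdir "$workdir/a" "$workdir/b"
cp "$workdir/old.txt" "$workdir/a/f.txt"
cp "$workdir/new.txt" "$workdir/b/f.txt"
dir_report=$(diff -rq "$workdir/a" "$workdir/b" || true)

echo "$status"
```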

Regular Expressions

Regex fundamentals, character classes, quantifiers, anchors, groups, backreferences, POSIX vs PCRE.

Critical

All Engineering
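
The regex concepts above, and the POSIX versus PCRE split, can be sketched with grep alone. Assumptions: GNU grep for the `-P` line; the helper name `is_date` and the sample strings are made up. The backreference uses basic regex syntax because POSIX defines backreferences for BRE but not for ERE.

```shell
# Anchors + POSIX character class + interval quantifier:
# succeed only when the whole line is a YYYY-MM-DD shaped string.
is_date() {
  printf '%s\n' "$1" | grep -qE '^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}$'
}

# Group + backreference (BRE): \1 must repeat whatever the first group
# matched, which finds a doubled word.
doubled=$(printf 'it was was fine\n' | grep -o '\([[:alpha:]]\{1,\}\) \1')

# POSIX vs PCRE: \d is a PCRE shorthand; POSIX spells it [[:digit:]] or [0-9].
posix_digits=$(printf 'order 66\n' | grep -oE '[[:digit:]]+')
pcre_digits=$(printf 'order 66\n' | grep -oP '\d+')

if is_date '2024-01-31'; then echo "date ok"; fi
echo "$doubled"
```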

Personal Status

Topic | Level | Evidence | Active Projects | Gaps

awk

Advanced

Field extraction, BEGIN/END blocks, associative arrays, pattern ranges (NR>=X,NR<=Y), printf formatting, state machines, FNR/NR multi-file processing; daily driver for data extraction replacing grep+cut pipelines

awk Reference, CLI Mastery Path

No awk for binary data processing, no gawk extensions (networking, XML)
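
Line-window selection and a flag-based state machine, two items from the awk evidence, can be sketched as below. The inputs and the START/STOP markers are hypothetical. One caveat worth checking when writing ranges by line number: a comma range whose start test stays true on later lines (for example `NR >= 3, NR <= 5`) reopens after it closes and runs to the end of the input, so the `&&` form or `NR == X, NR == Y` is the safer way to select a fixed window.

```shell
lines=$(printf 'l1\nl2\nl3\nl4\nl5\nl6\n')

# Lines 3 through 5 as a plain per-line condition.
by_condition=$(printf '%s\n' "$lines" | awk 'NR >= 3 && NR <= 5')

# The same window as a true range pattern: open at line 3, close after line 5.
by_range=$(printf '%s\n' "$lines" | awk 'NR == 3, NR == 5')

# Caveat in action: on line 6 the start test is true again and the end test
# (NR <= 5) can never close the range, so line 6 is printed as well.
runaway=$(printf '%s\n' "$lines" | awk 'NR >= 3, NR <= 5')

# State machine: a flag tracks whether the current line is inside a block.
# Rule order matters: clear the flag before printing, set it after.
block=$(printf 'a\nSTART\nb\nc\nSTOP\nd\n' |
  awk '/^STOP$/ { inside = 0 } inside { print } /^START$/ { inside = 1 }')

echo "$by_condition"
```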

sed

Advanced

In-place editing (-i), line addressing, full-line replacement, append/insert, verify-before/apply/verify-after pattern; used for config file automation across lab infrastructure

sed Reference, CLI Mastery Path

No hold buffer mastery (h/H/g/G/x), no multi-line sed for complex transforms
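
One plausible reading of the verify-before/apply/verify-after pattern is sketched below; the exact steps used in the lab are not described here, so treat this as an assumption. The file name, its contents, and the `result` variable are hypothetical.

```shell
workdir=$(mktemp -d)
conf="$workdir/sshd_config"   # hypothetical config file for the demo
printf 'Port 22\nPermitRootLogin yes\n' > "$conf"

# 1. Verify before: a dry run with -n and the p flag prints only the lines
#    that would change, and leaves the file untouched.
preview=$(sed -n 's/^PermitRootLogin .*/PermitRootLogin no/p' "$conf")

# 2. Apply: full-line replacement in place, keeping a backup copy.
sed -i.bak 's/^PermitRootLogin .*/PermitRootLogin no/' "$conf"

# 3. Verify after: the new line must be present and the old one gone.
result=failed
if grep -q '^PermitRootLogin no$' "$conf" && ! grep -q '^PermitRootLogin yes$' "$conf"; then
  result=ok
fi

echo "$result"
```

If the check in step 3 fails, the `.bak` copy from step 2 is available for a rollback.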

grep / PCRE

Advanced

PCRE (-P) with lookaheads/lookbehinds, context flags (-A/-B/-C), recursive with glob (--include), files-only (-rl), quiet test (-q in conditionals); primary search tool before switching to awk

grep Reference, CLI Mastery Path

Lookahead/lookbehind not yet second nature; regex curriculum (10 modules) in progress
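
The recursive glob search and the quiet test from the grep evidence can be sketched like this, assuming a grep that supports `--include` (GNU and BSD grep both do). The directory layout and the `tls` variable are invented for the demo.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/etc"
printf 'listen 443 ssl;\n' > "$workdir/etc/site.conf"
printf 'listen 443 ssl;\n' > "$workdir/etc/site.conf.bak"

# Recursive, files-only (-rl), restricted by glob: the .bak file also
# contains "ssl" but is skipped because it does not match *.conf.
conf_files=$(grep -rl --include='*.conf' 'ssl' "$workdir")

# Quiet test (-q): no output at all; the exit status drives the conditional.
if grep -q 'ssl' "$workdir/etc/site.conf"; then
  tls=enabled
else
  tls=disabled
fi

echo "$conf_files: $tls"
```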