Text Processing
Body of Knowledge
| Topic | Description | Relevance | Career Tracks |
|---|---|---|---|
| grep Fundamentals | Pattern matching, basic vs extended regex (-E), fixed strings (-F), recursive (-r), context lines (-A/-B/-C). | Critical | All Engineering |
| grep PCRE | Perl-compatible regex (-P), lookaheads/lookbehinds, non-greedy quantifiers, named groups, advanced patterns. | High | SRE, DevOps, Data Engineer |
| ripgrep (rg) | Modern grep replacement, automatic gitignore respect, parallel search, type filtering, PCRE2 support. | High | Developer, DevOps, SRE |
| awk Fundamentals | Field-based processing, $0/$1/$NF, FS/OFS, pattern-action pairs, print vs printf, numeric operations. | High | SRE, DevOps, Data Engineer |
| awk Advanced | BEGIN/END blocks, associative arrays, multi-file processing (FNR/NR), state machines, custom functions. | High | SRE, Data Engineer |
| sed Fundamentals | Stream editing, substitution (s///), addresses, in-place editing (-i), delete/print commands. | High | SRE, DevOps, Automation |
| sed Advanced | Hold space/pattern space, multi-line editing, address ranges, branching, transliteration (y///). | Medium | SRE, Advanced Automation |
| cut/paste/join | Column extraction (cut), horizontal concatenation (paste), relational join by key, delimiter handling. | Medium | Data Processing, SRE |
| sort/uniq | Sorting (numeric, key, reverse), deduplication, counting (-c), set operations with sort \| uniq. | High | All Engineering |
| tr | Character translation, character class deletion, squeeze repeats, case conversion, newline handling. | Medium | All Engineering |
| diff/patch | File comparison, unified diff format, patch application, three-way merge, directory comparison. | High | Developer, DevOps |
| Regular Expressions | Regex fundamentals, character classes, quantifiers, anchors, groups, backreferences, POSIX vs PCRE. | Critical | All Engineering |
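
As a rough illustration of how several of the tools listed above fit together, the sketch below wraps a few one-liners in small shell functions and runs them on a tiny sample file. The log layout, the sample data, and the function names (`count_4xx`, `bytes_ok`, `hits_per_ip`, `top_status`, `mask_ips`) are invented for this example; only POSIX options are used, so GNU-specific features such as `grep -P` are left out.

```shell
#!/bin/sh
# Assumed (hypothetical) log layout: "<ip> <method> <path> <status> <bytes>".

# grep -E (extended regex) with -c (count): lines whose status field is 4xx.
count_4xx() {
    grep -Ec ' 4[0-9]{2} ' "$1"
}

# awk fields: sum the last field ($NF) of lines whose 4th field is 200.
bytes_ok() {
    awk '$4 == 200 { total += $NF } END { print total + 0 }' "$1"
}

# awk associative array: requests per client IP; sort keeps the order stable.
hits_per_ip() {
    awk '{ hits[$1]++ } END { for (ip in hits) print ip, hits[ip] }' "$1" | sort
}

# cut + sort + uniq -c: most frequent status code, as "<status> <count>".
top_status() {
    cut -d ' ' -f 4 "$1" | sort | uniq -c | sort -rn |
        awk 'NR == 1 { print $2, $1 }'
}

# sed substitution on the stream only; the input file is left untouched.
mask_ips() {
    sed 's/^[0-9.]* /x.x.x.x /' "$1"
}

# Demo on invented sample data.
sample=$(mktemp)
cat > "$sample" <<'EOF'
10.0.0.1 GET /index.html 200 512
10.0.0.2 GET /missing 404 0
10.0.0.1 POST /login 200 128
10.0.0.3 GET /index.html 200 512
10.0.0.2 GET /admin 403 0
EOF

count_4xx "$sample"             # prints 2
bytes_ok "$sample"              # prints 1152
hits_per_ip "$sample"           # one "<ip> <count>" line per client
top_status "$sample"            # prints "200 3"
mask_ips "$sample" | head -n 1  # prints "x.x.x.x GET /index.html 200 512"
rm -f "$sample"
```

The `cut | sort | uniq -c` pipeline in `top_status` and the associative array in `hits_per_ip` solve the same kind of counting problem; the awk form does it in a single pass without requiring sorted input.
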
Personal Status
| Topic | Level | Evidence | Active Projects | Gaps |
|---|---|---|---|---|
| awk | Advanced | Field extraction, BEGIN/END blocks, associative arrays, pattern ranges (NR>=X,NR<=Y), printf formatting, state machines, FNR/NR multi-file processing; daily driver for data extraction, replacing grep+cut pipelines | | No awk for binary data processing, no gawk extensions (networking, XML) |
| sed | Advanced | In-place editing (-i), line addressing, full-line replacement, append/insert, verify-before/apply/verify-after pattern; used for config file automation across lab infrastructure | | No hold buffer mastery (h/H/g/G/x), no multi-line sed for complex transforms |
| grep / PCRE | Advanced | PCRE (-P) with lookaheads/lookbehinds, context flags (-A/-B/-C), recursive with glob (--include), files-only (-rl), quiet test (-q in conditionals); primary search tool before switching to awk | | Lookahead/lookbehind not yet second nature; regex curriculum (10 modules) in progress |
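
Three of the techniques named in the Evidence column can be sketched in a few lines: selecting a line range with awk, the `NR == FNR` two-file idiom, and a verify-before/apply/verify-after edit with sed and `grep -q`. The helper names (`lines_between`, `filter_by_keys`, `set_key`) and the `key=value` file format are assumptions made for this example, not taken from the lab setup described above. Because `sed -i` takes its argument differently in GNU and BSD sed, the sketch writes to a temporary file instead. The line range is expressed as a compound condition (`NR >= x && NR <= y`); written as a range pattern with inequality tests on both sides, the selection would start again after line Y, so a range pattern needs equality tests (`NR==X,NR==Y`).

```shell
#!/bin/sh
# Print lines X..Y of a file using a compound condition on NR.
# Usage: lines_between X Y FILE
lines_between() {
    awk -v x="$1" -v y="$2" 'NR >= x && NR <= y' "$3"
}

# Two-file lookup: NR == FNR holds only while awk reads the first file
# (assuming that file is not empty). Prints the lines of FILE2 whose
# first field appears as a first field in FILE1.
# Usage: filter_by_keys FILE1 FILE2
filter_by_keys() {
    awk 'NR == FNR { keep[$1]; next } $1 in keep' "$1" "$2"
}

# Hypothetical helper: verify-before / apply / verify-after for one key
# in a "key=value" file. Assumes KEY has no regex metacharacters and
# VALUE contains neither "|" nor "&".
# Usage: set_key FILE KEY VALUE
set_key() {
    file=$1 key=$2 value=$3
    # verify-before: the key must already be present.
    grep -q "^${key}=" "$file" || { echo "missing key: $key" >&2; return 1; }
    # apply: edit into a temp file, then copy back (keeps file permissions).
    tmp=$(mktemp) || return 1
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rm -f "$tmp"
    # verify-after: the new line must be present exactly as written.
    grep -qxF -- "${key}=${value}" "$file"
}

# Demo on invented data.
conf=$(mktemp)
printf '%s\n' 'host=old.example' 'port=8080' 'mode=test' > "$conf"
lines_between 2 3 "$conf"                            # prints the port and mode lines
set_key "$conf" port 9090 && grep '^port=' "$conf"   # prints port=9090
rm -f "$conf"
```

Using `grep -q` for both checks keeps the function silent on success and lets its exit status drive the caller's `&&` / `||` logic.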