shell-mastery — Data Manipulation

Data Manipulation Patterns (competition scenarios)

PATH Manipulation

View PATH one entry per line
echo "$PATH" | tr : '\n'
Find duplicate PATH entries
echo "$PATH" | tr : '\n' | sort | uniq -d
Count occurrences of each entry (duplicates first)
echo "$PATH" | tr : '\n' | sort | uniq -c | sort -rn
Deduplicate PATH preserving order (awk); printf avoids the trailing newline that echo would attach to the last entry
printf '%s' "$PATH" | awk -v RS=: '!seen[$0]++'
Deduplicate and reassemble
export PATH="$(printf '%s' "$PATH" | awk -v RS=: -v ORS=: '!seen[$0]++' | sed 's/:$//')"
Check if a directory is in PATH (exact entry, not a substring)
echo "$PATH" | tr : '\n' | grep -qxF '/usr/local/bin' && echo "YES" || echo "NO"
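
The PATH one-liners above can be wrapped into reusable functions. The following is a minimal sketch, not a standard tool: the names `path_dedupe` and `path_contains` and the sample list are hypothetical, and both functions take the colon-separated list as an argument, so the real PATH is not touched.

```shell
# Hypothetical helpers; the names are illustrative only.

# Print a colon-separated list with duplicates removed, keeping first occurrences.
path_dedupe() {
    printf '%s' "$1" | awk -v RS=: -v ORS=: '!seen[$0]++' | sed 's/:$//'
}

# Succeed if directory $1 is an exact entry of the colon-separated list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

sample='/usr/bin:/bin:/usr/local/bin:/usr/bin:/bin'
path_dedupe "$sample"; echo                                    # /usr/bin:/bin:/usr/local/bin
path_contains /usr/local/bin "$sample" && echo YES || echo NO  # YES
path_contains /usr/local "$sample" && echo YES || echo NO      # NO
```

Wrapping the list in colons inside `path_contains` forces a whole-entry match without spawning any processes. To apply the result, something like `PATH=$(path_dedupe "$PATH")` should work.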

Text/Data Transformation

Display CSV as aligned columns
column -t -s, data.csv
Sort CSV by 3rd field only (numeric, descending)
sort -t, -k3,3nr data.csv
Extract unique values from a column
awk -F, '{print $2}' data.csv | sort -u
Count occurrences per value (frequency table)
awk -F, '{print $2}' data.csv | sort | uniq -c | sort -rn
Join all lines into a single tab-separated row (column to row)
paste -s -d '\t' file.txt
Join two files on the first field (like SQL JOIN); both inputs must be sorted on that field
join -t, <(sort -t, -k1,1 file1.csv) <(sort -t, -k1,1 file2.csv)
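Convert key=value to flat JSON (assumes no blank lines, comments, quotes, or backslashes; splits on the first = only)
awk -F= 'BEGIN{printf "{"} {printf "%s\"%s\":\"%s\"", (NR>1 ? "," : ""), $1, substr($0, length($1)+2)} END{print "}"}' config.env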
Top 10 largest files
find / -type f -exec du -h {} + 2>/dev/null | sort -rh | head -10
Disk usage by directory (sorted)
du -sh /var/*/ 2>/dev/null | sort -rh
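
To see several of these transformations on concrete input, here is a small sketch run against invented sample data. The file names `scores.csv` and `teams.csv` and their contents are assumptions made for illustration. It uses temporary files rather than process substitution so that it also runs in a plain POSIX shell.

```shell
# Sample data below is invented purely for illustration.
tmp=$(mktemp -d)
printf '%s\n' 'alice,red,30' 'bob,blue,120' 'carol,red,75' > "$tmp/scores.csv"
printf '%s\n' 'red,Rockets' 'blue,Comets' > "$tmp/teams.csv"

# Sort on the 3rd field only (-k3,3), numerically, largest first.
by_score=$(sort -t, -k3,3nr "$tmp/scores.csv")

# Frequency table of the 2nd column; the final awk strips uniq's padding.
freq=$(awk -F, '{print $2}' "$tmp/scores.csv" | sort | uniq -c | sort -rn | awk '{print $1, $2}')

# join needs both inputs sorted on their join field:
# field 2 of scores.csv, field 1 of teams.csv.
sort -t, -k2,2 "$tmp/scores.csv" > "$tmp/scores.sorted"
sort -t, -k1,1 "$tmp/teams.csv"  > "$tmp/teams.sorted"
joined=$(join -t, -1 2 -2 1 "$tmp/scores.sorted" "$tmp/teams.sorted")

printf '%s\n' "$by_score" '---' "$freq" '---' "$joined"
rm -rf "$tmp"
```

In the joined output the key comes first, followed by the remaining fields of the first file and then those of the second, for example `blue,bob,120,Comets`.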

Environment Inspection

All environment variables sorted
env | sort
Find variables matching a pattern
env | grep -i proxy
# bash only: matches variable names, including unexported shell variables
compgen -v | grep -i path
Show all shell options
# bash
shopt

# zsh
setopt
Diff two environments (before/after sourcing)
diff <(env | sort) <(source script.sh; env | sort)
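
The before/after comparison can also be packaged as a function. The sketch below is one possible variant, not the only way to do it: `env_delta` is a hypothetical name, it swaps `diff` for `comm -13` so that only added or changed lines are shown, and it uses temporary files and `.` in place of process substitution and `source` so that it runs in a plain POSIX shell. It assumes no environment value contains a newline.

```shell
# env_delta is a hypothetical helper name.
# Print the environment lines that sourcing file $1 would add or change.
# The function body is a subshell, so nothing leaks into the calling shell.
env_delta() (
    before=$(mktemp)
    after=$(mktemp)
    env | sort > "$before"
    # Source the script, discarding its own output, then snapshot again.
    . "$1" >/dev/null 2>&1
    env | sort > "$after"
    # comm -13 keeps lines found only in the second (after) file.
    comm -13 "$before" "$after"
    rm -f "$before" "$after"
)

# Usage with a throwaway script; DEMO_VAR is an invented variable name.
demo=$(mktemp)
echo 'export DEMO_VAR=hello' > "$demo"
env_delta "$demo"
rm -f "$demo"
```

A changed variable appears with its new value only; use `diff` as in the one-liner when the old value matters as well.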