sort — Ordering & Deduplication

Sorting patterns for pipeline work — numeric/field-based/version sort, deduplication, and the sort | uniq -c | sort -rn frequency count pipeline.

Sorting Patterns

Numeric sort — largest first
find docs/modules/ROOT/examples/codex -mindepth 1 -maxdepth 1 -type d -exec sh -c \
    'echo "$(find "$1" -type f | wc -l) $1"' _ {} \; | sort -rn | head -5
# Output:
# 77 docs/modules/ROOT/examples/codex/powershell
# 29 docs/modules/ROOT/examples/codex/bash
# 27 docs/modules/ROOT/examples/codex/text
# 16 docs/modules/ROOT/examples/codex/vim
# 13 docs/modules/ROOT/examples/codex/linux
Field-based sort — sort by specific column
# Sort /etc/passwd by UID (field 3, numeric)
sort -t: -k3,3n /etc/passwd | head -5
# -t:   = colon delimiter
# -k3,3 = key is field 3 only (plain -k3 runs from field 3 to end of line)
# n     = numeric comparison for that key
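To see the field sort without depending on the local /etc/passwd, here is a minimal sketch on made-up passwd-style lines. The variable names and the three sample accounts are illustrative assumptions, not real system data.

```shell
# Hypothetical passwd-style records (name:password:UID:GID:comment:home:shell)
passwd_sample='app:x:100:100::/srv/app:/bin/false
root:x:0:0:root:/root:/bin/sh
www:x:33:33::/var/www:/bin/false'

# Sort on field 3 only, numerically, then keep just name and UID for display
by_uid=$(printf '%s\n' "$passwd_sample" | sort -t: -k3,3n | cut -d: -f1,3)
printf '%s\n' "$by_uid"
```

A plain string comparison on the same field would place 100 before 33, which is why the numeric modifier matters here.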
Sort by multiple keys
# Sort by field 2 (numeric descending), then field 1 (alphabetic)
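sort -t, -k2,2nr -k1,1 data.csv
# -k2,2nr = field 2 only, numeric, reversed
# -k1,1   = tie-break on field 1 only, ascending
# Standalone -rn would apply to every key that has no modifiers of its own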
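A small worked example may make the key syntax concrete. The sketch below assumes a two-column name,score layout; the data and variable names are invented for illustration.

```shell
# Hypothetical sample data: name,score
data='bob,10
alice,30
carol,10
dave,30'

# Highest score first; equal scores are ordered by name.
# LC_ALL=C keeps the name comparison byte-based and predictable.
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -t, -k2,2nr -k1,1)
printf '%s\n' "$sorted"
```

Attaching the modifiers to each key (nr on the first, none on the second) is what lets the two fields sort in different directions.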
Unique sort — deduplicate while sorting
sort -u file.txt
# Same output as sort | uniq, but in a single process
# With -k, uniqueness is judged on the key only, not the whole line
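When -u is combined with a key, lines count as duplicates if their keys match, even when the rest of the line differs. A minimal sketch on invented name,score data shows the difference in line counts:

```shell
# Hypothetical sample data: two lines share the name "alice"
input='alice,30
bob,10
alice,99'

# Whole-line uniqueness: all three lines differ, so all three survive
whole_line=$(printf '%s\n' "$input" | LC_ALL=C sort -u | wc -l | tr -d ' ')

# Key-based uniqueness on field 1: only one "alice" line survives
by_name=$(printf '%s\n' "$input" | LC_ALL=C sort -t, -k1,1 -u | wc -l | tr -d ' ')

echo "whole-line: $whole_line, by name: $by_name"
```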
Version sort — natural number ordering
printf "v1.10\nv1.2\nv1.1\nv2.0\n" | sort -V
# Output: v1.1, v1.2, v1.10, v2.0
# Without -V: v1.1, v1.10, v1.2, v2.0 (lexicographic — wrong)
Stable sort — preserve original order for equal keys
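sort -s -k2,2 file.txt
# -s    = stable: lines with equal keys keep their original order
# -k2,2 = compare field 2 only (plain -k2 would extend to end of line)
# Without -s, sort breaks ties by comparing the whole line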
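The effect of -s is easiest to see side by side. This sketch uses invented two-column input and compares a stable sort with the default tie-breaking:

```shell
# Hypothetical input: an id followed by a group letter
input='3 b
1 a
2 b
4 a'

# Stable: within each group, lines stay in input order (3 before 2)
stable=$(printf '%s\n' "$input" | LC_ALL=C sort -s -k2,2)

# Default: ties fall back to a whole-line comparison (2 before 3)
unstable=$(printf '%s\n' "$input" | LC_ALL=C sort -k2,2)

printf 'stable:\n%s\n\ndefault:\n%s\n' "$stable" "$unstable"
```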
Sort + uniq -c — frequency count pipeline
find docs/modules/ROOT/pages -name '*.adoc' -type f | awk -F/ '{print $5}' | sort | uniq -c | sort -rn | head -5
# Output:
# 819 education
# 413 projects
# 147 competencies
# 139 codex
# 106 2026
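The same pipeline can be tried on inline data without the repository layout above. The word list below is made up; the trailing awk step is an optional addition that strips the left padding uniq -c puts in front of each count.

```shell
# Hypothetical input: one token per line
words='bash
vim
bash
awk
bash
vim'

# sort groups identical lines, uniq -c counts each group,
# sort -rn ranks the counts highest first
counts=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn | awk '{print $1, $2}')
printf '%s\n' "$counts"
```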

Gotchas

Lexicographic vs numeric — the classic trap
# WRONG: lexicographic sort compares character by character,
# so "100" comes before "5" and "9" lands after "80"
printf '9\n80\n100\n5\n' | sort
# Output: 100, 5, 80, 9 (string comparison!)

# CORRECT: numeric sort
printf '9\n80\n100\n5\n' | sort -n
# Output: 5, 9, 80, 100
Locale affects sort order
# LC_ALL=C gives predictable ASCII sort order
# Locale-aware sort may fold case or ignore punctuation
LC_ALL=C sort file.txt           # byte-value sort (fast, predictable)
sort file.txt                     # locale-aware (may surprise you)
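A quick way to check the byte-value behavior is to sort mixed-case input under LC_ALL=C. The sketch below shows only the C-locale result; what other locales produce depends on which locales are installed, so it is not shown.

```shell
# In the C locale, uppercase letters sort before all lowercase letters
c_order=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort | paste -sd ' ' -)
echo "$c_order"
```

Many UTF-8 locales would interleave the cases instead, which is the kind of surprise the note above refers to.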
uniq requires sorted input
# WRONG: uniq only removes ADJACENT duplicates
printf 'a\nb\na\n' | uniq         # a, b, a (not deduplicated!)

# CORRECT: sort first so duplicates become adjacent
printf 'a\nb\na\n' | sort | uniq  # a, b
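Sorting first changes the line order. If the original order matters, one common alternative (not part of sort or uniq) is an awk one-liner that keeps the first occurrence of each line. A minimal sketch, with seen as an arbitrary array name:

```shell
# seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++
# is true only for first occurrences, and awk prints those lines
deduped=$(printf 'b\na\nb\nc\na\n' | awk '!seen[$0]++' | paste -sd ' ' -)
echo "$deduped"
```

This holds every distinct line in memory, so for very large inputs sort -u may still be the better fit.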

See Also

  • cut — field extraction before sorting

  • awk — when you need logic with your sorting

  • grep — filter before sorting