jq Session 05: Advanced

Data analysis with jq. This session covers sorting, grouping, deduplication, and aggregation patterns.

Pre-Session State

  • Can use if/then/else

  • Can handle nulls with //

  • Understand try/catch

Setup

cat > /tmp/jq-logs.json << 'EOF'
[
  {"ts": "2026-03-18T10:00:00", "host": "kvm-01", "level": "INFO", "msg": "started"},
  {"ts": "2026-03-18T10:01:00", "host": "kvm-01", "level": "WARN", "msg": "high cpu"},
  {"ts": "2026-03-18T10:02:00", "host": "kvm-02", "level": "ERROR", "msg": "disk full"},
  {"ts": "2026-03-18T10:03:00", "host": "kvm-01", "level": "ERROR", "msg": "oom"},
  {"ts": "2026-03-18T10:04:00", "host": "vault-01", "level": "INFO", "msg": "sealed"},
  {"ts": "2026-03-18T10:05:00", "host": "kvm-02", "level": "WARN", "msg": "high cpu"},
  {"ts": "2026-03-18T10:06:00", "host": "vault-01", "level": "ERROR", "msg": "unsealed"}
]
EOF

Lesson 1: Sorting

Concept: sort_by(.field) returns the array sorted in ascending order by the value of that field.

Exercise 1.1: Sort by field

cat /tmp/jq-logs.json | jq 'sort_by(.host)'

Output: Logs sorted alphabetically by hostname.

Exercise 1.2: Reverse sort

cat /tmp/jq-logs.json | jq 'sort_by(.ts) | reverse'

Output: Newest logs first.

Exercise 1.3: Sort by multiple fields

cat /tmp/jq-logs.json | jq 'sort_by(.host, .ts)'

Output: Sorted by host, then by timestamp within each host.
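When sort_by is given several expressions, elements are compared by their key values in order, so ties on the first key are broken by the second. The following is a minimal sketch on a three-entry sample; the inline data and the `sorted` variable are illustrative assumptions, not part of the session's log file.

```shell
# Illustrative sample: three log entries, deliberately out of order.
logs='[
  {"host": "kvm-02", "ts": "2026-03-18T10:02:00"},
  {"host": "kvm-01", "ts": "2026-03-18T10:03:00"},
  {"host": "kvm-01", "ts": "2026-03-18T10:00:00"}
]'

# Sort by host first, then by timestamp within each host.
# ISO 8601 timestamps in one consistent format sort correctly as plain strings.
sorted=$(printf '%s' "$logs" | jq -c 'sort_by(.host, .ts) | map("\(.host) \(.ts)")')
echo "$sorted"
```

The two kvm-01 entries come first, ordered by timestamp, followed by the kvm-02 entry.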

Lesson 2: Grouping

Concept: group_by(.field) splits an array into sub-arrays of items that share the same field value, with the groups ordered by that value.

Exercise 2.1: Group by host

cat /tmp/jq-logs.json | jq 'group_by(.host)'

Output: An array of arrays; each inner array holds the logs from one host.

Exercise 2.2: Count per group

cat /tmp/jq-logs.json | jq 'group_by(.host) | map({host: .[0].host, count: length})'

Output: [{"host": "kvm-01", "count": 3}, …]

Exercise 2.3: Group by level

cat /tmp/jq-logs.json | jq 'group_by(.level) | map({level: .[0].level, count: length})'

Output: Count of INFO, WARN, ERROR logs.
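The per-group counts can also be folded into a single object keyed by level. This is a sketch under the assumption that only the level field matters: the inline sample mirrors the levels in /tmp/jq-logs.json, and the `levels` and `counts` names are illustrative. The --sort-keys flag is used only to make the printed key order explicit.

```shell
# Illustrative sample: the same seven levels as the session's log file.
levels='[{"level":"INFO"},{"level":"WARN"},{"level":"ERROR"},{"level":"ERROR"},
         {"level":"INFO"},{"level":"WARN"},{"level":"ERROR"}]'

# Build one {LEVEL: count} object per group, then merge them all with add.
counts=$(printf '%s' "$levels" |
  jq -c --sort-keys 'group_by(.level) | map({(.[0].level): length}) | add')
echo "$counts"
```

The parentheses around .[0].level tell jq to evaluate the expression and use its result as the object key.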

Lesson 3: Unique

Concept: unique and unique_by(.field) remove duplicates and return the result in sorted order.

Exercise 3.1: Unique values

cat /tmp/jq-logs.json | jq '[.[].host] | unique'

Output: ["kvm-01", "kvm-02", "vault-01"]

Exercise 3.2: Unique by field

cat /tmp/jq-logs.json | jq 'unique_by(.host) | map(.host)'

Output: ["kvm-01", "kvm-02", "vault-01"]. unique_by keeps one log entry per host, and map(.host) extracts the hostnames.
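Neither filter preserves input order, which matters when the original order carries meaning. A small sketch with illustrative, unsorted samples (the `hosts`, `logs`, `deduped`, and `per_host` names are assumptions made for this example):

```shell
# Illustrative sample: host names with duplicates, out of order.
hosts='["vault-01", "kvm-02", "kvm-01", "kvm-02", "vault-01"]'

# unique drops the duplicates and sorts what remains.
deduped=$(printf '%s' "$hosts" | jq -c 'unique')
echo "$deduped"

# Illustrative sample: log entries where kvm-02 appears twice.
logs='[{"host":"kvm-02","msg":"disk full"},
       {"host":"kvm-01","msg":"started"},
       {"host":"kvm-02","msg":"high cpu"}]'

# unique_by keeps one entry per host, ordered by the host value.
per_host=$(printf '%s' "$logs" | jq -c 'unique_by(.host) | map(.host)')
echo "$per_host"
```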

Lesson 4: Reduce and Aggregation

Concept: length, add, and reduce compute a single value from an array.

Exercise 4.1: Count with length

cat /tmp/jq-logs.json | jq '[.[] | select(.level == "ERROR")] | length'

Output: 3 (count of errors)

Exercise 4.2: Sum values

echo '[{"v": 10}, {"v": 20}, {"v": 30}]' | jq '[.[].v] | add'

Output: 60

Exercise 4.3: Reduce pattern

cat /tmp/jq-logs.json | jq 'reduce .[] as $log ({};
  .[$log.level] = (.[$log.level] // 0) + 1)'

Output: {"INFO": 2, "WARN": 2, "ERROR": 3}
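In reduce .[] as $x (init; update), the accumulator starts as init; inside update, `.` refers to the current accumulator and $x to the current element. As a sketch, here is the sum from Exercise 4.2 written both ways, plus the empty-array case where the two differ. The variable names are illustrative.

```shell
values='[{"v": 10}, {"v": 20}, {"v": 30}]'

# add sums the elements of an array of numbers.
sum_add=$(printf '%s' "$values" | jq '[.[].v] | add')

# The same sum as an explicit reduce: start at 0, add each .v in turn.
sum_reduce=$(printf '%s' "$values" | jq 'reduce .[] as $x (0; . + $x.v)')

# On an empty array, add yields null, while reduce yields its initial value.
empty_add=$(jq -n '[] | add')
empty_reduce=$(jq -n '[] | reduce .[] as $x (0; . + $x.v)')

echo "$sum_add $sum_reduce $empty_add $empty_reduce"
```

Choosing reduce with an explicit initial value is one way to avoid a null result when the input may be empty.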

Exercise 4.4: Min/Max

cat /tmp/jq-logs.json | jq 'min_by(.ts), max_by(.ts) | .ts'

Output: Earliest and latest timestamps.
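min_by and max_by return the whole matching element, not just the key. In the filter above, the pipe binds more loosely than the comma, so .ts is applied to both results. A sketch that collects fields from both ends into one object, using an illustrative three-entry sample (the `logs` and `range` names are assumptions):

```shell
# Illustrative sample: three entries, deliberately out of chronological order.
logs='[
  {"ts": "2026-03-18T10:03:00", "msg": "oom"},
  {"ts": "2026-03-18T10:00:00", "msg": "started"},
  {"ts": "2026-03-18T10:06:00", "msg": "unsealed"}
]'

# Take the message of the earliest entry and the timestamp of the latest one.
range=$(printf '%s' "$logs" | jq -c '{first: min_by(.ts).msg, last: max_by(.ts).ts}')
echo "$range"
```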

Summary: What You Learned

Concept   Syntax                  Example
Sort      sort_by(.field)         sort_by(.ts) (chronological)
Reverse   reverse                 sort_by(.ts) | reverse
Group     group_by(.field)        group_by(.host) (cluster by host)
Unique    unique                  [.[].x] | unique
Count     length                  [.[] | select(…)] | length
Sum       add                     [.[].v] | add
Reduce    reduce .[] as $x (…)    Build accumulators
Min/Max   min_by, max_by          min_by(.ts)

Exercises to Complete

  1. [ ] Get hosts with most errors (group, count, sort)

  2. [ ] Find the most recent log per host

  3. [ ] Create summary: {total: N, errors: N, warnings: N}

  4. [ ] List unique (host, level) combinations

Next Session

Session 06: Infrastructure - ISE, k8s, Vault, real-world patterns.

Session Log

Timestamp   Notes
Start       <Record when you started>
End         <Record when you finished>