jq — JSON Processing
Advanced jq patterns for infrastructure work — select/map/test filtering, JSON construction from shell commands, CSV/TSV output, slurp mode for aggregation. For sysadmin-focused patterns, see jq Sysadmin.
Selection and Filtering
jq 'keys' package.json
# Output:
# [
# "author",
# "dependencies",
# "description",
# "license",
# "name",
# "private",
# "scripts",
# "version"
# ]
jq '.scripts' package.json
# Output:
# {
# "build": "antora antora-playbook.yml",
# "serve": "npm run build && cd build/site && python3 -m http.server 8000",
# "clean": "rm -rf build .cache"
# }
git log --oneline -20 --format='{"hash":"%h","msg":"%s","date":"%cs"}' | \
jq -s 'map(select(.msg | test("codex|std|project"))) | length'
# Output: 13 (commits matching codex/std/project in last 20)
git log --oneline -10 --format='{"hash":"%h","msg":"%s","date":"%cs"}' | \
jq -s 'map(select(.date == "2026-04-09")) | map(.msg) | .[:3]'
# Output:
# [
# "updated readme",
# "docs(security): add production QRadar codex...",
# "docs(codex): Add GitHub CLI (gh) patterns"
# ]
Construction and Output Formats
find docs/modules/ROOT/examples/codex -mindepth 1 -maxdepth 1 -type d | while read -r d; do
count=$(find "$d" -type f | wc -l)
echo "{\"category\": \"$(basename "$d")\", \"files\": $count}"
done | jq -s 'sort_by(.files) | reverse | .[:5]'
# Output:
# [
# {"category": "powershell", "files": 77},
# {"category": "bash", "files": 28},
# {"category": "text", "files": 27},
# {"category": "vim", "files": 16},
# {"category": "linux", "files": 13}
# ]
find docs/modules/ROOT/examples/codex -mindepth 1 -maxdepth 1 -type d | while read -r d; do
count=$(find "$d" -type f | wc -l)
echo "{\"category\":\"$(basename "$d")\",\"files\":$count}"
done | jq -sr '.[] | [.category, .files] | @csv' | sort -t, -k2 -rn | head -5
# Output:
# "powershell",77
# "bash",28
# "text",27
# "vim",16
# "linux",13
jq -r '.scripts | to_entries[] | [.key, .value] | @tsv' package.json
# Output:
# build antora antora-playbook.yml
# serve npm run build && cd build/site && python3 -m http.server 8000
# clean rm -rf build .cache
git log --oneline -5 --format='{"hash": "%h", "subject": "%s", "date": "%ci"}' | jq -s '.'
# Output: array of 5 commit objects with hash, subject, date
# Caveat: a subject containing a double quote breaks this hand-built JSON — jq reports a parse error
Slurp and Stream Modes
# Without -s: processes each line independently
echo -e '{"a":1}\n{"a":2}' | jq '.a'
# Output: 1 / 2
# With -s: collects into array first
echo -e '{"a":1}\n{"a":2}' | jq -s 'map(.a) | add'
# Output: 3
git log --oneline -20 --format='{"hash":"%h","msg":"%s","date":"%cs"}' | \
jq -s 'sort_by(.date) | reverse | .[:5] | .[].msg'
# Most recent 5 commit messages, sorted by date
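Slurp mode also enables whole-stream aggregation. A self-contained sketch with synthetic latency records (field name and values invented for illustration):

```shell
# Slurp a metric stream into one array, then reduce it to a summary object
printf '%s\n' '{"ms":120}' '{"ms":80}' '{"ms":100}' |
  jq -sc '{count: length, min: (map(.ms)|min), max: (map(.ms)|max), avg: (map(.ms)|add/length)}'
# Output: {"count":3,"min":80,"max":120,"avg":100}
```

Without -s none of these work: length, min, max, and add all need the full array in one pass.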
jq -r '.name' package.json
# Output: domus-captures (no quotes — ready for shell variables)
NAME=$(jq -r '.name' package.json)
echo "Building: $NAME"
jq -c '.scripts | to_entries[]' package.json
# Output:
# {"key":"build","value":"antora antora-playbook.yml"}
# {"key":"serve","value":"npm run build && cd build/site && python3 -m http.server 8000"}
# {"key":"clean","value":"rm -rf build .cache"}
Gotchas
echo '{"a":1}' | jq '.b'
# Output: null (no error — easy to miss)
# Guard with // (alternative operator)
echo '{"a":1}' | jq '.b // "MISSING"'
# Output: "MISSING"
# FRAGILE — + concatenates strings, but errors as soon as a field is not a string
echo '{"n":"grep","hits":3}' | jq '"tool: " + .n + ", hits: " + .hits'
# Error: string and number cannot be added
# CORRECT — string interpolation \(...) converts any value to a string
echo '{"n":"grep","hits":3}' | jq '"tool: \(.n), hits: \(.hits)"'
# Output: "tool: grep, hits: 3"
# WRONG — port values from antora.yml are strings
echo '{"port":"443"}' | jq '.port + 1'
# Error: string and number cannot be added
# CORRECT — convert with tonumber
echo '{"port":"443"}' | jq '(.port | tonumber) + 1'
# Output: 444
# WRONG — inside double quotes the shell expands $1 before jq ever sees it
jq ".events[] | {id: $1}" file.json
# CORRECT — single quotes protect jq expression
jq '.events[] | {id: .id}' file.json
# CORRECT — mix when you need shell variables
HOST="mail-01"
jq --arg h "$HOST" '.[] | select(.host == $h)' inventory.json
# GOTCHA — empty input is not an error: jq sees zero JSON values and prints nothing
echo "" | jq '.'
# (no output, exit 0 — downstream pipes silently receive nothing)
# CORRECT — use -n for generated output
jq -n '{"empty": true}'
Infrastructure Queries
Level 1: Filter — select objects by condition
ip -j link | jq '[.[] | select(.operstate == "UP") | {name: .ifname, mac: .address}]'
# Output:
# [
# {"name": "wlan0", "mac": "e0:d5:5d:6c:e1:66"},
# {"name": "br-49799088587f", "mac": "96:1f:81:3a:d5:90"},
# ...
# ]
select() is jq’s WHERE clause — it keeps objects matching the condition and drops the rest.
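Conditions combine with and/or, just like a WHERE clause. A self-contained sketch with synthetic host records (names and ports invented for illustration):

```shell
# Keep only records matching BOTH conditions
printf '%s\n' '{"name":"web-01","state":"UP","port":443}' '{"name":"db-01","state":"DOWN","port":5432}' |
  jq -c 'select(.state == "UP" and .port < 1024)'
# Output: {"name":"web-01","state":"UP","port":443}
```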
Level 2: Enrich — extract nested structures
ip -j addr | jq '.[] | select(.addr_info | length > 0) | {
name: .ifname,
mac: .address,
state: .operstate,
ips: [.addr_info[] | select(.family == "inet") | .local]
}'
# Output:
# {"name": "wlan0", "mac": "e0:d5:5d:6c:e1:66", "state": "UP", "ips": ["10.50.10.126"]}
# {"name": "docker0", "mac": "e6:10:99:fb:93:f4", "state": "DOWN", "ips": ["172.17.0.1"]}
.addr_info[] unpacks the nested array. select(.family == "inet") filters to IPv4 only. The […] wrapper collects results back into an array.
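The unpack → filter → collect shape works on any nested array, not just ip output. A self-contained sketch (hostname and addresses invented for illustration):

```shell
# Unpack .nics[], keep IPv4 entries, collect the addresses back into an array
echo '{"host":"fw-01","nics":[{"fam":"inet","ip":"10.0.0.1"},{"fam":"inet6","ip":"fe80::1"}]}' |
  jq -c '{host, v4: [.nics[] | select(.fam == "inet") | .ip]}'
# Output: {"host":"fw-01","v4":["10.0.0.1"]}
```

{host} is shorthand for {host: .host} — handy when most keys pass through unchanged.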
Level 3: Classify — tag data with if/elif
ip -j link | jq '.[] | {
name: .ifname,
mac: .address,
role: (
if .ifname == "lo" then "loopback"
elif (.ifname | startswith("wlan")) then "wifi"
elif (.ifname | startswith("enp")) then "wired"
elif (.ifname | startswith("tailscale")) then "vpn"
elif (.ifname | startswith("docker")) or (.ifname | startswith("br-")) then "container"
elif (.ifname | startswith("veth")) then "container"
else "unknown"
end
)
}'
# Output:
# {"name": "wlan0", "mac": "e0:d5:5d:6c:e1:66", "role": "wifi"}
# {"name": "enp134s0", "mac": "a8:2b:dd:8f:23:e6", "role": "wired"}
# {"name": "tailscale0", "mac": null, "role": "vpn"}
if/elif/end inside jq works like a CASE statement. Parentheses around conditions are required.
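The same CASE-style tagging, runnable without ip — a minimal sketch with synthetic interface names:

```shell
# Tag each record with a role derived from its name
printf '%s\n' '{"n":"lo"}' '{"n":"wlan0"}' '{"n":"veth12ab"}' |
  jq -c '{name: .n, role: (if .n == "lo" then "loopback"
                           elif (.n | startswith("wlan")) then "wifi"
                           elif (.n | startswith("veth")) then "container"
                           else "unknown" end)}'
# Output:
# {"name":"lo","role":"loopback"}
# {"name":"wlan0","role":"wifi"}
# {"name":"veth12ab","role":"container"}
```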
Level 4: Aggregate — group_by and count
ip -j link | jq '[.[] | {
name: .ifname,
role: (
if .ifname == "lo" then "loopback"
elif (.ifname | startswith("wlan")) then "wifi"
elif (.ifname | startswith("enp")) then "wired"
elif (.ifname | startswith("tailscale")) then "vpn"
elif (.ifname | startswith("docker")) or (.ifname | startswith("br-")) then "container"
elif (.ifname | startswith("veth")) then "container"
else "unknown"
end
)
}] | group_by(.role) | map({role: .[0].role, count: length, interfaces: [.[].name]})'
# Output:
# [
# {"role": "container", "count": 4, "interfaces": ["br-49799088587f","docker0","veth8cde84a","veth52f3b68"]},
# {"role": "wifi", "count": 1, "interfaces": ["wlan0"]},
# {"role": "wired", "count": 1, "interfaces": ["enp134s0"]},
# {"role": "vpn", "count": 1, "interfaces": ["tailscale0"]}
# ]
group_by(.role) creates arrays of objects sharing the same role. map() transforms each group into a summary. This is jq’s GROUP BY + COUNT(*).
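A self-contained version of the same aggregation on synthetic records (service names invented for illustration). Note that group_by sorts the groups by the grouping key:

```shell
# Group records by .svc, then summarize each group
echo '[{"svc":"web","host":"a"},{"svc":"web","host":"b"},{"svc":"db","host":"c"}]' |
  jq -c 'group_by(.svc) | map({svc: .[0].svc, count: length, hosts: [.[].host]})'
# Output: [{"svc":"db","count":1,"hosts":["c"]},{"svc":"web","count":2,"hosts":["a","b"]}]
```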
Level 5: Join two data sources — --argjson
ip -j link | jq --argjson addrs "$(ip -j addr)" '
[.[] | . as $link |
(([$addrs[] | select(.ifname == $link.ifname)] | .[0].addr_info) // []) as $addrs_info |
{
name: .ifname,
mac: .address,
state: .operstate,
mtu: .mtu,
ips: [$addrs_info[] | select(.family == "inet") | {addr: .local, prefix: .prefixlen}],
flags: .flags
}
] | sort_by(.state)' > /tmp/network-inventory.json
# Joins link data with address data on ifname — like a SQL JOIN
# --argjson loads the second command's output as a jq variable
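The join shape is easier to see with inline data. A minimal sketch matching records across two arrays on a shared key (interface names and addresses invented for illustration); // null keeps unmatched records instead of dropping them:

```shell
# Left join: every link record survives, ip is null when no address matches
links='[{"ifname":"eth0","mac":"aa:bb"},{"ifname":"eth1","mac":"cc:dd"}]'
addrs='[{"ifname":"eth0","ip":"10.0.0.5"}]'
echo "$links" | jq -c --argjson a "$addrs" \
  '[.[] | . as $l | {ifname, mac, ip: (($a[] | select(.ifname == $l.ifname) | .ip) // null)}]'
# Output: [{"ifname":"eth0","mac":"aa:bb","ip":"10.0.0.5"},{"ifname":"eth1","mac":"cc:dd","ip":null}]
```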
Level 6: Output formats — CSV, TSV, tables
# Interfaces with IPs
jq -r '.[] | select(.ips | length > 0) | "\(.name) → \(.ips[0].addr)/\(.ips[0].prefix)"' /tmp/network-inventory.json
# CSV export (for spreadsheets)
jq -r '.[] | [.name, .mac, .state, (.ips[0].addr // "none")] | @csv' /tmp/network-inventory.json
# TSV for column-formatted terminal display
jq -r '["NAME","MAC","STATE","IP"], (.[] | [.name, .mac, .state, (.ips[0].addr // "--")]) | @tsv' /tmp/network-inventory.json | column -t
# Output:
# NAME MAC STATE IP
# enp134s0 a8:2b:dd:8f:23:e6 DOWN --
# wlan0 e0:d5:5d:6c:e1:66 UP 10.50.10.126
# docker0 e6:10:99:fb:93:f4 DOWN 172.17.0.1
Level 7: Shell + jq — multi-source enrichment
ip -j link | jq -c '.[]' | while read -r line; do
name=$(echo "$line" | jq -r '.ifname')
mac=$(echo "$line" | jq -r '.address')
state=$(echo "$line" | jq -r '.operstate')
ip=$(ip -j addr show "$name" 2>/dev/null | jq -r '.[].addr_info[] | select(.family=="inet") | .local' 2>/dev/null | head -1)
uuid=$(nmcli -t -f DEVICE,UUID connection show --active 2>/dev/null | grep "^$name:" | cut -d: -f2)
printf "%-20s %-18s %-8s %-16s %s\n" "$name" "${mac:-none}" "$state" "${ip:---}" "${uuid:---}"
done
# Combines three data sources: ip link (MAC, state), ip addr (IPs), nmcli (UUIDs)
# UUIDs persist across adapter renames — use in scripts instead of interface names
The Pipeline: Query → Transform → Report
This is the pattern your management wants:
1. QUERY — ip -j, nmcli, curl (API), sql
→ produces raw JSON
2. TRANSFORM — jq select, group_by, map, join
→ normalize, classify, aggregate
3. EXPORT — @csv, @tsv, > file.json
→ feed to Python, spreadsheet, dashboard
4. VISUALIZE — matplotlib, plotly, d2 diagrams
→ charts, graphs, architecture diagrams
5. REPORT — AsciiDoc, PDF, HTML
→ consumable by leadership
Every API you work with (ISE, QRadar, Vault, Wazuh, Sentinel) returns JSON. jq is the universal transformer between the API response and whatever output format the audience needs.
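The query → transform → export steps compress into one pipe. A minimal sketch with a synthetic API-style payload (severity values and sources invented for illustration):

```shell
# JSON payload → classify/aggregate with jq → CSV ready for a spreadsheet
echo '[{"sev":"high","src":"fw"},{"sev":"high","src":"ids"},{"sev":"low","src":"fw"}]' |
  jq -r 'group_by(.sev) | map([.[0].sev, length]) | .[] | @csv'
# Output:
# "high",2
# "low",1
```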
See Also
- jq Sysadmin — operational patterns
- jq Text Processing — fundamentals
- yq — YAML counterpart (jq-compatible syntax)
- jq Favorites — curated one-liners