jq & yq Mastery

Overview

jq is a lightweight command-line JSON processor. Think of it as SQL for JSON - you query, filter, transform, and extract data from JSON streams.

yq extends this paradigm to YAML (and can convert between formats).

Master these tools. Every API returns JSON. Every config file is YAML. These are non-negotiable skills for infrastructure work.
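As a minimal taste of the workflow (sample data inlined for illustration):

```shell
# Query, filter, and extract from a JSON stream in one pipeline
echo '[{"name":"vault-01","status":"active"},{"name":"web-01","status":"down"}]' |
  jq -r '.[] | select(.status == "active") | .name'
# vault-01
```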

Quick Reference

Operation         jq                                                            Notes
Show all keys     jq 'keys'                                                     Top-level keys only
Show nested keys  jq '.data | keys'                                             Keys of .data object
Keys with types   jq -r '.data | to_entries[] | "\(.key): \(.value | type)"'    Exploration pattern
Select all items  jq '.[]'                                                      Like SELECT *
Filter (WHERE)    jq '.[] | select(.name == "value")'                           Exact match
Filter (LIKE)     jq '.[] | select(.name | test("pattern"))'                    Regex match
Project fields    jq '{name, id}'                                               Like SELECT name, id
Raw output        jq -r '.field'                                                No quotes, no colors

jq vs SQL Mental Model

SQL                             jq                                 Example
SELECT * FROM table             .[]                                Iterate all items
WHERE field = 'value'           select(.field == "value")          Exact match
WHERE field LIKE '%pattern%'    select(.field | test("pattern"))   Regex match
SELECT field1, field2           {field1, field2}                   Project specific fields
SELECT field1 || ' ' || field2  "\(.field1) \(.field2)"            String interpolation
ORDER BY field                  sort_by(.field)                    Sorting
LIMIT 10                        .[0:10]                            Array slicing
COUNT(*)                        length                             Count items
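A worked translation on inline sample data (not from any real system): the SQL query SELECT name FROM servers WHERE status = 'active' ORDER BY name LIMIT 2 maps onto the rows above as:

```shell
echo '[{"name":"c","status":"active"},{"name":"a","status":"active"},{"name":"b","status":"down"}]' |
  jq -r '[.[] | select(.status == "active")]   # WHERE
         | sort_by(.name)                      # ORDER BY
         | .[0:2]                              # LIMIT
         | .[].name'                           # SELECT name
# a
# c
```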

Exploration Patterns

When encountering unknown JSON, explore systematically:

Step 1: What Type Is It?

curl -s "$API_URL" | jq 'type'
Expected Output
"object"   # or "array", "string", "number", "boolean", "null"

Step 2: What Keys Exist?

curl -s "$API_URL" | jq 'keys'
Expected Output
["data", "meta", "links"]

Step 3: Keys with Types (Power Pattern)

curl -s "$API_URL" | jq -r 'to_entries[] | "\(.key): \(.value | type)"'
Expected Output
data: object
meta: object
links: array

Step 4: Drill Into Nested Objects

curl -s "$API_URL" | jq -r '.data | to_entries[] | "\(.key): \(.value | type)"'
Expected Output
id: string
name: string
attributes: object
relationships: array

Step 5: Sample Values

curl -s "$API_URL" | jq '.data | to_entries[] | {key, value_sample: .value | tostring[0:50]}'

Essential Patterns

Array Operations

# First item
jq '.[0]' data.json

# Last item
jq '.[-1]' data.json

# First 5 items
jq '.[0:5]' data.json

# Length
jq 'length' data.json

# Map over array (transform each item)
jq '.[] | {name: .name, id: .id}' data.json

Filtering

# Exact match
jq '.[] | select(.status == "active")' data.json

# Not equal
jq '.[] | select(.status != "deleted")' data.json

# Regex match
jq '.[] | select(.name | test("vault"))' data.json

# Case-insensitive regex
jq '.[] | select(.name | test("vault"; "i"))' data.json

# Multiple conditions (AND)
jq '.[] | select(.status == "active" and .type == "server")' data.json

# Multiple conditions (OR)
jq '.[] | select(.status == "active" or .status == "pending")' data.json

# Contains (for strings)
jq '.[] | select(.name | contains("vault"))' data.json

# Has key
jq '.[] | select(has("metadata"))' data.json

# Null check
jq '.[] | select(.error == null)' data.json

Transformation

# Rename keys
jq '.[] | {server_name: .name, server_ip: .ip}' data.json

# Add computed field
jq '.[] | . + {full_name: "\(.first) \(.last)"}' data.json

# Delete key
jq 'del(.password)' data.json

# Update value
jq '.status = "modified"' data.json

# Conditional value
jq '.[] | .status_text = if .active then "ACTIVE" else "INACTIVE" end' data.json

Aggregation

# Count items
jq 'length' data.json

# Unique values
jq '[.[] | .status] | unique' data.json

# Group by field
jq 'group_by(.status)' data.json

# Count by group
jq 'group_by(.status) | map({status: .[0].status, count: length})' data.json

# Sum numeric field
jq '[.[] | .count] | add' data.json

# Min/Max
jq '[.[] | .count] | min' data.json
jq '[.[] | .count] | max' data.json

Real-World Examples

Vault API Responses

# List secret keys
vault kv list -format=json secret/ | jq -r '.[]'

# Extract certificate from Vault response
vault write -format=json pki/issue/role common_name="host.example.com" | jq -r '.data.certificate'

# Show all fields and types from PKI response
vault write -format=json pki/issue/role common_name="host.example.com" | \
  jq -r '.data | to_entries[] | "\(.key): \(.value | type)"'
Expected Output
ca_chain: array
certificate: string
expiration: number
issuing_ca: string
private_key: string
private_key_type: string
serial_number: string

ISE API (via netapi)

# List all policy sets
netapi ise --format json get-policy-sets | jq '.[] | {name, state, rank}'

# Find disabled policy sets
netapi ise --format json get-policy-sets | jq '.[] | select(.state == "disabled") | .name'

# Get authz profile with dACL name
netapi ise --format json get-authz-profiles | \
  jq '.[] | select(.name | test("Admin")) | {name, dacl: .daclName}'

pfSense DNS Overrides

# Find entries matching pattern
netapi pfsense dns list --format json | jq '.[] | select(.host | test("vault"))'

# Extract host and IP only
netapi pfsense dns list --format json | jq -r '.[] | "\(.host): \(.ip)"'

# Count entries per domain
netapi pfsense dns list --format json | jq 'group_by(.domain) | map({domain: .[0].domain, count: length})'

Git Forge CLIs (gh, glab, tea)

# GitHub: List private repos
gh repo list --limit 100 --json name,visibility | jq '.[] | select(.visibility=="PRIVATE")'

# GitHub: Get repo name for variable
REPO_NAME=$(gh repo view --json name -q '.name')

# GitHub: Repos with topics
gh repo list --json name,repositoryTopics | jq '.[] | select(.repositoryTopics | length > 0)'

# GitLab: List private repos
glab repo list --output json | jq '.[] | select(.visibility=="private")'

# GitLab: Repos with recent activity
glab repo list --output json | jq '.[] | select(.last_activity_at > "2026-01-01")'

# GitLab: Extract SSH URLs
glab repo list --output json | jq -r '.[] | .ssh_url_to_repo'

# tea (Gitea): List repos with details
tea repo list --output json | jq '.[] | {name, private, clone_url}'

# All forges: Extract repo names to array
gh repo list --json name -q '.[].name' > github_repos.txt
glab repo list --output json | jq -r '.[].path' > gitlab_repos.txt
tea repo list --output json | jq -r '.[].name' > gitea_repos.txt

Security Scanning (Dependabot / Vulnerability Reports)

# GitHub: List all Dependabot alerts
gh api repos/{owner}/{repo}/dependabot/alerts | jq '.[] | {package: .dependency.package.name, severity: .security_advisory.severity, summary: .security_advisory.summary}'

# GitHub: High severity only
gh api repos/{owner}/{repo}/dependabot/alerts | jq '.[] | select(.security_advisory.severity == "high") | {package: .dependency.package.name, file: .dependency.manifest_path, summary: .security_advisory.summary}'

# GitHub: Count by severity
gh api repos/{owner}/{repo}/dependabot/alerts | jq 'group_by(.security_advisory.severity) | map({severity: .[0].security_advisory.severity, count: length})'

# GitHub: Alerts by manifest file (find which package.json is the problem)
gh api repos/{owner}/{repo}/dependabot/alerts | jq 'group_by(.dependency.manifest_path) | map({file: .[0].dependency.manifest_path, count: length, packages: [.[] | .dependency.package.name] | unique})'

# GitHub: Open alerts with CVE IDs
gh api repos/{owner}/{repo}/dependabot/alerts | jq '.[] | select(.state == "open") | {cve: .security_advisory.cve_id, package: .dependency.package.name, severity: .security_advisory.severity}'

# GitLab: Project vulnerability report
glab api projects/:id/vulnerability_findings | jq '.[] | {severity: .severity, package: .location.dependency.package.name, scanner: .scanner.name}'

# GitLab: Critical/High vulnerabilities only
glab api projects/:id/vulnerability_findings | jq '.[] | select(.severity == "critical" or .severity == "high") | {name: .name, severity: .severity, state: .state}'

# GitLab: Count by scanner type
glab api projects/:id/vulnerability_findings | jq 'group_by(.scanner.name) | map({scanner: .[0].scanner.name, count: length})'
Gitea does not have built-in vulnerability scanning. Use GitHub/GitLab for security audits or integrate external tools like Trivy.

kubectl Output

# List pod names
kubectl get pods -o json | jq -r '.items[].metadata.name'

# Pods not running
kubectl get pods -o json | jq -r '.items[] | select(.status.phase != "Running") | .metadata.name'

# Container images in use
kubectl get pods -o json | jq -r '[.items[].spec.containers[].image] | unique[]'

# Resource requests
kubectl get pods -o json | jq '.items[] | {name: .metadata.name, cpu: .spec.containers[0].resources.requests.cpu}'

yq for YAML

yq uses the same syntax as jq but operates on YAML files.

Installation

# Arch Linux
sudo pacman -S yq

# RHEL/Rocky (via pip; installs the Python wrapper, not the Go binary)
pip install yq

# Homebrew (Go implementation)
brew install yq
There are two implementations of yq: the Go binary (mikefarah/yq) and a Python wrapper (kislyuk/yq) that shells out to jq. Their syntax and flags differ. This guide covers the Go implementation, which is more common.

Basic Operations

# Read value
yq '.metadata.name' deployment.yaml

# Read with output format
yq -o json '.spec' deployment.yaml

# Update value
yq '.spec.replicas = 3' deployment.yaml

# Delete key
yq 'del(.spec.template.spec.volumes)' deployment.yaml

# Add new key
yq '.metadata.labels.environment = "production"' deployment.yaml

Multi-Document YAML

# Select specific document
yq 'select(documentIndex == 0)' multi.yaml

# Process all documents
yq 'select(.kind == "Deployment")' multi.yaml

Convert Between Formats

# YAML to JSON
yq -o json '.' config.yaml > config.json

# JSON to YAML
yq -p json -o yaml '.' config.json > config.yaml

# Pretty print YAML
yq '.' config.yaml

Converting Text Output to JSON

Not everything outputs JSON natively. Use awk to convert structured text to JSON, then pipe to jq for colored output.

Pattern: Text → JSON → jq

command | awk -F'<sep>' '/<pattern>/ {printf "\"%s\": \"%s\",\n", $1, $2}' | sed '1s/^/{/; $s/,$/}/' | jq
         |______________|  |________|  |_________________________________|   |__________________|   |__|
           AWK parsing      Match       Format as JSON key:value pairs         Wrap in braces        Color

Example: hostnamectl to JSON

hostnamectl outputs key-value text. Convert to JSON:

hostnamectl | awk -F': ' '/Static hostname|Operating System|Kernel|Architecture/ {
  gsub(/^[[:space:]]+/, "", $1)
  gsub(/^[[:space:]]+/, "", $2)
  printf "\"%s\": \"%s\",\n", $1, $2
}' | sed '1s/^/{/; $s/,$/}/' | jq
Output (colored)
{
  "Static hostname": "supermicro300-9d1",
  "Operating System": "Arch Linux",
  "Kernel": "Linux 6.17.5-arch1-1",
  "Architecture": "x86-64"
}

Multi-Host Validation via SSH

Loop over hosts for infrastructure verification:

for host in kvm-01 bind-01 ipsk-mgr-01; do
  echo "=== $host ==="
  ssh -o ControlPath=none $host 'hostnamectl' | \
    awk -F': ' '/Static hostname|Operating System|Kernel|Architecture/ {
      gsub(/^[[:space:]]+/, "", $1)
      gsub(/^[[:space:]]+/, "", $2)
      printf "\"%s\": \"%s\",\n", $1, $2
    }' | sed '1s/^/{/; $s/,$/}/' | jq
done

AWK to JSON Pattern Breakdown

Component                      Purpose
-F': '                         Set field separator to colon-space
gsub(/^[[:space:]]+/, "", $1)  Trim leading whitespace from key
gsub(/^[[:space:]]+/, "", $2)  Trim leading whitespace from value
printf "\"%s\": \"%s\",\n"     Format as JSON "key": "value",

sed Expression  Purpose
1s/^/{/         Line 1: prepend {
$s/,$/}/        Last line: replace trailing , with }
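A self-contained run of the same pattern, with printf standing in for a real command so it works anywhere:

```shell
printf 'Host: vault-01\nRole: secrets\n' |
  awk -F': ' '{ printf "\"%s\": \"%s\",\n", $1, $2 }' |
  sed '1s/^/{/; $s/,$/}/' | jq -c '.'
# {"Host":"vault-01","Role":"secrets"}
```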

When Native JSON Exists

Check if the command has JSON output first:

# hostnamectl has --json flag on systemd 250+
hostnamectl --json=short | jq 'keys'
hostnamectl --json lacks .Architecture field. Use awk conversion when you need fields not in native JSON.

Generic Template

# Convert any "Key: Value" output to JSON
<command> | awk -F': ' '/<pattern1>|<pattern2>/ {
  gsub(/^[[:space:]]+/, "", $1)
  gsub(/^[[:space:]]+/, "", $2)
  printf "\"%s\": \"%s\",\n", $1, $2
}' | sed '1s/^/{/; $s/,$/}/' | jq

Advanced Techniques

Recursive Descent

Find all values matching a pattern anywhere in the structure:

# Find all "name" fields at any depth
jq '.. | .name? // empty' data.json

# Find all IP addresses
jq -r '.. | strings | select(test("^[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+$"))' data.json

Building Objects from Arrays

# Create object from key-value pairs
echo '[{"key": "name", "value": "vault-01"}, {"key": "ip", "value": "10.50.1.60"}]' | \
  jq 'from_entries'
Output
{"name": "vault-01", "ip": "10.50.1.60"}

Slurp Multiple Files

# Merge multiple JSON files into array
jq -s '.' file1.json file2.json file3.json

# Combine with processing
jq -s '[.[] | select(.status == "active")]' *.json

Environment Variables

# Use env var in filter
jq --arg name "$HOSTNAME" '.[] | select(.name == $name)' data.json

# Use env var for substitution
jq --arg ip "$NEW_IP" '.ip = $ip' data.json

String Manipulation

# Split string
jq '.fqdn | split(".")[0]' data.json

# Join array to string
jq '.tags | join(", ")' data.json

# Uppercase/lowercase
jq '.name | ascii_upcase' data.json

# Regex replace
jq '.name | gsub("old"; "new")' data.json

Debugging Tips

Check Structure Step by Step

# Don't do this (complex query that fails)
cat response.json | jq '.data.items[].metadata.labels | select(.app == "vault")'

# Do this instead (build incrementally)
cat response.json | jq 'type'                    # object
cat response.json | jq 'keys'                    # ["data", "meta"]
cat response.json | jq '.data | type'            # object
cat response.json | jq '.data | keys'            # ["items"]
cat response.json | jq '.data.items | type'      # array
cat response.json | jq '.data.items[0] | keys'   # ["metadata", "spec"]
# Now you know the path

Handle Null Values

# Indexing a missing key returns null (no error)
jq '.missing.field' data.json  # null

# Indexing a non-object value DOES error
jq '.name.field' data.json  # error: Cannot index string with string

# Optional operator suppresses the error
jq '.name.field?' data.json  # no output, no error

# Default value for null/false results
jq '.missing // "default"' data.json  # "default"

Pretty Print for Inspection

# Colored output (default)
jq '.' data.json

# Compact output
jq -c '.' data.json

# Raw strings (no quotes)
jq -r '.name' data.json

# Sort keys
jq -S '.' data.json

Common Flags

Flag                  Purpose
-r                    Raw output (no JSON formatting, no quotes on strings)
-c                    Compact output (one line)
-s                    Slurp (read all inputs into array)
-S                    Sort object keys
-e                    Exit with error if result is false or null
--arg name value      Pass string variable
--argjson name value  Pass JSON variable
-f file.jq            Read filter from file
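A few of these side by side on an inline document:

```shell
DOC='{"name":"vault-01","tags":["pki","kv"]}'
echo "$DOC" | jq '.name'       # "vault-01"   (JSON-encoded, quoted)
echo "$DOC" | jq -r '.name'    # vault-01     (raw, shell-friendly)
echo "$DOC" | jq -c '.tags'    # ["pki","kv"] (compact, one line)
echo "$DOC" | jq -r --arg n vault-01 'select(.name == $n) | .tags[0]'   # pki
```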

API Response Patterns

ISE Session Table (Formatted Output)

netapi ise --format json mnt sessions | jq -r '
  ["MAC", "USER", "IP", "STATE"],
  (.[] | [.calling_station_id, .user_name, .framed_ip_address, .session_state]) |
  @tsv
' | column -t

Vault PKI Certificate Inventory

vault list -format=json pki_int/certs | jq -r '.[]' | while read serial; do
  vault read -format=json pki_int/cert/$serial | jq -r '
    [.data.serial_number, (.data.expiration | strftime("%Y-%m-%d")), .data.common_name] | @tsv
  '
done | column -t -N "SERIAL,EXPIRES,CN"

pfSense Firewall Rules Summary

netapi pfsense rules list --format json | jq -r '
  .[] | [.interface, .type, .source, .destination, .descr] | @tsv
' | column -t -N "IFACE,TYPE,SRC,DST,DESC"

Error Handling Patterns

Safe Navigation (Optional Operator)

# Indexing a missing key returns null; indexing a non-object value errors
jq '.name.field' data.json  # ERROR if .name is a string

# With ? - suppresses the error
jq '.name.field?' data.json  # no output, no error

# With default
jq '.missing.field // "N/A"' data.json  # "N/A"

Try-Catch Pattern

jq 'try .data.items[] catch "No items found"' data.json

Validate Before Processing

jq 'if type == "array" then .[] else error("Expected array") end' data.json

Output Formatting

Table Output with Headers

jq -r '["NAME", "IP", "STATUS"], (.[] | [.name, .ip, .status]) | @tsv' data.json | column -t

CSV Export

jq -r '.[] | [.name, .ip, .status] | @csv' data.json > export.csv

Shell Variables

# Single value
IP=$(jq -r '.server.ip' config.json)

# Multiple values via eval
eval "$(jq -r '@sh "NAME=\(.name) IP=\(.ip)"' config.json)"
echo "$NAME is at $IP"

Performance Tips

Stream Large Files

# Don't load entire file into memory
jq --stream 'select(.[0][-1] == "name") | .[1]' huge.json

First Match Only (Early Exit)

jq 'first(.[] | select(.status == "active"))' data.json

Limit Output

jq '.[0:10]' huge_array.json  # First 10 only

Resources

The Single Most Important Pattern:

When exploring unknown JSON, always start with:

curl -s "$URL" | jq -r 'to_entries[] | "\(.key): \(.value | type)"'

This shows you the structure before you try to query it.
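For example, against an inline document:

```shell
echo '{"data":{"id":"x1"},"meta":{"count":1},"links":[]}' |
  jq -r 'to_entries[] | "\(.key): \(.value | type)"'
# data: object
# meta: object
# links: array
```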

Infrastructure Automation Deep Dive

netapi ISE Patterns

Session Analysis Pipeline

# Active sessions with timing data
netapi ise --format json mnt sessions | jq -r '
  .[] | select(.session_state == "AUTHENTICATED") |
  {
    mac: .calling_station_id,
    user: .user_name,
    ip: .framed_ip_address,
    started: (.acct_session_time | tonumber | . / 60 | floor | "\(.)m"),
    policy: .access_policy_matched_rule,
    profile: .authorization_profile
  }
' | jq -s 'sort_by(.started | rtrimstr("m") | tonumber) | reverse'

Policy Set Dependency Mapping

# Find which policy sets reference specific authorization profiles
PROFILE="Corp-Access"
netapi ise --format json get-policy-sets | jq --arg p "$PROFILE" '
  .[] | select(.rules[]?.authzProfile == $p) |
  {
    policySet: .name,
    matchingRules: [.rules[] | select(.authzProfile == $p) | .name]
  }
'

dACL Content Extraction and Comparison

# Extract and format dACL rules for diff
netapi ise api-call openapi GET '/api/v1/policy/network-access/downloadable-acl' | \
  jq -r '.response[] | select(.name | test("Linux")) |
    "=== \(.name) ===\n\(.dacl)"' | sed 's/\\n/\n/g'

# Compare two dACLs side by side
DACL1="Linux-Research-Onboard"
DACL2="Linux-Research-EAP-TLS"
diff -u \
  <(netapi ise api-call openapi GET '/api/v1/policy/network-access/downloadable-acl' | \
    jq -r --arg d "$DACL1" '.response[] | select(.name == $d) | .dacl' | tr '\\n' '\n') \
  <(netapi ise api-call openapi GET '/api/v1/policy/network-access/downloadable-acl' | \
    jq -r --arg d "$DACL2" '.response[] | select(.name == $d) | .dacl' | tr '\\n' '\n')

Endpoint Identity Group Inventory

# Endpoint groups with member counts
netapi ise api-call openapi GET '/api/v1/endpoint-identity-group?size=100' | \
  jq -r '["GROUP", "DESCRIPTION", "SYSTEM"],
    (.response[] | [.name, (.description // "N/A"), (if .systemDefined then "yes" else "no" end)]) | @tsv' | \
  column -t -s$'\t'

Certificate Trust Store Audit

# Certificates expiring within 90 days
netapi ise api-call openapi GET '/api/v1/certs/trusted-certificate?size=100' | jq -r '
  .response[] |
  (.expirationDate | split("T")[0] | strptime("%Y-%m-%d") | mktime) as $exp |
  (now + 7776000) as $cutoff |
  select($exp < $cutoff) |
  [.friendlyName, (.expirationDate | split("T")[0]), .issuedBy] | @tsv
' | column -t -N "NAME,EXPIRES,ISSUER"

netapi pfSense Patterns

Firewall Rule Audit

# Rules with tracking disabled (potential logging gap)
netapi pfsense rules list --format json | jq '
  .[] | select(.log == false) | {interface, type, source, destination, descr}
'

# Count rules per interface
netapi pfsense rules list --format json | jq -r '
  group_by(.interface) |
  map({interface: .[0].interface, count: length}) |
  sort_by(.count) | reverse |
  .[] | [.interface, .count] | @tsv
' | column -t -N "INTERFACE,RULES"

DNS Override Management

# Find duplicates (same host, different IPs)
netapi pfsense dns list --format json | jq '
  group_by(.host) |
  map(select(length > 1)) |
  .[] | {host: .[0].host, ips: [.[] | .ip]}
'

# Orphaned entries (hosts not responding)
netapi pfsense dns list --format json | jq -r '.[] | .ip' | \
  xargs -P 10 -I {} sh -c 'ping -c 1 -W 1 {} >/dev/null 2>&1 || echo {}'

netapi WLC Patterns

Client Distribution Analysis

# Clients per WLAN
netapi wlc clients --format json | jq -r '
  group_by(.wlan_profile_name) |
  map({wlan: .[0].wlan_profile_name, clients: length}) |
  sort_by(.clients) | reverse |
  .[] | [.wlan, .clients] | @tsv
' | column -t -N "WLAN,CLIENTS"

# AP load distribution
netapi wlc clients --format json | jq -r '
  group_by(.ap_name) |
  map({ap: .[0].ap_name, clients: length, channels: ([.[] | .channel] | unique)}) |
  sort_by(.clients) | reverse |
  .[] | [.ap, .clients, (.channels | join(","))] | @tsv
' | column -t -N "AP,CLIENTS,CHANNELS"

Rogue Detection

# Rogues with signal strength > -70 dBm (nearby threats)
netapi wlc rogues --format json | jq '
  .[] | select((.rssi | tonumber) > -70) |
  {mac: .mac_address, rssi: .rssi, ssid: .ssid, detecting_ap: .detecting_aps[0]}
'

kubectl Advanced Patterns

Resource Utilization Report

# Pod resource requests vs limits analysis
kubectl get pods -A -o json | jq -r '
  .items[] |
  .spec.containers[] as $c |
  [
    .metadata.namespace,
    .metadata.name,
    $c.name,
    ($c.resources.requests.cpu // "none"),
    ($c.resources.limits.cpu // "none"),
    ($c.resources.requests.memory // "none"),
    ($c.resources.limits.memory // "none")
  ] | @tsv
' | column -t -N "NS,POD,CONTAINER,CPU_REQ,CPU_LIM,MEM_REQ,MEM_LIM"

Event Timeline Reconstruction

# Recent events by component, color-coded by type
kubectl get events -A --sort-by='.lastTimestamp' -o json | jq -r '
  .items[-20:] | reverse | .[] |
  [
    (.lastTimestamp | split("T")[1][:8]),
    .involvedObject.namespace,
    .involvedObject.name,
    .type,
    .reason,
    (.message // "")[:60]
  ] | @tsv
' | awk -F'\t' '{
  color = ($4 == "Warning") ? "\033[33m" : "\033[32m"
  printf "%s\t%s\t%s\t%s%s\033[0m\t%s\t%s\n", $1, $2, $3, color, $4, $5, $6
}' | column -t -s$'\t'

Service Endpoint Mapping

# Services with their backend endpoints and health
kubectl get endpoints -A -o json | jq -r '
  ["NS", "SERVICE", "READY", "NOT_READY", "TOTAL"],
  (.items[] | select(.subsets != null) |
    .metadata.namespace as $ns | .metadata.name as $svc |
    .subsets[] |
    ((.addresses // []) | length) as $ready |
    ((.notReadyAddresses // []) | length) as $notReady |
    [$ns, $svc, $ready, $notReady, ($ready + $notReady)]
  ) | @tsv
' | column -t

jq-fu: Advanced Transformation Patterns

Data Pivoting

Long to Wide Format

# Input: [{host: "a", metric: "cpu", value: 50}, {host: "a", metric: "mem", value: 70}]
# Output: {a: {cpu: 50, mem: 70}}
jq 'group_by(.host) | map({(.[0].host): (map({(.metric): .value}) | add)}) | add'

Wide to Long Format

# Input: {a: {cpu: 50, mem: 70}}
# Output: [{host: "a", metric: "cpu", value: 50}, ...]
jq 'to_entries | map(.key as $h | .value | to_entries | map({host: $h, metric: .key, value: .value})) | flatten'
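Round-tripping the two pivots on an inline sample shows they are inverses:

```shell
WIDE='{"a":{"cpu":50,"mem":70}}'

# Wide -> long
echo "$WIDE" | jq -c 'to_entries
  | map(.key as $h | .value | to_entries | map({host: $h, metric: .key, value: .value}))
  | flatten'
# [{"host":"a","metric":"cpu","value":50},{"host":"a","metric":"mem","value":70}]

# ...and back: long -> wide
echo "$WIDE" | jq -c 'to_entries
  | map(.key as $h | .value | to_entries | map({host: $h, metric: .key, value: .value}))
  | flatten
  | group_by(.host) | map({(.[0].host): (map({(.metric): .value}) | add)}) | add'
# {"a":{"cpu":50,"mem":70}}
```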

Time Series Manipulation

Rolling Window Aggregation

# Calculate 5-minute rolling average (data every minute)
jq '
  . as $data |
  [range(4; length) | $data[. - 4 : . + 1] | {
    timestamp: .[-1].timestamp,
    avg_value: ([.[] | .value] | add / length)
  }]
'

Gap Detection

# Find gaps > 5 minutes in time series
jq '
  [.[:-1], .[1:]] | transpose |
  map(select(
    (.[1].timestamp | fromdateiso8601) - (.[0].timestamp | fromdateiso8601) > 300
  )) |
  map({
    from: .[0].timestamp,
    to: .[1].timestamp,
    gap_seconds: ((.[1].timestamp | fromdateiso8601) - (.[0].timestamp | fromdateiso8601))
  })
'

Hierarchical Data Operations

Flatten Nested Structure with Path

# Flatten any depth, preserving path
jq '
  . as $doc |
  [paths(scalars) | . as $p | {
    path: ($p | map(tostring) | join(".")),
    value: ($doc | getpath($p))
  }]
'

Reconstruct from Flattened

# Rebuild nested structure from flat path.value pairs
# (numeric segments converted back to array indices)
jq '
  reduce .[] as $item ({};
    setpath($item.path | split(".") | map(if test("^[0-9]+$") then tonumber else . end);
            $item.value))
'
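Both halves together, round-tripping an inline document. Note the tonumber step in the rebuild: without it, array indices come back as object keys like "0".

```shell
DOC='{"net":{"dns":{"servers":["10.0.0.1","10.0.0.2"]}}}'

# Flatten scalars to path/value pairs
FLAT=$(echo "$DOC" | jq -c '. as $doc
  | [paths(scalars) | . as $p
     | {path: ($p | map(tostring) | join(".")), value: ($doc | getpath($p))}]')
echo "$FLAT"
# [{"path":"net.dns.servers.0","value":"10.0.0.1"},{"path":"net.dns.servers.1","value":"10.0.0.2"}]

# Rebuild, converting numeric segments back to array indices
echo "$FLAT" | jq -c 'reduce .[] as $i ({};
  setpath($i.path | split(".") | map(if test("^[0-9]+$") then tonumber else . end); $i.value))'
# {"net":{"dns":{"servers":["10.0.0.1","10.0.0.2"]}}}
```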

Set Operations

Find Changes Between Two API Responses

# Compare two snapshots: added, removed, changed
OLD="snapshot_old.json"
NEW="snapshot_new.json"

# Added items
jq -n --slurpfile old "$OLD" --slurpfile new "$NEW" '
  ($new[0] | map(.id) | sort) - ($old[0] | map(.id) | sort) |
  . as $added | $new[0] | map(select(.id | IN($added[])))
'

# Removed items
jq -n --slurpfile old "$OLD" --slurpfile new "$NEW" '
  ($old[0] | map(.id) | sort) - ($new[0] | map(.id) | sort) |
  . as $removed | $old[0] | map(select(.id | IN($removed[])))
'

# Changed items (same ID, different content)
jq -n --slurpfile old "$OLD" --slurpfile new "$NEW" '
  ($old[0] | INDEX(.id)) as $oldIndex |
  $new[0] | map(select(
    .id as $id | $oldIndex[$id] != null and $oldIndex[$id] != .
  ))
'

Conditional Logic Patterns

Multi-Level Case Statement

jq '.[] | .severity = (
  if .priority < 3 then "critical"
  elif .priority < 5 then "high"
  elif .priority < 7 then "medium"
  else "low"
  end
)'

Null-Safe Deep Access

# Chain of optionals with default at each level
jq '.config?.network?.dns?.servers? // .defaults?.dns? // ["8.8.8.8"]'

Batching and Chunking

Split Array into Batches

# Split items into batches of 100
jq '. as $a | [range(0; length; 100) | $a[. : . + 100]]' data.json
# Alternative: _nwise (undocumented jq internal, present since 1.5)
jq '[_nwise(100)]' data.json

Process in Chunks with Progress

# Process large dataset showing progress
TOTAL=$(jq 'length' huge.json)
jq -c '.[]' huge.json | nl | while read num item; do
  echo "$item" | jq -c '{processed: .id}'
  echo -ne "\r$num/$TOTAL" >&2
done

The Composition Philosophy

jq as a Stage in Pipelines

Never try to do everything in jq. Use it for what it’s good at (JSON transformation) and let other tools handle the rest:

# BAD: Trying to do filesystem operations in jq
jq -r '.files[] | "mkdir -p \(.dir) && touch \(.name)"' | sh

# GOOD: jq extracts, xargs acts
jq -r '.files[] | "\(.dir)/\(.name)"' config.json | xargs -I {} dirname {} | sort -u | xargs mkdir -p
jq -r '.files[] | "\(.dir)/\(.name)"' config.json | xargs touch

The Three-Stage Pattern

Most infrastructure queries follow this pattern:

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Source    │ --> │  Transform  │ --> │   Present   │
│  (API/file) │     │    (jq)     │     │ (column/awk)│
└─────────────┘     └─────────────┘     └─────────────┘
# Source: Get raw data
netapi ise --format json get-policy-sets |

# Transform: Extract and shape with jq
jq -r '.[] | [.name, .state, .rank, (.rules | length)] | @tsv' |

# Present: Human-readable format
column -t -N "POLICY,STATE,RANK,RULES" |

# Optionally filter at the end
grep -v disabled

When to Use jq vs Other Tools

Task                     jq                                      Alternative
Extract single value     jq -r '.field'                          Overkill for simple access
Complex transformations  Yes                                     awk struggles with nested JSON
Tabular output           jq -r '... | @tsv' | column -t          Let column handle alignment
Text search in JSON      jq 'select(.name | test("pattern"))'    grep is faster for raw search
Numeric aggregation      jq '[.[] | .count] | add'               awk better for streaming
Line-by-line processing  jq -c '.[]' then pipe                   xargs/while for actions

Debugging Complex Queries

The Progressive Refinement Method

# Start with the raw structure
cat response.json | jq '.'

# Understand top level
cat response.json | jq 'keys'

# Narrow to relevant section
cat response.json | jq '.data'

# Check type at each level
cat response.json | jq '.data | type'

# If array, sample one item
cat response.json | jq '.data[0]'

# Build your select
cat response.json | jq '.data[] | select(.active == true)'

# Add projection
cat response.json | jq '.data[] | select(.active == true) | {name, ip}'

# Format for output
cat response.json | jq -r '.data[] | select(.active == true) | [.name, .ip] | @tsv' | column -t

Debug Output

# Print intermediate values during processing
jq '.[] | . as $item | debug | select(.active) | $item.name'

# Conditional debug
jq '.[] | if .status == "error" then (. | debug) else . end | .name'

Common Errors and Solutions

Error                            Cause                               Fix
null (null) has no keys          keys/to_entries called on null      Use optional ? or check has("key")
Cannot iterate over null         Array is null                       (.array // [])[]
Cannot index string with string  Treating string as object           Check type at that path
Output is escaped JSON           Missing -r flag                     Add -r for raw strings
Unicode in output                Terminal encoding                   | @json then decode
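Reproducing the most common of these on an inline document:

```shell
DOC='{"name":"vault-01"}'
echo "$DOC" | jq '.tags[]'              # jq: error ... Cannot iterate over null (null)
echo "$DOC" | jq -c '(.tags // [])[]'   # no output, exit 0
echo "$DOC" | jq -r '(.tags // ["untagged"])[]'
# untagged
```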
