jq & yq Mastery
Overview
jq is a lightweight command-line JSON processor. Think of it as SQL for JSON - you query, filter, transform, and extract data from JSON streams.
yq extends this paradigm to YAML (and can convert between formats).
> Master these tools. Every API returns JSON. Every config file is YAML. These are non-negotiable skills for infrastructure work.

Deep Dives Available:
Quick Reference
| Operation | jq | Notes |
|---|---|---|
| Show all keys | `jq 'keys'` | Top-level keys only |
| Show nested keys | `jq '.data \| keys'` | Keys of `.data` |
| Keys with types | `jq -r 'to_entries[] \| "\(.key): \(.value \| type)"'` | Exploration pattern |
| Select all items | `jq '.[]'` | Like `SELECT *` |
| Filter (WHERE) | `jq '.[] \| select(.status == "active")'` | Exact match |
| Filter (LIKE) | `jq '.[] \| select(.name \| test("pattern"))'` | Regex match |
| Project fields | `jq '.[] \| {name, id}'` | Like `SELECT name, id` |
| Raw output | `jq -r` | No quotes, no colors |
jq vs SQL Mental Model
| SQL | jq | Purpose |
|---|---|---|
| `SELECT * FROM items` | `.[]` | Iterate all items |
| `WHERE status = 'active'` | `select(.status == "active")` | Exact match |
| `WHERE name LIKE '%vault%'` | `select(.name \| test("vault"))` | Regex match |
| `SELECT name, ip` | `{name, ip}` | Project specific fields |
| `CONCAT(first, ' ', last)` | `"\(.first) \(.last)"` | String interpolation |
| `ORDER BY name` | `sort_by(.name)` | Sorting |
| `LIMIT 5` | `.[0:5]` | Array slicing |
| `COUNT(*)` | `length` | Count items |
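The mapping in action — a self-contained demo of a SQL-style query expressed in jq (sample data invented for illustration):

```shell
# SELECT name, ip FROM hosts WHERE status = 'active' ORDER BY name LIMIT 2
echo '[
  {"name":"c","ip":"10.0.0.3","status":"active"},
  {"name":"a","ip":"10.0.0.1","status":"active"},
  {"name":"b","ip":"10.0.0.2","status":"down"}
]' | jq '[.[] | select(.status == "active") | {name, ip}] | sort_by(.name) | .[0:2]'
```

Note the `[...]` around the filter: `select` emits a stream, and `sort_by`/slicing need an array, so collect first.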
Exploration Patterns
When encountering unknown JSON, explore systematically:
Step 1: What Type Is It?
curl -s "$API_URL" | jq 'type'
"object" # or "array", "string", "number", "boolean", "null"
Step 2: Keys with Types (Power Pattern)
curl -s "$API_URL" | jq -r 'to_entries[] | "\(.key): \(.value | type)"'
data: object
meta: object
links: array
Essential Patterns
Array Operations
# First item
jq '.[0]' data.json
# Last item
jq '.[-1]' data.json
# First 5 items
jq '.[0:5]' data.json
# Length
jq 'length' data.json
# Map over array (transform each item)
jq '.[] | {name: .name, id: .id}' data.json
Filtering
# Exact match
jq '.[] | select(.status == "active")' data.json
# Not equal
jq '.[] | select(.status != "deleted")' data.json
# Regex match
jq '.[] | select(.name | test("vault"))' data.json
# Case-insensitive regex
jq '.[] | select(.name | test("vault"; "i"))' data.json
# Multiple conditions (AND)
jq '.[] | select(.status == "active" and .type == "server")' data.json
# Multiple conditions (OR)
jq '.[] | select(.status == "active" or .status == "pending")' data.json
# Contains (for strings)
jq '.[] | select(.name | contains("vault"))' data.json
# Has key
jq '.[] | select(has("metadata"))' data.json
# Null check
jq '.[] | select(.error == null)' data.json
Transformation
# Rename keys
jq '.[] | {server_name: .name, server_ip: .ip}' data.json
# Add computed field
jq '.[] | . + {full_name: "\(.first) \(.last)"}' data.json
# Delete key
jq 'del(.password)' data.json
# Update value
jq '.status = "modified"' data.json
# Conditional value
jq '.[] | .status_text = if .active then "ACTIVE" else "INACTIVE" end' data.json
Aggregation
# Count items
jq 'length' data.json
# Unique values
jq '[.[] | .status] | unique' data.json
# Group by field
jq 'group_by(.status)' data.json
# Count by group
jq 'group_by(.status) | map({status: .[0].status, count: length})' data.json
# Sum numeric field
jq '[.[] | .count] | add' data.json
# Min/Max
jq '[.[] | .count] | min' data.json
jq '[.[] | .count] | max' data.json
Real-World Examples
Vault API Responses
# List secret keys
vault kv list -format=json secret/ | jq -r '.[]'
# Extract certificate from Vault response
vault write -format=json pki/issue/role common_name="host.example.com" | jq -r '.data.certificate'
# Show all fields and types from PKI response
vault write -format=json pki/issue/role common_name="host.example.com" | \
jq -r '.data | to_entries[] | "\(.key): \(.value | type)"'
ca_chain: array
certificate: string
expiration: number
issuing_ca: string
private_key: string
private_key_type: string
serial_number: string
ISE API (via netapi)
# List all policy sets
netapi ise --format json get-policy-sets | jq '.[] | {name, state, rank}'
# Find disabled policy sets
netapi ise --format json get-policy-sets | jq '.[] | select(.state == "disabled") | .name'
# Get authz profile with dACL name
netapi ise --format json get-authz-profiles | \
jq '.[] | select(.name | test("Admin")) | {name, dacl: .daclName}'
pfSense DNS Overrides
# Find entries matching pattern
netapi pfsense dns list --format json | jq '.[] | select(.host | test("vault"))'
# Extract host and IP only (use -r for unquoted output)
netapi pfsense dns list --format json | jq -r '.[] | "\(.host): \(.ip)"'
# Count entries per domain
netapi pfsense dns list --format json | jq 'group_by(.domain) | map({domain: .[0].domain, count: length})'
Git Forge CLIs (gh, glab, tea)
# GitHub: List private repos
gh repo list --limit 100 --json name,visibility | jq '.[] | select(.visibility=="PRIVATE")'
# GitHub: Get repo name for variable
REPO_NAME=$(gh repo view --json name -q '.name')
# GitHub: Repos with topics
gh repo list --json name,repositoryTopics | jq '.[] | select(.repositoryTopics | length > 0)'
# GitLab: List private repos
glab repo list --output json | jq '.[] | select(.visibility=="private")'
# GitLab: Repos with recent activity
glab repo list --output json | jq '.[] | select(.last_activity_at > "2026-01-01")'
# GitLab: Extract SSH URLs
glab repo list --output json | jq -r '.[] | .ssh_url_to_repo'
# tea (Gitea): List repos with details
tea repo list --output json | jq '.[] | {name, private, clone_url}'
# All forges: Extract repo names to array
gh repo list --json name -q '.[].name' > github_repos.txt
glab repo list --output json | jq -r '.[].path' > gitlab_repos.txt
tea repo list --output json | jq -r '.[].name' > gitea_repos.txt
Security Scanning (Dependabot / Vulnerability Reports)
# GitHub: List all Dependabot alerts
gh api repos/{owner}/{repo}/dependabot/alerts | jq '.[] | {package: .dependency.package.name, severity: .security_advisory.severity, summary: .security_advisory.summary}'
# GitHub: High severity only
gh api repos/{owner}/{repo}/dependabot/alerts | jq '.[] | select(.security_advisory.severity == "high") | {package: .dependency.package.name, file: .dependency.manifest_path, summary: .security_advisory.summary}'
# GitHub: Count by severity
gh api repos/{owner}/{repo}/dependabot/alerts | jq 'group_by(.security_advisory.severity) | map({severity: .[0].security_advisory.severity, count: length})'
# GitHub: Alerts by manifest file (find which package.json is the problem)
gh api repos/{owner}/{repo}/dependabot/alerts | jq 'group_by(.dependency.manifest_path) | map({file: .[0].dependency.manifest_path, count: length, packages: [.[] | .dependency.package.name] | unique})'
# GitHub: Open alerts with CVE IDs
gh api repos/{owner}/{repo}/dependabot/alerts | jq '.[] | select(.state == "open") | {cve: .security_advisory.cve_id, package: .dependency.package.name, severity: .security_advisory.severity}'
# GitLab: Project vulnerability report
glab api projects/:id/vulnerability_findings | jq '.[] | {severity: .severity, package: .location.dependency.package.name, scanner: .scanner.name}'
# GitLab: Critical/High vulnerabilities only
glab api projects/:id/vulnerability_findings | jq '.[] | select(.severity == "critical" or .severity == "high") | {name: .name, severity: .severity, state: .state}'
# GitLab: Count by scanner type
glab api projects/:id/vulnerability_findings | jq 'group_by(.scanner.name) | map({scanner: .[0].scanner.name, count: length})'
> Gitea does not have built-in vulnerability scanning. Use GitHub/GitLab for security audits or integrate external tools like Trivy.
kubectl Output
# List pod names
kubectl get pods -o json | jq -r '.items[].metadata.name'
# Pods not running
kubectl get pods -o json | jq -r '.items[] | select(.status.phase != "Running") | .metadata.name'
# Container images in use
kubectl get pods -o json | jq -r '[.items[].spec.containers[].image] | unique[]'
# Resource requests
kubectl get pods -o json | jq '.items[] | {name: .metadata.name, cpu: .spec.containers[0].resources.requests.cpu}'
yq for YAML
yq uses the same syntax as jq but operates on YAML files.
Installation
# Arch Linux (the Go implementation is packaged as go-yq; the yq package is the Python wrapper)
sudo pacman -S go-yq
# RHEL/Rocky: grab the Go binary from the mikefarah/yq GitHub releases page
# (note: pip install yq gets the Python wrapper, which has different syntax)
sudo curl -L https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -o /usr/local/bin/yq
sudo chmod +x /usr/local/bin/yq
# Homebrew
brew install yq
There are two implementations of yq: the Go version (mikefarah/yq) and a Python wrapper around jq (kislyuk/yq). This guide covers the Go implementation, which is more common.
Basic Operations
# Read value
yq '.metadata.name' deployment.yaml
# Read with output format
yq -o json '.spec' deployment.yaml
# Update value
yq '.spec.replicas = 3' deployment.yaml
# Delete key
yq 'del(.spec.template.spec.volumes)' deployment.yaml
# Add new key
yq '.metadata.labels.environment = "production"' deployment.yaml
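Because the Go yq reads and writes multiple formats (`-p` for input, `-o` for output), it also works as a converter — a sketch assuming yq v4 flags (file names are illustrative):

```shell
# YAML -> JSON
yq -o=json '.' deployment.yaml

# JSON -> YAML (e.g. turn an API response into a config fragment)
yq -p=json -o=yaml '.' response.json

# Inspect one subtree as JSON, then hand off to jq
yq -o=json '.spec.template' deployment.yaml | jq '.spec.containers[].image'
```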
Converting Text Output to JSON
Not everything outputs JSON natively. Use awk to convert structured text to JSON, then pipe to jq for colored output.
Pattern: Text → JSON → jq
command | awk -F'<sep>' '/<pattern>/ {printf "\"%s\": \"%s\",\n", $1, $2}' | sed '1s/^/{/; $s/,$/}/' | jq
|______________| |________| |_________________________________| |__________________| |__|
AWK parsing Match Format as JSON key:value pairs Wrap in braces Color
Example: hostnamectl to JSON
hostnamectl outputs key-value text. Convert to JSON:
hostnamectl | awk -F': ' '/Static hostname|Operating System|Kernel|Architecture/ {
gsub(/^[[:space:]]+/, "", $1)
gsub(/^[[:space:]]+/, "", $2)
printf "\"%s\": \"%s\",\n", $1, $2
}' | sed '1s/^/{/; $s/,$/}/' | jq
{
"Static hostname": "supermicro300-9d1",
"Operating System": "Arch Linux",
"Kernel": "Linux 6.17.5-arch1-1",
"Architecture": "x86-64"
}
Multi-Host Validation via SSH
Loop over hosts for infrastructure verification:
for host in kvm-01 bind-01 ipsk-mgr-01; do
echo "=== $host ==="
ssh -o ControlPath=none $host 'hostnamectl' | \
awk -F': ' '/Static hostname|Operating System|Kernel|Architecture/ {
gsub(/^[[:space:]]+/, "", $1)
gsub(/^[[:space:]]+/, "", $2)
printf "\"%s\": \"%s\",\n", $1, $2
}' | sed '1s/^/{/; $s/,$/}/' | jq
done
AWK to JSON Pattern Breakdown
| Component | Purpose |
|---|---|
| `-F': '` | Set field separator to colon-space |
| `gsub(/^[[:space:]]+/, "", $1)` | Trim leading whitespace from key |
| `gsub(/^[[:space:]]+/, "", $2)` | Trim leading whitespace from value |
| `printf "\"%s\": \"%s\",\n", $1, $2` | Format as JSON |

| sed Expression | Purpose |
|---|---|
| `1s/^/{/` | Line 1: prepend `{` |
| `$s/,$/}/` | Last line: replace trailing `,` with `}` |
Advanced Techniques
Recursive Descent
Find all values matching a pattern anywhere in the structure:
# Find all "name" fields at any depth
jq '.. | .name? // empty' data.json
# Find all IP addresses
jq -r '.. | strings | select(test("^[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+$"))' data.json
Building Objects from Arrays
# Create object from key-value pairs
echo '[{"key": "name", "value": "vault-01"}, {"key": "ip", "value": "10.50.1.60"}]' | \
jq 'from_entries'
{"name": "vault-01", "ip": "10.50.1.60"}
Slurp Multiple Files
# Merge multiple JSON files into array
jq -s '.' file1.json file2.json file3.json
# Combine with processing
jq -s '[.[] | select(.status == "active")]' *.json
Debugging Tips
Check Structure Step by Step
# Don't do this (complex query that fails)
cat response.json | jq '.data.items[].metadata.labels | select(.app == "vault")'
# Do this instead (build incrementally)
cat response.json | jq 'type' # object
cat response.json | jq 'keys' # ["data", "meta"]
cat response.json | jq '.data | type' # object
cat response.json | jq '.data | keys' # ["items"]
cat response.json | jq '.data.items | type' # array
cat response.json | jq '.data.items[0] | keys' # ["metadata", "spec"]
# Now you know the path
Common Flags
| Flag | Purpose |
|---|---|
| `-r` | Raw output (no JSON formatting, no quotes on strings) |
| `-c` | Compact output (one line) |
| `-s` | Slurp (read all inputs into array) |
| `-S` | Sort object keys |
| `-e` | Exit with error if result is false or null |
| `--arg name value` | Pass string variable |
| `--argjson name value` | Pass JSON variable |
| `-f file` | Read filter from file |
API Response Patterns
ISE Session Table (Formatted Output)
netapi ise --format json mnt sessions | jq -r '
["MAC", "USER", "IP", "STATE"],
(.[] | [.calling_station_id, .user_name, .framed_ip_address, .session_state]) |
@tsv
' | column -t
Error Handling Patterns
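A few defensive patterns keep a pipeline alive when structure varies between items: `//` for defaults, `?` for optional access, `try/catch` for recovery, and `-e` for scripting. A minimal sketch:

```shell
# Default when a field is missing or null
echo '{"name":"vault-01"}' | jq '.ip // "unknown"'

# Optional access: skip items that error instead of aborting
echo '[{"a":1},"not-an-object"]' | jq '.[] | .a?'

# try/catch: substitute a value when an expression errors
echo '"oops"' | jq 'try (.a) catch "bad input"'

# -e: exit non-zero when the result is false/null, for use in scripts
echo '{}' | jq -e '.present' >/dev/null || echo "field missing"
```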
Output Formatting
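The `@` format strings turn arrays into ready-to-consume text; pair them with `-r` so the result is not JSON-quoted:

```shell
# TSV for column(1)
echo '[{"name":"vault-01","ip":"10.50.1.60"}]' | \
  jq -r '.[] | [.name, .ip] | @tsv' | column -t

# CSV for spreadsheets
echo '[{"name":"vault-01","ip":"10.50.1.60"}]' | jq -r '.[] | [.name, .ip] | @csv'

# Shell-safe quoting for xargs/eval
echo '{"path":"/tmp/my file"}' | jq -r '.path | @sh'

# Re-encode a value as a JSON string (e.g. to embed in another document)
echo '{"a":1}' | jq -r '@json'
```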
Resources
- jq Playground - Test queries interactively

> The Single Most Important Pattern: When exploring unknown JSON, always start with `jq -r 'to_entries[] | "\(.key): \(.value | type)"'`. This shows you the structure before you try to query it.
Infrastructure Automation Deep Dive
netapi ISE Patterns
Session Analysis Pipeline
# Active sessions with timing data
netapi ise --format json mnt sessions | jq -r '
.[] | select(.session_state == "AUTHENTICATED") |
{
mac: .calling_station_id,
user: .user_name,
ip: .framed_ip_address,
started: (.acct_session_time | tonumber | . / 60 | floor | "\(.)m"),
policy: .access_policy_matched_rule,
profile: .authorization_profile
}
' | jq -s 'sort_by(.started | rtrimstr("m") | tonumber) | reverse'
Policy Set Dependency Mapping
# Find which policy sets reference specific authorization profiles
PROFILE="Corp-Access"
netapi ise --format json get-policy-sets | jq --arg p "$PROFILE" '
.[] | select(.rules[]?.authzProfile == $p) |
{
policySet: .name,
matchingRules: [.rules[] | select(.authzProfile == $p) | .name]
}
'
dACL Content Extraction and Comparison
# Extract and format dACL rules for diff
netapi ise api-call openapi GET '/api/v1/policy/network-access/downloadable-acl' | \
jq -r '.response[] | select(.name | test("Linux")) |
"=== \(.name) ===\n\(.dacl)"' | sed 's/\\n/\n/g'
# Compare two dACLs side by side
DACL1="Linux-Research-Onboard"
DACL2="Linux-Research-EAP-TLS"
diff -u \
<(netapi ise api-call openapi GET '/api/v1/policy/network-access/downloadable-acl' | \
jq -r --arg d "$DACL1" '.response[] | select(.name == $d) | .dacl' | sed 's/\\n/\n/g') \
<(netapi ise api-call openapi GET '/api/v1/policy/network-access/downloadable-acl' | \
jq -r --arg d "$DACL2" '.response[] | select(.name == $d) | .dacl' | sed 's/\\n/\n/g')
Endpoint Identity Group Inventory
# Endpoint groups with member counts
netapi ise api-call openapi GET '/api/v1/endpoint-identity-group?size=100' | \
jq -r '["GROUP", "DESCRIPTION", "SYSTEM"],
(.response[] | [.name, (.description // "N/A"), (if .systemDefined then "yes" else "no" end)]) | @tsv' | \
column -t -s$'\t'
Certificate Trust Store Audit
# Certificates expiring within 90 days
netapi ise api-call openapi GET '/api/v1/certs/trusted-certificate?size=100' | jq -r '
.response[] |
(.expirationDate | split("T")[0] | strptime("%Y-%m-%d") | mktime) as $exp |
(now + 7776000) as $cutoff |
select($exp < $cutoff) |
[.friendlyName, (.expirationDate | split("T")[0]), .issuedBy] | @tsv
' | column -t -N "NAME,EXPIRES,ISSUER"
netapi pfSense Patterns
Firewall Rule Audit
# Rules with logging disabled (potential visibility gap)
netapi pfsense rules list --format json | jq '
.[] | select(.log == false) | {interface, type, source, destination, descr}
'
# Count rules per interface
netapi pfsense rules list --format json | jq -r '
group_by(.interface) |
map({interface: .[0].interface, count: length}) |
sort_by(.count) | reverse |
.[] | [.interface, .count] | @tsv
' | column -t -N "INTERFACE,RULES"
DNS Override Management
# Find duplicates (same host, different IPs)
netapi pfsense dns list --format json | jq '
group_by(.host) |
map(select(length > 1)) |
.[] | {host: .[0].host, ips: [.[] | .ip]}
'
# Orphaned entries (hosts not responding)
netapi pfsense dns list --format json | jq -r '.[] | .ip' | \
xargs -P 10 -I {} sh -c 'ping -c 1 -W 1 {} >/dev/null 2>&1 || echo {}'
netapi WLC Patterns
Client Distribution Analysis
# Clients per WLAN
netapi wlc clients --format json | jq -r '
group_by(.wlan_profile_name) |
map({wlan: .[0].wlan_profile_name, clients: length}) |
sort_by(.clients) | reverse |
.[] | [.wlan, .clients] | @tsv
' | column -t -N "WLAN,CLIENTS"
# AP load distribution
netapi wlc clients --format json | jq -r '
group_by(.ap_name) |
map({ap: .[0].ap_name, clients: length, channels: ([.[] | .channel] | unique)}) |
sort_by(.clients) | reverse |
.[] | [.ap, .clients, (.channels | join(","))] | @tsv
' | column -t -N "AP,CLIENTS,CHANNELS"
kubectl Advanced Patterns
Resource Utilization Report
# Pod resource requests vs limits analysis
kubectl get pods -A -o json | jq -r '
.items[] |
.spec.containers[] as $c |
[
.metadata.namespace,
.metadata.name,
$c.name,
($c.resources.requests.cpu // "none"),
($c.resources.limits.cpu // "none"),
($c.resources.requests.memory // "none"),
($c.resources.limits.memory // "none")
] | @tsv
' | column -t -N "NS,POD,CONTAINER,CPU_REQ,CPU_LIM,MEM_REQ,MEM_LIM"
Event Timeline Reconstruction
# Recent events by component, color-coded by type
kubectl get events -A --sort-by='.lastTimestamp' -o json | jq -r '
.items[-20:] | reverse | .[] |
[
.lastTimestamp | split("T") | .[1][:8],
.involvedObject.namespace,
.involvedObject.name,
.type,
.reason,
.message[:60]
] | @tsv
' | awk -F'\t' '{
color = ($4 == "Warning") ? "\033[33m" : "\033[32m"
printf "%s\t%s\t%s\t%s%s\033[0m\t%s\t%s\n", $1, $2, $3, color, $4, $5, $6
}' | column -t -s$'\t'
Service Endpoint Mapping
# Services with their backend endpoints and health
kubectl get endpoints -A -o json | jq -r '
[.items[] | select(.subsets != null) |
.subsets[] as $s |
($s.addresses // []) as $addrs |
($s.notReadyAddresses // []) as $notReady |
{
namespace: .metadata.namespace,
service: .metadata.name,
ready: ($addrs | length),
notReady: ($notReady | length),
total: (($addrs | length) + ($notReady | length))
}] |
["NS", "SERVICE", "READY", "NOT_READY", "TOTAL"], (.[] | [.namespace, .service, .ready, .notReady, .total]) | @tsv
' | column -t
jq-fu: Advanced Transformation Patterns
Data Pivoting
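Pivoting turns repeated rows into keyed columns. `group_by` plus `from_entries` covers most cases — a sketch that pivots per-host metric rows into one object per host (data and field names are illustrative):

```shell
echo '[
  {"host":"kvm-01","metric":"cpu","value":42},
  {"host":"kvm-01","metric":"mem","value":81},
  {"host":"bind-01","metric":"cpu","value":7}
]' | jq '
  group_by(.host) |
  map({host: .[0].host} + ([.[] | {key: .metric, value: .value}] | from_entries))
'
```

`group_by` sorts by the grouping key as a side effect, so the result comes out ordered by host.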
Time Series Manipulation
Rolling Window Aggregation
# Calculate 5-minute rolling average (data every minute)
jq '
. as $data |
[range(4; length) | . as $i |
  $data[($i-4):($i+1)] | {
    timestamp: .[-1].timestamp,
    avg_value: ([.[] | .value] | add / length)
  }]
'
Gap Detection
# Find gaps > 5 minutes in time series
jq '
[.[:-1], .[1:]] | transpose |
map(select(
(.[1].timestamp | fromdateiso8601) - (.[0].timestamp | fromdateiso8601) > 300
)) |
map({
from: .[0].timestamp,
to: .[1].timestamp,
gap_seconds: ((.[1].timestamp | fromdateiso8601) - (.[0].timestamp | fromdateiso8601))
})
'
Hierarchical Data Operations
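When the shape of nested data is unknown, `leaf_paths` and `getpath` let you address the tree generically — a sketch that flattens arbitrary nesting into dotted keys:

```shell
# Every leaf as "path = value"
echo '{"a":{"b":1,"c":{"d":2}}}' | \
  jq -r 'leaf_paths as $p | "\($p | join(".")) = \(getpath($p))"'

# Flatten a tree into a one-level object
echo '{"a":{"b":1,"c":{"d":2}}}' | \
  jq '[leaf_paths as $p | {key: ($p | join(".")), value: getpath($p)}] | from_entries'
```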
Set Operations
Find Changes Between Two API Responses
# Compare two snapshots: added, removed, changed
OLD="snapshot_old.json"
NEW="snapshot_new.json"
# Added items
jq -n --slurpfile old "$OLD" --slurpfile new "$NEW" '
($new[0] | map(.id) | sort) - ($old[0] | map(.id) | sort) |
. as $added | $new[0] | map(select(.id | IN($added[])))
'
# Removed items
jq -n --slurpfile old "$OLD" --slurpfile new "$NEW" '
($old[0] | map(.id) | sort) - ($new[0] | map(.id) | sort) |
. as $removed | $old[0] | map(select(.id | IN($removed[])))
'
# Changed items (same ID, different content)
jq -n --slurpfile old "$OLD" --slurpfile new "$NEW" '
($old[0] | INDEX(.id)) as $oldIndex |
$new[0] | map(select(
.id as $id | $oldIndex[$id] != null and $oldIndex[$id] != .
))
'
Conditional Logic Patterns
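jq's `if/elif/else/end` is an expression, not a statement, so it can sit anywhere a value goes — a sketch that classifies items by a numeric field (thresholds are illustrative):

```shell
echo '[{"name":"a","cpu":95},{"name":"b","cpu":40},{"name":"c","cpu":5}]' | jq '
  .[] | . + {load:
    if   .cpu > 90 then "critical"
    elif .cpu > 30 then "normal"
    else "idle"
    end}
'
```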
The Composition Philosophy
jq as a Stage in Pipelines
Never try to do everything in jq. Use it for what it’s good at (JSON transformation) and let other tools handle the rest:
# BAD: Trying to do filesystem operations in jq
jq -r '.files[] | "mkdir -p \(.dir) && touch \(.name)"' | sh
# GOOD: jq extracts, xargs acts
jq -r '.files[] | "\(.dir)/\(.name)"' config.json | xargs -I {} dirname {} | sort -u | xargs mkdir -p
jq -r '.files[] | "\(.dir)/\(.name)"' config.json | xargs touch
The Three-Stage Pattern
Most infrastructure queries follow this pattern:
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Source │ --> │ Transform │ --> │ Present │
│ (API/file) │ │ (jq) │ │ (column/awk)│
└─────────────┘ └─────────────┘ └─────────────┘
# Source: Get raw data
netapi ise --format json get-policy-sets |
# Transform: Extract and shape with jq
jq -r '.[] | [.name, .state, .rank, (.rules | length)] | @tsv' |
# Present: Human-readable format
column -t -N "POLICY,STATE,RANK,RULES" |
# Optionally filter at the end
grep -v disabled
When to Use jq vs Other Tools
| Task | jq | Alternative |
|---|---|---|
| Extract single value | Yes | Overkill for simple access |
| Complex transformations | Yes | awk struggles with nested JSON |
| Tabular output | Partially | Let `column` handle alignment |
| Text search in JSON | No | `grep` is faster for raw search |
| Numeric aggregation | Partially | `awk` better for streaming |
| Line-by-line processing | No | `xargs`/`while` for actions |
Debugging Complex Queries
The Progressive Refinement Method
# Start with the raw structure
cat response.json | jq '.'
# Understand top level
cat response.json | jq 'keys'
# Narrow to relevant section
cat response.json | jq '.data'
# Check type at each level
cat response.json | jq '.data | type'
# If array, sample one item
cat response.json | jq '.data[0]'
# Build your select
cat response.json | jq '.data[] | select(.active == true)'
# Add projection
cat response.json | jq '.data[] | select(.active == true) | {name, ip}'
# Format for output
cat response.json | jq -r '.data[] | select(.active == true) | [.name, .ip] | @tsv' | column -t
Debug Output
# Print intermediate values during processing
jq '.[] | . as $item | debug | select(.active) | $item.name'
# Conditional debug
jq '.[] | if .status == "error" then (. | debug) else . end | .name'
Common Errors and Solutions
| Error | Cause | Fix |
|---|---|---|
| Unexpected `null` results | Accessing missing field | Use optional access: `.field?` |
| `Cannot iterate over null` | Array is null | Default it: `(.items // [])[]` |
| `Cannot index string with "x"` | Treating string as object | Check `type` before indexing |
| Output is escaped JSON | Missing `-r` flag | Add `-r` for raw output |
| Garbled Unicode in output | Terminal encoding | Use `-a` for ASCII-escaped output |
See Also
- fx JSON Explorer - Interactive exploration before writing jq queries
- xargs Mastery - Multiply jq output into parallel operations