Associative Arrays
Basic frequency count — key-value accumulator
cat <<'EOF' > /tmp/demo.txt
sshd
nginx
sshd
vault
nginx
sshd
vault
nginx
nginx
haproxy
EOF
awk '{count[$1]++} END{for(k in count) print k, count[k]}' /tmp/demo.txt
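The traversal order of for(k in count) is unspecified in awk, so the output of the one-liner above can differ between implementations. A minimal sketch for stable output, piping through sort; the file name /tmp/demo_sorted.txt is a hypothetical placeholder:

```shell
# Same service names as the frequency-count input, written to a separate (hypothetical) file.
printf '%s\n' sshd nginx sshd vault nginx sshd vault nginx nginx haproxy > /tmp/demo_sorted.txt

# Print "count name", then sort by count (numeric, descending) and by name as a tiebreaker.
awk '{count[$1]++} END{for(k in count) print count[k], k}' /tmp/demo_sorted.txt | sort -k1,1nr -k2,2
# 4 nginx
# 3 sshd
# 2 vault
# 1 haproxy
```

Sorting outside awk keeps the program portable; gawk also offers in-process ordering, but that is an extension.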
Detect duplicates — flag on second occurrence
cat <<'EOF' > /tmp/demo.txt
10.50.1.20
10.50.1.60
10.50.1.90
10.50.1.20
10.50.1.1
10.50.1.60
EOF
awk '{if($1 in count) print "Duplicate:", $1; count[$1]++}' /tmp/demo.txt
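The one-liner above prints a line for every repeat, so a value seen three times is reported twice. A sketch that reports each duplicated value exactly once, at its second occurrence; the array name seen and the file /tmp/demo_dupes.txt are illustrative choices, not part of the original:

```shell
# Hypothetical sample: one address appears three times, another twice.
printf '%s\n' 10.50.1.20 10.50.1.60 10.50.1.20 10.50.1.20 10.50.1.60 > /tmp/demo_dupes.txt

# seen[$1]++ yields the count before the increment, so the pattern is true
# only when the value has been seen exactly once before.
awk 'seen[$1]++ == 1 {print "Duplicate:", $1}' /tmp/demo_dupes.txt
# Duplicate: 10.50.1.20
# Duplicate: 10.50.1.60
```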
Delete array element
cat <<'EOF' > /tmp/demo.txt
ESTAB
TIME-WAIT
ESTAB
CLOSE-WAIT
TIME-WAIT
ESTAB
TIME-WAIT
CLOSE-WAIT
EOF
awk '{count[$1]++} END{delete count["TIME-WAIT"]; for(k in count) print k, count[k]}' /tmp/demo.txt
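A short sketch confirming that a deleted key is really gone, and showing one portable way to empty a whole array. It assumes split("", arr) as the clearing idiom; the counts are made-up values:

```shell
awk 'BEGIN{
  count["ESTAB"]=3; count["TIME-WAIT"]=3
  delete count["TIME-WAIT"]                  # remove a single element
  if (!("TIME-WAIT" in count)) print "TIME-WAIT removed"
  split("", count)                           # splitting an empty string clears the array
  n = 0; for (k in count) n++
  print "remaining keys:", n
}'
# TIME-WAIT removed
# remaining keys: 0
```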
Check if key exists before accessing
printf '%s\n' vault-01 bind-01 unknown-host | awk 'BEGIN{
arr["vault-01"]="10.50.1.60"
arr["ise-01"]="10.50.1.20"
arr["bind-01"]="10.50.1.90"
} {if($1 in arr) print $1, "->", arr[$1]; else print $1, "-> not found"}'
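Why the in test matters: merely referencing arr[key] creates that key with an empty value, which silently grows the array and makes later membership tests true. A minimal sketch of the side effect; the keys ghost and phantom are hypothetical:

```shell
awk 'BEGIN{
  arr["vault-01"]="10.50.1.60"
  # A plain reference creates the element as a side effect.
  if (arr["ghost"] == "") print "ghost looks empty"
  if ("ghost" in arr) print "ghost now exists"
  # The in operator checks membership without creating anything.
  if (!("phantom" in arr)) print "phantom absent"
  n = 0; for (k in arr) n++
  print "keys:", n
}'
# ghost looks empty
# ghost now exists
# phantom absent
# keys: 2
```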
Group values by key — concatenate second field per first field
cat <<'EOF' > /tmp/demo.txt
web nginx
web haproxy
db postgres
db redis
web varnish
auth vault
auth ise
db mysql
EOF
awk '{if($1 in a) a[$1] = a[$1] ", " $2; else a[$1] = $2} END{for(k in a) print k ": " a[k]}' /tmp/demo.txt
Build lookup table in BEGIN — static host-to-IP mapping
cat <<'EOF' > /tmp/hostlist.txt
vault-01
ise-01
bind-01
pfsense-gw
unknown-box
EOF
awk 'BEGIN{
hosts["vault-01"]="10.50.1.60"
hosts["ise-01"]="10.50.1.20"
hosts["bind-01"]="10.50.1.90"
hosts["pfsense-gw"]="10.50.1.1"
} {if($1 in hosts) print $1, "->", hosts[$1]; else print $1, "-> NO ENTRY"}' /tmp/hostlist.txt
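When the mapping lives in a file rather than in BEGIN, the common two-file idiom loads it first with NR==FNR. A sketch under assumptions: the file names /tmp/hosts_map.txt and /tmp/hostlist_check.txt are hypothetical, and the map file must not be empty, because NR==FNR would then also match the second file:

```shell
# Hypothetical map file: host name, then IP.
cat <<'EOF' > /tmp/hosts_map.txt
vault-01 10.50.1.60
ise-01 10.50.1.20
EOF
printf '%s\n' vault-01 unknown-box ise-01 > /tmp/hostlist_check.txt

# NR==FNR holds only while reading the first file: fill the table, then skip to the next record.
awk 'NR==FNR {hosts[$1]=$2; next}
     {if($1 in hosts) print $1, "->", hosts[$1]; else print $1, "-> NO ENTRY"}' /tmp/hosts_map.txt /tmp/hostlist_check.txt
# vault-01 -> 10.50.1.60
# unknown-box -> NO ENTRY
# ise-01 -> 10.50.1.20
```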
Track unique values per key — which hosts each user appears on
cat <<'EOF' > /tmp/auth_data.txt
evan 2026-04-11 vault-01
evan 2026-04-11 ise-01
admin 2026-04-11 bind-01
evan 2026-04-11 vault-01
admin 2026-04-11 vault-01
root 2026-04-11 pfsense-gw
admin 2026-04-11 bind-01
EOF
gawk '{hosts[$1][$3]=1} END{for(u in hosts){printf "%s: ", u; for(h in hosts[u]) printf "%s ", h; print ""}}' /tmp/auth_data.txt
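The hosts[$1][$3] syntax above is an array of arrays, a gawk extension that mawk and other awks reject. A portable sketch of the same idea using a comma subscript (joined internally with SUBSEP) to remember user/host pairs; the names seen and list and the file /tmp/auth_portable.txt are hypothetical:

```shell
cat <<'EOF' > /tmp/auth_portable.txt
evan 2026-04-11 vault-01
evan 2026-04-11 ise-01
admin 2026-04-11 bind-01
evan 2026-04-11 vault-01
EOF

# Record each (user, host) pair once; append the host to that user's list on first sight.
awk '!(($1,$3) in seen) {seen[$1,$3]=1; list[$1] = list[$1] " " $3}
     END{for(u in list) print u ":" list[u]}' /tmp/auth_portable.txt | sort
# admin: bind-01
# evan: vault-01 ise-01
```

Hosts appear in first-seen order within each user; the trailing sort only stabilizes the order of users.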