SIEM

SIEM fundamentals — log ingestion, correlation rules, and detection engineering patterns.

SIEM Concepts and Operations

Core Architecture

A SIEM ingests logs from diverse sources (firewalls, endpoints, servers, identity systems), normalizes them into a common schema, applies correlation rules to detect attack patterns, and surfaces alerts for triage. The value is not collection — syslog does that. The value is correlation across sources that no single device can see.

Log Source Onboarding

Check what log sources are actively sending — Wazuh manager
# List registered agents with their ID, name, IP, and connection status
/var/ossec/bin/agent_control -l
Verify syslog reception on the SIEM collector — confirm data is arriving
# Watch for incoming syslog on UDP 514 (use "tcp port 514" if sources forward over TCP)
sudo tcpdump -i any -nn -c 20 udp port 514
Test syslog forwarding from a Linux host — send a test event
logger -p auth.warning -t SIEM_TEST "Test event from $(hostname) at $(date +%s)"
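The `auth.warning` selector passed to `logger` is encoded on the wire as a numeric priority at the start of the message, which is what shows up in a packet capture. A minimal sketch of that arithmetic follows; the helper name `syslog_pri` is made up for this example, and the facility and severity codes are the standard syslog values (only a subset of facilities is listed).

```python
# Standard syslog facility codes (subset) and severity codes.
FACILITY = {"kern": 0, "user": 1, "mail": 2, "daemon": 3, "auth": 4, "syslog": 5}
SEVERITY = {"emerg": 0, "alert": 1, "crit": 2, "err": 3,
            "warning": 4, "notice": 5, "info": 6, "debug": 7}

def syslog_pri(selector: str) -> int:
    """Return the PRI value for a 'facility.severity' selector (hypothetical helper)."""
    facility, severity = selector.split(".")
    # PRI = facility * 8 + severity
    return FACILITY[facility] * 8 + SEVERITY[severity]

print(syslog_pri("auth.warning"))  # 4 * 8 + 4 = 36, sent as "<36>"
```

A test event sent with `-p auth.warning` should therefore arrive with a `<36>` prefix.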

Normalization

Raw logs arrive in vendor-specific formats. Normalization maps fields to a common schema so correlation rules work across sources.

Common normalized fields across SIEM platforms
Timestamp     →  @timestamp / deviceTime / EventTime
Source IP      →  src_ip / sourceAddress / SrcAddr
Destination IP →  dst_ip / destinationAddress / DstAddr
Username       →  user / userName / Account.Name
Action         →  action / eventOutcome / Status
Event ID       →  rule.id / eventId / EventID
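Parse a raw syslog line with awk: extract timestamp, host, process tag, and message
# Traditional syslog layout: "Mon DD HH:MM:SS host process[pid]: message"
awk '{ts=$1" "$2" "$3; host=$4; tag=$5; $1=$2=$3=$4=$5=""; sub(/^ +/, ""); print ts" | "host" | "tag" | "$0}' /var/log/syslog | head -5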
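The field mapping listed above can be sketched as a lookup from vendor field names to one common schema. This is an illustration only: the common names (`timestamp`, `src_ip`, and so on) and the `normalize` function are choices made for this example, and the flat dictionary input is an assumption about how parsed events look.

```python
# Vendor field names mapped to one common schema (aliases taken from the table above).
FIELD_ALIASES = {
    "timestamp": ("@timestamp", "deviceTime", "EventTime"),
    "src_ip": ("src_ip", "sourceAddress", "SrcAddr"),
    "dst_ip": ("dst_ip", "destinationAddress", "DstAddr"),
    "user": ("user", "userName", "Account.Name"),
    "action": ("action", "eventOutcome", "Status"),
    "event_id": ("rule.id", "eventId", "EventID"),
}

def normalize(raw: dict) -> dict:
    """Return a record keyed by the common field names; missing fields are omitted."""
    normalized = {}
    for common, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            if alias in raw:
                normalized[common] = raw[alias]
                break  # first matching alias wins
    return normalized

print(normalize({"EventTime": "2024-05-01T10:00:00Z", "SrcAddr": "10.0.0.5", "Status": "Failure"}))
```

Once every source goes through a mapping like this, a rule written against `src_ip` applies to all of them.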

Correlation Rules — Common Detection Patterns

Brute force detection logic — 5+ failed logins from same source in 2 minutes
Rule trigger:
  Event type: authentication_failure
  Group by: src_ip
  Threshold: count >= 5
  Time window: 120 seconds
  Action: Alert (Medium), auto-block if count >= 20
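As a rough sketch, the brute force rule above amounts to a sliding-window count per source IP. The tuple layout of the events and the function name are assumptions for this example, and the input is assumed to be sorted by time; a real SIEM evaluates this inside its rule engine.

```python
from collections import defaultdict, deque

THRESHOLD = 5          # failed logins needed to alert
WINDOW_SECONDS = 120   # sliding window length

def detect_brute_force(events):
    """Yield (src_ip, count, timestamp) each time a source is at or above the threshold.

    `events` is an iterable of (timestamp_seconds, event_type, src_ip),
    assumed to be sorted by timestamp.
    """
    recent = defaultdict(deque)  # src_ip -> timestamps of failures inside the window
    for ts, event_type, src_ip in events:
        if event_type != "authentication_failure":
            continue
        window = recent[src_ip]
        window.append(ts)
        # Drop failures that have fallen out of the window.
        while window and ts - window[0] > WINDOW_SECONDS:
            window.popleft()
        if len(window) >= THRESHOLD:
            yield src_ip, len(window), ts

burst = [(t, "authentication_failure", "203.0.113.7") for t in range(0, 50, 10)]
print(list(detect_brute_force(burst)))
```

The sketch emits a result on every failure past the threshold; the auto-block step would compare the yielded count against 20.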
Lateral movement detection — same credential on 3+ distinct hosts in 10 minutes
Rule trigger:
  Event type: authentication_success
  Group by: user
  Distinct count: dst_ip >= 3
  Time window: 600 seconds
  Exclude: service accounts (svc_*, krbtgt)
  Action: Alert (High)
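The lateral movement rule differs from a plain threshold in that it counts distinct destinations rather than events. A minimal sketch under the same assumptions as before (time-ordered input, hypothetical function name and tuple layout), with the service-account exclusion expressed as glob patterns:

```python
import fnmatch
from collections import defaultdict, deque

MIN_DISTINCT_HOSTS = 3
WINDOW_SECONDS = 600
EXCLUDED_USERS = ("svc_*", "krbtgt")

def detect_lateral_movement(events):
    """Yield (user, hosts, timestamp) when a user reaches 3+ distinct hosts in the window.

    `events` is an iterable of (timestamp_seconds, user, dst_ip) for successful
    logins, assumed to be sorted by timestamp.
    """
    recent = defaultdict(deque)  # user -> (timestamp, dst_ip) pairs inside the window
    for ts, user, dst_ip in events:
        if any(fnmatch.fnmatchcase(user, pattern) for pattern in EXCLUDED_USERS):
            continue
        window = recent[user]
        window.append((ts, dst_ip))
        while window and ts - window[0][0] > WINDOW_SECONDS:
            window.popleft()
        hosts = {ip for _, ip in window}
        if len(hosts) >= MIN_DISTINCT_HOSTS:
            yield user, sorted(hosts), ts
```

Repeated logins to the same host never raise the distinct count, so a user hammering one server does not trigger this rule.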
Privilege escalation — user added to privileged group
Rule trigger:
  Event type: group_membership_change
  Target group: Domain Admins | Enterprise Admins | Administrators
  Action: Alert (Critical), notify SOC immediately
  Windows Event ID: 4728, 4732, 4756
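The privilege escalation rule has no threshold or window; it is a filter on event ID and target group. A small sketch, assuming events have already been normalized into dictionaries with `event_id` and `target_group` keys (both key names are assumptions for this example):

```python
PRIVILEGED_GROUPS = {"Domain Admins", "Enterprise Admins", "Administrators"}
# Member added to a security-enabled group: 4728 global, 4732 local, 4756 universal.
MEMBER_ADDED_EVENT_IDS = {4728, 4732, 4756}

def is_privileged_group_addition(event: dict) -> bool:
    """Return True when the event adds a member to one of the privileged groups."""
    return (event.get("event_id") in MEMBER_ADDED_EVENT_IDS
            and event.get("target_group") in PRIVILEGED_GROUPS)

print(is_privileged_group_addition({"event_id": 4728, "target_group": "Domain Admins"}))
```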
Impossible travel — same user authenticates from two geolocations too far apart
Rule trigger:
  Event type: authentication_success
  Group by: user
  Condition: geo_distance(event[n-1].location, event[n].location) / time_delta > 900 km/h
  Action: Alert (High), require MFA re-verification
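The impossible travel condition can be worked through with the haversine great-circle distance. The sketch below assumes each login has already been geolocated to latitude and longitude; the function names and tuple layout are illustrative, and treating simultaneous logins from different coordinates as impossible is a simplification.

```python
import math

MAX_SPEED_KMH = 900        # threshold from the rule above
EARTH_RADIUS_KM = 6371.0   # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_impossible_travel(prev, curr):
    """prev and curr are (timestamp_seconds, lat, lon) for the same user, prev first."""
    hours = (curr[0] - prev[0]) / 3600
    distance = haversine_km(prev[1], prev[2], curr[1], curr[2])
    if hours <= 0:
        return distance > 0  # same instant, different place
    return distance / hours > MAX_SPEED_KMH

london = (0, 51.5074, -0.1278)
new_york_1h = (3600, 40.7128, -74.0060)
print(is_impossible_travel(london, new_york_1h))
```

London to New York is roughly 5,570 km, so logins one hour apart imply about 5,570 km/h and trip the rule, while the same pair ten hours apart does not. IP geolocation is imprecise and VPN exits move users abruptly, so results from a check like this need a baseline before they are trusted.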

Alert Triage Workflow

1. ACKNOWLEDGE — Claim the alert, prevent duplicate investigation
2. CONTEXTUALIZE — Who/what/when/where. Pull 30 min of logs around the event
3. VALIDATE — Is this a true positive? Check against known baselines
4. SCOPE — Search for related events (same src_ip, same user, same technique)
5. ESCALATE or CLOSE — Document findings either way
Pull context around a suspicious event — 15 minutes before and after
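# Wazuh: alerts are stored in the indexer (wazuh-alerts-* indices), not served by the manager API
# Assumptions: indexer on localhost:9200; INDEXER_USER, INDEXER_PASS, EVENT_TIME (ISO 8601) are placeholders
curl -s -k -u "$INDEXER_USER:$INDEXER_PASS" -H 'Content-Type: application/json' \
  "https://localhost:9200/wazuh-alerts-*/_search?size=50&sort=timestamp:desc" \
  -d '{"query":{"bool":{"filter":[{"term":{"agent.id":"003"}},{"range":{"timestamp":{"gte":"'"$EVENT_TIME"'||-15m","lte":"'"$EVENT_TIME"'||+15m"}}}]}}}' \
  | jq '.hits.hits[]._source | {timestamp, rule_id: .rule.id, rule_desc: .rule.description, src_ip: .data.srcip}'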
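The CONTEXTUALIZE and SCOPE steps can be sketched as a filter over already-retrieved events: keep what falls within 15 minutes either side of the suspicious event and shares its source IP or user. The dictionary keys and the function name are assumptions for this example.

```python
from datetime import datetime, timedelta, timezone

def related_events(events, pivot, minutes=15):
    """Return events within +/- `minutes` of the pivot that share its src_ip or user."""
    delta = timedelta(minutes=minutes)
    start, end = pivot["timestamp"] - delta, pivot["timestamp"] + delta

    def shares(event, field):
        # A missing value on the pivot never counts as a match.
        return pivot.get(field) is not None and event.get(field) == pivot.get(field)

    return [
        e for e in events
        if start <= e["timestamp"] <= end
        and (shares(e, "src_ip") or shares(e, "user"))
    ]

t0 = datetime(2024, 5, 1, 10, 0, tzinfo=timezone.utc)
pivot = {"timestamp": t0, "src_ip": "203.0.113.7", "user": "alice"}
events = [
    {"timestamp": t0 - timedelta(minutes=10), "src_ip": "203.0.113.7", "user": "bob"},
    {"timestamp": t0 + timedelta(minutes=20), "src_ip": "203.0.113.7", "user": "alice"},
    {"timestamp": t0 + timedelta(minutes=5), "src_ip": "198.51.100.2", "user": "carol"},
]
print(len(related_events(events, pivot)))
```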

Retention Policies

Hot storage   (0-30 days)   — Full-text search, fast queries, SSD/NVMe
Warm storage  (30-90 days)  — Indexed but slower, HDD or compressed
Cold storage  (90-365 days) — Compressed archives, query requires rehydration
Frozen/Archive (1-7 years)  — Compliance retention, object storage (S3/MinIO)
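The tiering above can be expressed as a lookup on event age. This is a sketch only: the boundaries in the list overlap at days 30, 90, and 365, so the code assumes an event moves to the next tier on the boundary day, and it treats anything older than 7 years as expired.

```python
# (exclusive upper bound in days, tier name), following the tiers listed above.
TIERS = [(30, "hot"), (90, "warm"), (365, "cold"), (7 * 365, "frozen")]

def storage_tier(age_days: int) -> str:
    """Return the storage tier for an event of the given age in days."""
    for max_age, tier in TIERS:
        if age_days < max_age:
            return tier
    return "expired"  # past the longest retention period

for age in (5, 45, 200, 1000):
    print(age, storage_tier(age))
```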

Regulatory drivers:
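  PCI-DSS:  12 months minimum, most recent 3 months immediately available
  HIPAA:    6 years (documentation retention requirement, commonly applied to audit logs)
  SOX:      7 years (audit record retention, commonly applied to supporting logs)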
  Internal: Define per data classification

Platform Comparison

Capability       | Wazuh                                    | QRadar                                        | Microsoft Sentinel
-----------------+------------------------------------------+-----------------------------------------------+----------------------------------------
Deployment       | Self-hosted (manager + agents)           | Appliance or VM (all-in-one or distributed)   | Cloud-native (Azure)
Cost Model       | Open source (infra cost only)            | Licensed per EPS                              | Pay per GB ingested
Query Language   | Wazuh API + Elasticsearch DSL            | AQL (Ariel Query Language)                    | KQL (Kusto Query Language)
Correlation      | Rule XML with frequency/timeframe        | Custom Rule Engine (CRE) with building blocks | Analytics rules + Fusion ML
SOAR Integration | Active response scripts                  | QRadar SOAR (Resilient)                       | Logic Apps / Playbooks
Strengths        | FIM, vulnerability detection, compliance | Mature correlation, offense management        | Cloud-native, Microsoft ecosystem, ML
Weaknesses       | No native case management                | High cost, complex tuning                     | Azure lock-in, ingestion cost at scale

Essential Health Checks

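Verify the expected number of agents is reporting: detect silent failures
# Compare the active agent count against yesterday's baseline
# A >30% drop suggests log sources went silent
# The Wazuh API expects a JWT: authenticate once, then send the token as a Bearer header
TOKEN=$(curl -s -k -u "wazuh-wui:$WAZUH_API_PASS" -X POST \
  "https://localhost:55000/security/user/authenticate?raw=true")
active_agents=$(curl -s -k -H "Authorization: Bearer $TOKEN" \
  "https://localhost:55000/agents?status=active&limit=1" | jq '.data.total_affected_items')
echo "Active agents: $active_agents"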
Check for log source gaps — find agents that stopped reporting
/var/ossec/bin/agent_control -l | awk '$0 ~ /Disconnected/ {print}'
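The 30% drop rule of thumb used in these health checks can be applied per log source once daily counts are available. A minimal sketch, assuming the counts have already been collected into dictionaries keyed by source name (how they are collected is platform-specific and not shown; the function name is hypothetical):

```python
def silent_sources(today: dict, baseline: dict, threshold: float = 0.30) -> list:
    """Return sources whose count fell more than `threshold` below the baseline."""
    flagged = []
    for source, expected in baseline.items():
        if expected <= 0:
            continue  # no meaningful baseline to compare against
        seen = today.get(source, 0)  # a source missing today counts as zero
        if (expected - seen) / expected > threshold:
            flagged.append(source)
    return sorted(flagged)

baseline = {"fw01": 1000, "dc01": 500, "web01": 200}
today = {"fw01": 950, "dc01": 300}
print(silent_sources(today, baseline))
```

Here `dc01` dropped by 40% and `web01` sent nothing at all, so both are flagged, while the 5% dip on `fw01` is ignored.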