SQL Query Patterns

Query construction, execution order, filtering, set operations, and debugging strategies.

Query Construction Order

SQL has a logical execution order that differs from the written order. Understanding this prevents common mistakes.

FROM      → which tables
JOIN      → combine tables
WHERE     → filter rows (before grouping)
GROUP BY  → group rows
HAVING    → filter groups (after grouping)
SELECT    → choose columns, compute expressions
DISTINCT  → deduplicate
ORDER BY  → sort results
LIMIT     → restrict output count

You cannot use a column alias defined in SELECT inside WHERE, because WHERE executes before SELECT. You can use it in ORDER BY.

Filtering Patterns

WHERE with multiple conditions
SELECT * FROM devices
WHERE device_type = 'switch'
  AND vlan_id IN (10, 20, 30)
  AND hostname LIKE 'sw-core%'
  AND created_at >= '2026-01-01';
NULL handling — = does not work with NULL
-- Wrong: this matches nothing
SELECT * FROM devices WHERE vlan_id = NULL;

-- Correct
SELECT * FROM devices WHERE vlan_id IS NULL;
SELECT * FROM devices WHERE vlan_id IS NOT NULL;
BETWEEN — inclusive range
SELECT * FROM events
WHERE start_time BETWEEN '2026-04-01' AND '2026-04-30';

BETWEEN is inclusive on both ends. Equivalent to >= AND ⇐.

EXISTS — check for related rows (often faster than IN for large subqueries)
SELECT c.name FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o
    WHERE o.customer_id = c.id AND o.total > 1000
);

Pattern Matching

LIKE — simple patterns (% = any chars, _ = single char)
SELECT * FROM devices WHERE hostname LIKE 'sw-core-%';
SELECT * FROM users WHERE email LIKE '%@example.com';
ILIKE — case-insensitive (PostgreSQL only)
SELECT * FROM users WHERE name ILIKE '%smith%';
SIMILAR TO / regex (PostgreSQL)
SELECT * FROM logs WHERE message ~ 'error|fail|timeout';
SELECT * FROM logs WHERE message ~* 'error|fail';  -- case-insensitive

Set Operations

UNION — combine results, deduplicate
SELECT hostname, 'cmdb' AS source FROM cmdb_devices
UNION
SELECT hostname, 'scan' AS source FROM scan_results;
UNION ALL — combine without deduplication (faster)
SELECT hostname FROM cmdb_devices
UNION ALL
SELECT hostname FROM scan_results;
INTERSECT — rows in both queries
SELECT hostname FROM cmdb_devices
INTERSECT
SELECT hostname FROM scan_results;
EXCEPT — rows in first but not second
-- Devices in CMDB but not found in scan
SELECT hostname FROM cmdb_devices
EXCEPT
SELECT hostname FROM scan_results;

Conditional Aggregation

Count by category with CASE
SELECT
    COUNT(*) AS total,
    COUNT(CASE WHEN status = 'active' THEN 1 END) AS active,
    COUNT(CASE WHEN status = 'inactive' THEN 1 END) AS inactive,
    COUNT(CASE WHEN status = 'decommissioned' THEN 1 END) AS decommissioned
FROM devices;
FILTER clause (PostgreSQL only, cleaner than CASE)
SELECT
    COUNT(*) AS total,
    COUNT(*) FILTER (WHERE status = 'active') AS active,
    COUNT(*) FILTER (WHERE status = 'inactive') AS inactive
FROM devices;

Pivot Patterns

SQL does not have a native PIVOT keyword (except SQL Server). Use conditional aggregation.

Pivot: rows to columns
-- Monthly order counts per status
SELECT
    DATE_TRUNC('month', created_at) AS month,
    COUNT(*) FILTER (WHERE status = 'completed') AS completed,
    COUNT(*) FILTER (WHERE status = 'pending') AS pending,
    COUNT(*) FILTER (WHERE status = 'cancelled') AS cancelled
FROM orders
GROUP BY 1
ORDER BY 1;

Deduplication Strategies

DISTINCT ON (PostgreSQL) — keep one row per group
-- Latest order per customer
SELECT DISTINCT ON (customer_id) *
FROM orders
ORDER BY customer_id, created_at DESC;
ROW_NUMBER window function (portable)
WITH ranked AS (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at DESC) AS rn
    FROM orders
)
SELECT * FROM ranked WHERE rn = 1;

Debugging Queries

Step-by-step debugging workflow
-- 1. Start with the raw data
SELECT * FROM orders LIMIT 10;

-- 2. Add the WHERE clause
SELECT * FROM orders WHERE status = 'pending' LIMIT 10;

-- 3. Add the JOIN
SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending'
LIMIT 10;

-- 4. Add aggregation
SELECT c.name, COUNT(*) AS pending_count
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending'
GROUP BY c.name
ORDER BY pending_count DESC;

Build complex queries incrementally. Add one clause at a time and verify the intermediate result before adding the next.