Probability

Probability Basics

Definitions
Sample space S: set of all possible outcomes
Event A: subset of the sample space
Probability: P(A) = |A| / |S|    (equally likely outcomes)

Axioms:
  0 <= P(A) <= 1
  P(S) = 1
  P(∅) = 0
  P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

Complement:
  P(A') = 1 - P(A)
  "Probability of NOT A"
  Often easier: P(at least one) = 1 - P(none)
Examples
Fair die: P(even) = 3/6 = 1/2
Two dice: P(sum = 7) = 6/36 = 1/6
Card draw: P(ace) = 4/52 = 1/13

Birthday problem:
  P(at least 2 share birthday in n people)
  = 1 - P(all different)
  = 1 - (365/365)(364/365)(363/365)...(365-n+1)/365
  n=23: P ≈ 50.7%
  n=50: P ≈ 97%

Hash collision analogy: birthday attack

Conditional Probability

Formulas
P(A|B) = P(A ∩ B) / P(B)    "probability of A given B"

Multiplication rule:
  P(A ∩ B) = P(A|B) × P(B) = P(B|A) × P(A)

Independence:
  A and B independent iff P(A ∩ B) = P(A) × P(B)
  iff P(A|B) = P(A)
  iff P(B|A) = P(B)
Bayes' theorem
P(A|B) = P(B|A) × P(A) / P(B)

Expanded form (total probability):
P(A|B) = P(B|A) × P(A) / [P(B|A)×P(A) + P(B|A')×P(A')]

Example: malware detection
  P(malware) = 0.01         (1% of files are malware)
  P(detect|malware) = 0.95  (95% detection rate)
  P(detect|clean) = 0.03    (3% false positive rate)

  P(malware|detect) = (0.95 × 0.01) / (0.95×0.01 + 0.03×0.99)
                    = 0.0095 / 0.0392
                    ≈ 0.242  (24.2%)

  Even with 95% detection, only 24% of alerts are real!
  This is why false positive rate matters enormously.

Common Distributions

Discrete distributions
Bernoulli(p):     Single trial, success (1) with probability p
  P(X=1) = p, P(X=0) = 1-p
  E[X] = p, Var(X) = p(1-p)

Binomial(n, p):   Number of successes in n independent trials
  P(X=k) = C(n,k) × p^k × (1-p)^(n-k)
  E[X] = np, Var(X) = np(1-p)

  Example: 100 packets, 2% loss rate
  P(exactly 5 lost) = C(100,5) × 0.02^5 × 0.98^95 ≈ 0.0353

Geometric(p):     Trials until first success
  P(X=k) = (1-p)^(k-1) × p
  E[X] = 1/p

  Example: average rolls until 6 = 1/(1/6) = 6

Poisson(λ):       Rare events in fixed interval
  P(X=k) = e^(-λ) × λ^k / k!
  E[X] = λ, Var(X) = λ

  Example: 3 incidents/month average
  P(0 incidents) = e^(-3) ≈ 0.050
  P(5+ incidents) = 1 - sum_{k=0}^{4} P(X=k) ≈ 0.185
Normal distribution
N(μ, σ^2):  Bell curve centered at mean μ, spread σ

68-95-99.7 rule:
  68% of data within 1σ of mean
  95% of data within 2σ of mean
  99.7% of data within 3σ of mean

Z-score: z = (x - μ) / σ    (how many SDs from mean)

Central Limit Theorem:
  Average of many independent random variables → approximately normal
  Regardless of original distribution
  This is why the normal distribution appears everywhere

Expected Value and Variance

Expected value
E[X] = sum of (value × probability) for all outcomes

E[X] = sum_{i} x_i × P(X = x_i)

Properties:
  E[aX + b] = a×E[X] + b         (linearity)
  E[X + Y] = E[X] + E[Y]         (always, even if dependent)
  E[XY] = E[X]×E[Y]              (only if independent)
Variance and standard deviation
Var(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2
SD(X) = sqrt(Var(X))

Properties:
  Var(aX + b) = a^2 × Var(X)
  Var(X + Y) = Var(X) + Var(Y)    (if independent)

Example: die roll
  E[X] = (1+2+3+4+5+6)/6 = 3.5
  E[X^2] = (1+4+9+16+25+36)/6 = 91/6
  Var(X) = 91/6 - 12.25 = 2.917
  SD(X) ≈ 1.708

Probability in Security

Brute force and key space
Key space K, brute force probability after t tries:
  P(crack in t tries) = t / |K|     (without replacement)
  P(crack in t tries) ≈ 1 - e^(-t/|K|)  (with replacement, large K)

Expected tries to crack: |K| / 2

AES-128: |K| = 2^128
  At 10^12 guesses/sec: 2^128 / 10^12 / (3.15×10^7 sec/yr)
  ≈ 10^19 years (safe)

4-digit PIN: |K| = 10^4 = 10,000
  Expected: 5,000 tries (seconds for a computer)
Availability calculations
Single system, uptime p:
  Availability = p

Two systems in series (both must work):
  A = p1 × p2

Two systems in parallel (either works):
  A = 1 - (1-p1)(1-p2)

Example: 99.9% uptime each
  Series: 0.999 × 0.999 = 0.998 (99.8%)
  Parallel: 1 - 0.001^2 = 0.999999 (99.9999%)

Nines:
  99%     = 3.65 days downtime/year
  99.9%   = 8.77 hours/year
  99.99%  = 52.6 minutes/year
  99.999% = 5.26 minutes/year