© 2026 LIBREUNI PROJECT

Distributions and Density Functions

Probability theory provides the rigorous framework for modeling uncertainty in physical, social, and information systems. At the heart of this framework lie random variables and the functions that describe their probabilistic behavior. This lesson details the characterization of random variables, the catalog of fundamental distributions, and the calculus of transformations.

1. Probability Density Functions (PDF) and Cumulative Distribution Functions (CDF)

A Random Variable $X$ is a measurable function mapping the sample space $\Omega$ to the real line. The behavior of $X$ is fully characterized by its Cumulative Distribution Function (CDF):

$$F_X(x) = P(X \le x)$$

The CDF is non-decreasing, right-continuous, and satisfies $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$. For a Continuous Random Variable, a Probability Density Function (PDF) $f_X$ exists such that:

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$$

The fundamental properties of a PDF include:

  1. $f_X(x) \ge 0$ for all $x$ (Non-negativity)
  2. $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$ (Normalization)

The probability that $X$ falls within an interval $[a, b]$ is given by:

$$P(a \le X \le b) = \int_a^b f_X(x)\,dx = F_X(b) - F_X(a)$$
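As a quick numerical sanity check of the CDF–PDF relation (a sketch assuming a standard normal $X$ and the interval $[-1, 2]$), the CDF difference and the integral of the PDF agree:

```python
import numpy as np
from scipy import stats, integrate

# Assumed example: standard normal X, interval [a, b] = [-1, 2]
a, b = -1.0, 2.0

# P(a <= X <= b) via the CDF difference F(b) - F(a)
p_cdf = stats.norm.cdf(b) - stats.norm.cdf(a)

# The same probability via numerical integration of the PDF
p_int, _ = integrate.quad(stats.norm.pdf, a, b)

print(f"F(b) - F(a)          = {p_cdf:.6f}")
print(f"integral of f(x) dx  = {p_int:.6f}")
```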

2. Catalog of Distributions

2.1 Discrete Distributions

Discrete distributions model variables that take values in a countable set.

Binomial Distribution ($X \sim \mathrm{Bin}(n, p)$): Models the number of successes in $n$ independent Bernoulli trials with success probability $p$: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$. Expected Value: $E[X] = np$; Variance: $\mathrm{Var}(X) = np(1-p)$.

Poisson Distribution ($X \sim \mathrm{Pois}(\lambda)$): Models the number of arrivals in a fixed interval given a constant average rate $\lambda$: $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$. The Poisson distribution arises as the limit of the Binomial distribution as $n \to \infty$ and $p \to 0$ with $np = \lambda$ held fixed.
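The Binomial-to-Poisson limit can be observed numerically; a minimal sketch with an assumed rate $\lambda = 4$, comparing the two PMFs as $n$ grows:

```python
import numpy as np
from scipy import stats

lam = 4.0              # assumed rate
k = np.arange(0, 15)   # support to compare over

# Binomial(n, lam/n) pmf approaches Poisson(lam) pmf as n grows
errors = []
for n in (20, 200, 2000):
    p = lam / n
    err = np.max(np.abs(stats.binom.pmf(k, n, p) - stats.poisson.pmf(k, lam)))
    errors.append(err)
    print(f"n = {n:5d}: max |Binomial - Poisson| = {err:.6f}")
```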

2.2 Continuous Distributions

Gaussian (Normal) Distribution ($X \sim \mathcal{N}(\mu, \sigma^2)$): The most significant distribution in statistics due to the Central Limit Theorem (CLT), which asserts that the standardized sum of independent and identically distributed (i.i.d.) random variables with finite variance converges in distribution to $\mathcal{N}(0, 1)$. PDF: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$.
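A minimal simulation sketch of the CLT, assuming sums of Uniform(0, 1) variables with $n = 30$ terms per sum (both assumed parameters): the standardized sums behave like standard normal draws.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sums of n i.i.d. Uniform(0,1) variables have mean n/2 and variance n/12
n, trials = 30, 200_000
sums = rng.uniform(0, 1, size=(trials, n)).sum(axis=1)

# Standardize: subtract the mean, divide by the standard deviation
z = (sums - n / 2) / np.sqrt(n / 12)

# The standardized sums should be approximately N(0, 1)
print(f"mean ~ {z.mean():.3f}, std ~ {z.std():.3f}")
print(f"P(Z <= 1.96) ~ {(z <= 1.96).mean():.3f}  (normal value: 0.975)")
```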

Exponential Distribution ($X \sim \mathrm{Exp}(\lambda)$): Models the time between events in a Poisson process. It is the only continuous distribution with the memoryless property $P(X > s + t \mid X > s) = P(X > t)$. PDF: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$.

Gamma Distribution: Generalizes the exponential distribution. $X \sim \mathrm{Gamma}(k, \lambda)$ with integer shape $k$ models the time until the $k$-th event of a Poisson process with rate $\lambda$.
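This waiting-time interpretation can be checked by simulation; a sketch assuming rate $\lambda = 0.5$ and $k = 3$ events, comparing empirical moments against the Gamma values $E[X] = k/\lambda$ and $\mathrm{Var}(X) = k/\lambda^2$:

```python
import numpy as np

rng = np.random.default_rng(1)
lam, k = 0.5, 3  # assumed rate and number of events

# Sum of k i.i.d. Exp(lam) waiting times = time until the k-th Poisson event
waits = rng.exponential(scale=1 / lam, size=(100_000, k)).sum(axis=1)

# Should match Gamma(shape=k, rate=lam): mean k/lam, variance k/lam^2
print(f"empirical mean {waits.mean():.3f} vs k/lam   = {k / lam:.3f}")
print(f"empirical var  {waits.var():.3f} vs k/lam^2 = {k / lam**2:.3f}")
```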

Beta Distribution: Defined on the interval $[0, 1]$, making it ideal for modeling probabilities or proportions.

3. Multivariate Distributions

In many applications, we track multiple random variables simultaneously. The Joint PDF $f_{X,Y}(x, y)$ describes their simultaneous behavior.

3.1 Marginal and Conditional Densities

The Marginal Density of $X$ is obtained by integrating out $Y$:

$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$$

The Conditional Density of $Y$ given $X = x$ is:

$$f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}, \qquad f_X(x) > 0$$
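For intuition, the same marginalization and conditioning operations apply to a discrete joint distribution, with sums replacing integrals. A sketch using an assumed toy joint table:

```python
import numpy as np

# Assumed toy joint distribution P(X=i, Y=j), rows indexed by X, columns by Y
joint = np.array([[0.10, 0.20],
                  [0.30, 0.40]])

# Marginals: sum out the other variable (the discrete analogue of integration)
p_x = joint.sum(axis=1)   # P(X=i)
p_y = joint.sum(axis=0)   # P(Y=j)

# Conditional P(Y=j | X=i) = joint / marginal of X
cond_y_given_x = joint / p_x[:, None]

print("P(X):", p_x)        # [0.3, 0.7]
print("P(Y | X=0):", cond_y_given_x[0])
```

Each row of the conditional table sums to 1, as a conditional distribution must.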

3.2 Independence

Random variables $X$ and $Y$ are independent if and only if their joint density factors into the product of their marginals: $f_{X,Y}(x, y) = f_X(x) f_Y(y)$. This implies that $f_{Y|X}(y \mid x) = f_Y(y)$; knowing $X$ provides no information about $Y$.

4. Covariance and Correlation

Covariance measures the joint variability of two variables:

$$\mathrm{Cov}(X, Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big] = E[XY] - E[X]E[Y]$$

For a random vector $\mathbf{X} = (X_1, \dots, X_n)^T$, we define the Covariance Matrix $\Sigma$, where $\Sigma_{ij} = \mathrm{Cov}(X_i, X_j)$. $\Sigma$ is always symmetric and positive semi-definite. If $X_1, \dots, X_n$ are independent, $\Sigma$ is diagonal.
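These properties can be observed with `np.cov`; a sketch using three assumed independent variables with different variances, whose sample covariance matrix comes out approximately diagonal:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Three independent variables: N(0,1), N(0,4), and Uniform(0,1) (var 1/12)
data = np.column_stack([
    rng.normal(0, 1, n),
    rng.normal(0, 2, n),
    rng.uniform(0, 1, n),
])

# rowvar=False: each column of `data` is one variable
sigma = np.cov(data, rowvar=False)

print(np.round(sigma, 3))  # approximately diag(1, 4, 1/12)
```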

5. Transformation of Random Variables

When we define a new variable $Y = g(X)$, we must determine its density $f_Y(y)$.

5.1 Univariate Transformation

If $g$ is strictly monotonic and differentiable:

$$f_Y(y) = f_X\big(g^{-1}(y)\big) \left| \frac{d}{dy} g^{-1}(y) \right|$$
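As a concrete check of this formula, take the assumed transformation $Y = e^X$ with $X \sim \mathcal{N}(0, 1)$: then $g^{-1}(y) = \ln y$, $\left|\frac{d}{dy}\ln y\right| = 1/y$, and the formula reproduces the log-normal density available in scipy.stats:

```python
import numpy as np
from scipy import stats

# Y = exp(X) with X ~ N(0,1); inverse g^{-1}(y) = log(y), derivative 1/y,
# so f_Y(y) = f_X(log y) * (1/y), which is the log-normal density
y = np.linspace(0.1, 5, 50)
f_formula = stats.norm.pdf(np.log(y)) / y
f_lognorm = stats.lognorm.pdf(y, s=1)

print("max difference:", np.max(np.abs(f_formula - f_lognorm)))
```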

5.2 Multivariate Transformation and the Jacobian

For a vector transformation $\mathbf{Y} = g(\mathbf{X})$, let $\mathbf{X} = h(\mathbf{Y})$ be the inverse mapping. The joint density follows:

$$f_{\mathbf{Y}}(\mathbf{y}) = f_{\mathbf{X}}\big(h(\mathbf{y})\big)\,\big|\det J\big|$$

where $J$ is the Jacobian Matrix of the inverse transformation, with entries $J_{ij} = \dfrac{\partial h_i}{\partial y_j}$.
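For a linear transformation $\mathbf{Y} = A\mathbf{X}$ the inverse is $h(\mathbf{y}) = A^{-1}\mathbf{y}$, so the Jacobian is the constant matrix $A^{-1}$ and $|\det J| = 1/|\det A|$. A sketch verifying this numerically with an assumed invertible matrix:

```python
import numpy as np

# Assumed example matrix for the linear transform Y = A @ X
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Jacobian of the inverse mapping h(y) = A^{-1} y is A^{-1} itself
J = np.linalg.inv(A)

print("|det J|   =", abs(np.linalg.det(J)))      # 1/|det A| = 1/6
print("1/|det A| =", 1 / abs(np.linalg.det(A)))
```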

6. Sampling Distributions

Statistical inference relies on the distributions of sample statistics.

  1. Chi-squared Distribution ($\chi^2_k$): If $Z_1, \dots, Z_k$ are independent standard normal variables, then $\sum_{i=1}^{k} Z_i^2 \sim \chi^2_k$.
  2. Student’s t-Distribution ($t_{n-1}$): Arises when estimating the mean of a normally distributed population with unknown variance: $T = \frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t_{n-1}$, where $\bar{X}$ is the sample mean and $S$ is the sample standard deviation.
  3. F-Distribution ($F_{d_1, d_2}$): The ratio of two independent chi-squared variables, each divided by its degrees of freedom; fundamental for comparing variances and for ANOVA.
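The chi-squared-ratio construction of the F-distribution can be checked by simulation; a sketch with assumed degrees of freedom $d_1 = 5$, $d_2 = 10$, comparing the empirical mean with the theoretical F mean $d_2 / (d_2 - 2)$:

```python
import numpy as np

rng = np.random.default_rng(3)
d1, d2 = 5, 10  # assumed degrees of freedom

# Ratio of independent chi-squared variables, each scaled by its df
u = rng.chisquare(d1, size=200_000)
v = rng.chisquare(d2, size=200_000)
f_samples = (u / d1) / (v / d2)

# The F(d1, d2) distribution has mean d2 / (d2 - 2) for d2 > 2
print(f"empirical mean {f_samples.mean():.3f} vs theory {d2 / (d2 - 2):.3f}")
```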

7. Numerical Analysis with SciPy

The following Python snippet verifies the transformation $Y = X^2$ using the scipy.stats library. If $X$ is a standard normal variable, its square follows a Chi-squared distribution with 1 degree of freedom ($X^2 \sim \chi^2_1$).

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Generate standard normal samples
n_samples = 100000
x_samples = np.random.normal(loc=0, scale=1, size=n_samples)

# Transform: Y = X^2
y_samples = x_samples**2

# Analytical Chi-square PDF (k=1)
y_range = np.linspace(0.01, 6, 1000)
pdf_analytical = stats.chi2.pdf(y_range, df=1)

# Plotting
plt.figure(figsize=(12, 6))
plt.hist(y_samples, bins=100, density=True, alpha=0.5, label='Empirical Histogram (X^2)')
plt.plot(y_range, pdf_analytical, 'r-', lw=2, label='Analytical Chi2(df=1)')
plt.title(r"Verification of Variable Transformation: $X \sim \mathcal{N}(0,1) \implies X^2 \sim \chi^2_1$")
plt.xlabel("Value")
plt.ylabel("Density")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Demonstrate the Exponential Memoryless Property
lambda_param = 0.5
t_survive = 2.0  # Already survived 2 units
t_additional = 1.0 # Probability of surviving 1 more unit

# P(X > t_survive + t_additional | X > t_survive)
p_cond = (1 - stats.expon.cdf(t_survive + t_additional, scale=1/lambda_param)) / \
         (1 - stats.expon.cdf(t_survive, scale=1/lambda_param))
# P(X > t_additional)
p_orig = 1 - stats.expon.cdf(t_additional, scale=1/lambda_param)

print(f"Conditional Prob: {p_cond:.5f}")
print(f"Unconditional Prob: {p_orig:.5f}")
print(f"Difference: {abs(p_cond - p_orig):.10f}")

Conceptual Check

Consider the transformation $U = X + Y$ and $V = X - Y$, where $X$ and $Y$ are independent random variables. To find the joint density $f_{U,V}(u, v)$, we calculate the Jacobian determinant of the inverse transformation. What is the value of $|\det(J)|$?

Conceptual Check

A system's time-to-failure follows an Exponential distribution. If the system has already operated without failure for $T$ hours, what is the probability it fails within the next $h$ hours?

Conceptual Check

In the context of the Multivariate Normal distribution $\mathcal{N}(\mu, \Sigma)$, what property is guaranteed if the Covariance Matrix $\Sigma$ is diagonal?