© 2026 LIBREUNI PROJECT

Distributions and Density Functions

Probability theory provides the rigorous framework for modeling uncertainty in physical, social, and information systems. At the heart of this framework lie random variables and the functions that describe their probabilistic behavior. This lesson details the characterization of random variables, the catalog of fundamental distributions, and the calculus of transformations.

1. Probability Density Functions (PDF) and Cumulative Distribution Functions (CDF)

A Random Variable $X$ is a measurable function mapping the sample space $\Omega$ to the real line. The behavior of $X$ is fully characterized by its Cumulative Distribution Function (CDF):

$$F_X(x) = P(X \le x)$$

The CDF is non-decreasing, right-continuous, and satisfies $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$. For a Continuous Random Variable, a Probability Density Function (PDF) $f_X$ exists such that:

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$$

The fundamental properties of a PDF include:

  1. $f_X(x) \ge 0$ for all $x$ (Non-negativity)
  2. $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$ (Normalization)

The probability that $X$ falls within an interval $[a, b]$ is given by:

$$P(a \le X \le b) = \int_a^b f_X(x)\,dx = F_X(b) - F_X(a)$$
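As a quick numerical sanity check of the CDF–PDF relation (a sketch assuming a standard normal $X$ and the interval $[-1, 2]$), the CDF difference and the integral of the PDF agree:

```python
import numpy as np
from scipy import stats, integrate

# Assumed example: standard normal X, interval [a, b] = [-1, 2]
a, b = -1.0, 2.0

# P(a <= X <= b) via the CDF difference F(b) - F(a)
p_cdf = stats.norm.cdf(b) - stats.norm.cdf(a)

# The same probability via numerical integration of the PDF
p_int, _ = integrate.quad(stats.norm.pdf, a, b)

print(f"F(b) - F(a)          = {p_cdf:.6f}")
print(f"integral of f(x) dx  = {p_int:.6f}")
```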

2. Catalog of Distributions

2.1 Discrete Distributions

Discrete distributions model variables that take values in a countable set.

Binomial Distribution ($X \sim \mathrm{Bin}(n, p)$): Models the number of successes in $n$ independent Bernoulli trials with success probability $p$: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$. Expected Value: $E[X] = np$; Variance: $\mathrm{Var}(X) = np(1-p)$.

Poisson Distribution ($X \sim \mathrm{Pois}(\lambda)$): Models the number of arrivals in a fixed interval given a constant average rate $\lambda$: $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$. The Poisson distribution arises as the limit of the Binomial distribution as $n \to \infty$ and $p \to 0$ with $np = \lambda$ held fixed.
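The Binomial-to-Poisson limit can be observed numerically; a minimal sketch with an assumed rate $\lambda = 4$, comparing the two PMFs as $n$ grows:

```python
import numpy as np
from scipy import stats

lam = 4.0              # assumed rate
k = np.arange(0, 15)   # support to compare over

# Binomial(n, lam/n) pmf approaches Poisson(lam) pmf as n grows
errors = []
for n in (20, 200, 2000):
    p = lam / n
    err = np.max(np.abs(stats.binom.pmf(k, n, p) - stats.poisson.pmf(k, lam)))
    errors.append(err)
    print(f"n = {n:5d}: max |Binomial - Poisson| = {err:.6f}")
```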

2.2 Continuous Distributions

Gaussian (Normal) Distribution ($X \sim \mathcal{N}(\mu, \sigma^2)$): The most significant distribution in statistics due to the Central Limit Theorem (CLT), which asserts that the standardized sum of independent and identically distributed (i.i.d.) random variables with finite variance converges in distribution to $\mathcal{N}(0, 1)$. PDF: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$.
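A minimal simulation sketch of the CLT, assuming sums of Uniform(0, 1) variables with $n = 30$ terms per sum (both assumed parameters): the standardized sums behave like standard normal draws.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sums of n i.i.d. Uniform(0,1) variables have mean n/2 and variance n/12
n, trials = 30, 200_000
sums = rng.uniform(0, 1, size=(trials, n)).sum(axis=1)

# Standardize: subtract the mean, divide by the standard deviation
z = (sums - n / 2) / np.sqrt(n / 12)

# The standardized sums should be approximately N(0, 1)
print(f"mean ~ {z.mean():.3f}, std ~ {z.std():.3f}")
print(f"P(Z <= 1.96) ~ {(z <= 1.96).mean():.3f}  (normal value: 0.975)")
```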

Exponential Distribution ($X \sim \mathrm{Exp}(\lambda)$): Models the time between events in a Poisson process. It is the only continuous distribution with the memoryless property $P(X > s + t \mid X > s) = P(X > t)$. PDF: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$.

Gamma Distribution: Generalizes the exponential distribution. $X \sim \mathrm{Gamma}(k, \lambda)$ with integer shape $k$ models the time until the $k$-th event of a Poisson process with rate $\lambda$.
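This waiting-time interpretation can be checked by simulation; a sketch assuming rate $\lambda = 0.5$ and $k = 3$ events, comparing empirical moments against the Gamma values $E[X] = k/\lambda$ and $\mathrm{Var}(X) = k/\lambda^2$:

```python
import numpy as np

rng = np.random.default_rng(1)
lam, k = 0.5, 3  # assumed rate and number of events

# Sum of k i.i.d. Exp(lam) waiting times = time until the k-th Poisson event
waits = rng.exponential(scale=1 / lam, size=(100_000, k)).sum(axis=1)

# Should match Gamma(shape=k, rate=lam): mean k/lam, variance k/lam^2
print(f"empirical mean {waits.mean():.3f} vs k/lam   = {k / lam:.3f}")
print(f"empirical var  {waits.var():.3f} vs k/lam^2 = {k / lam**2:.3f}")
```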

Beta Distribution: Defined on the interval $[0, 1]$, making it ideal for modeling probabilities or proportions.

3. Multivariate Distributions

In many applications, we track multiple random variables simultaneously. The Joint PDF $f_{X,Y}(x, y)$ describes their simultaneous behavior.

3.1 Marginal and Conditional Densities

The Marginal Density of $X$ is obtained by integrating out $Y$:

$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$$

The Conditional Density of $Y$ given $X = x$ is:

$$f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}, \qquad f_X(x) > 0$$
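For intuition, the same marginalization and conditioning operations apply to a discrete joint distribution, with sums replacing integrals. A sketch using an assumed toy joint table:

```python
import numpy as np

# Assumed toy joint distribution P(X=i, Y=j), rows indexed by X, columns by Y
joint = np.array([[0.10, 0.20],
                  [0.30, 0.40]])

# Marginals: sum out the other variable (the discrete analogue of integration)
p_x = joint.sum(axis=1)   # P(X=i)
p_y = joint.sum(axis=0)   # P(Y=j)

# Conditional P(Y=j | X=i) = joint / marginal of X
cond_y_given_x = joint / p_x[:, None]

print("P(X):", p_x)        # [0.3, 0.7]
print("P(Y | X=0):", cond_y_given_x[0])
```

Each row of the conditional table sums to 1, as a conditional distribution must.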

3.2 Independence

Random variables $X$ and $Y$ are independent if and only if their joint density factors into the product of their marginals: $f_{X,Y}(x, y) = f_X(x) f_Y(y)$. This implies that $f_{Y|X}(y \mid x) = f_Y(y)$; knowing $X$ provides no information about $Y$.

4. Covariance and Correlation

Covariance measures the joint variability of two variables:

$$\mathrm{Cov}(X, Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big] = E[XY] - E[X]E[Y]$$

For a random vector $\mathbf{X} = (X_1, \dots, X_n)^T$, we define the Covariance Matrix $\Sigma$, where $\Sigma_{ij} = \mathrm{Cov}(X_i, X_j)$. $\Sigma$ is always symmetric and positive semi-definite. If $X_1, \dots, X_n$ are independent, $\Sigma$ is diagonal.
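These properties can be observed with `np.cov`; a sketch using three assumed independent variables with different variances, whose sample covariance matrix comes out approximately diagonal:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Three independent variables: N(0,1), N(0,4), and Uniform(0,1) (var 1/12)
data = np.column_stack([
    rng.normal(0, 1, n),
    rng.normal(0, 2, n),
    rng.uniform(0, 1, n),
])

# rowvar=False: each column of `data` is one variable
sigma = np.cov(data, rowvar=False)

print(np.round(sigma, 3))  # approximately diag(1, 4, 1/12)
```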

5. Transformation of Random Variables

When we define a new variable $Y = g(X)$, we must determine its density $f_Y(y)$.

5.1 Univariate Transformation

If $g$ is strictly monotonic and differentiable:

$$f_Y(y) = f_X\big(g^{-1}(y)\big) \left| \frac{d}{dy} g^{-1}(y) \right|$$
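As a concrete check of this formula, take the assumed transformation $Y = e^X$ with $X \sim \mathcal{N}(0, 1)$: then $g^{-1}(y) = \ln y$, $\left|\frac{d}{dy}\ln y\right| = 1/y$, and the formula reproduces the log-normal density available in scipy.stats:

```python
import numpy as np
from scipy import stats

# Y = exp(X) with X ~ N(0,1); inverse g^{-1}(y) = log(y), derivative 1/y,
# so f_Y(y) = f_X(log y) * (1/y), which is the log-normal density
y = np.linspace(0.1, 5, 50)
f_formula = stats.norm.pdf(np.log(y)) / y
f_lognorm = stats.lognorm.pdf(y, s=1)

print("max difference:", np.max(np.abs(f_formula - f_lognorm)))
```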

5.2 Multivariate Transformation and the Jacobian

For a vector transformation $\mathbf{Y} = g(\mathbf{X})$, let $\mathbf{X} = h(\mathbf{Y})$ be the inverse mapping. The joint density follows:

$$f_{\mathbf{Y}}(\mathbf{y}) = f_{\mathbf{X}}\big(h(\mathbf{y})\big)\,\big|\det J\big|$$

where $J$ is the Jacobian Matrix of the inverse transformation, with entries $J_{ij} = \dfrac{\partial h_i}{\partial y_j}$.
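For a linear transformation $\mathbf{Y} = A\mathbf{X}$ the inverse is $h(\mathbf{y}) = A^{-1}\mathbf{y}$, so the Jacobian is the constant matrix $A^{-1}$ and $|\det J| = 1/|\det A|$. A sketch verifying this numerically with an assumed invertible matrix:

```python
import numpy as np

# Assumed example matrix for the linear transform Y = A @ X
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Jacobian of the inverse mapping h(y) = A^{-1} y is A^{-1} itself
J = np.linalg.inv(A)

print("|det J|   =", abs(np.linalg.det(J)))      # 1/|det A| = 1/6
print("1/|det A| =", 1 / abs(np.linalg.det(A)))
```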

6. Sampling Distributions

Statistical inference relies on the distributions of sample statistics.

  1. Chi-squared Distribution ($\chi^2_k$): If $Z_1, \dots, Z_k$ are independent standard normal variables, then $\sum_{i=1}^{k} Z_i^2 \sim \chi^2_k$.
  2. Student’s t-Distribution ($t_{n-1}$): Arises when estimating the mean of a normally distributed population with unknown variance: $T = \frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t_{n-1}$, where $\bar{X}$ is the sample mean and $S$ is the sample standard deviation.
  3. F-Distribution ($F_{d_1, d_2}$): The ratio of two independent chi-squared variables, each divided by its degrees of freedom; fundamental for comparing variances and for ANOVA.
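The chi-squared-ratio construction of the F-distribution can be checked by simulation; a sketch with assumed degrees of freedom $d_1 = 5$, $d_2 = 10$, comparing the empirical mean with the theoretical F mean $d_2 / (d_2 - 2)$:

```python
import numpy as np

rng = np.random.default_rng(3)
d1, d2 = 5, 10  # assumed degrees of freedom

# Ratio of independent chi-squared variables, each scaled by its df
u = rng.chisquare(d1, size=200_000)
v = rng.chisquare(d2, size=200_000)
f_samples = (u / d1) / (v / d2)

# The F(d1, d2) distribution has mean d2 / (d2 - 2) for d2 > 2
print(f"empirical mean {f_samples.mean():.3f} vs theory {d2 / (d2 - 2):.3f}")
```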

7. Numerical Analysis with SciPy

The following Python snippet verifies the transformation $Y = X^2$ using the scipy.stats library. If $X$ is a standard normal variable, its square follows a Chi-squared distribution with 1 degree of freedom ($X^2 \sim \chi^2_1$).

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Generate standard normal samples
n_samples = 100000
x_samples = np.random.normal(loc=0, scale=1, size=n_samples)

# Transform: Y = X^2
y_samples = x_samples**2

# Analytical Chi-square PDF (k=1)
y_range = np.linspace(0.01, 6, 1000)
pdf_analytical = stats.chi2.pdf(y_range, df=1)

# Plotting
plt.figure(figsize=(12, 6))
plt.hist(y_samples, bins=100, density=True, alpha=0.5, label='Empirical Histogram (X^2)')
plt.plot(y_range, pdf_analytical, 'r-', lw=2, label='Analytical Chi2(df=1)')
plt.title(r"Verification of Variable Transformation: $X \sim \mathcal{N}(0,1) \implies X^2 \sim \chi^2_1$")
plt.xlabel("Value")
plt.ylabel("Density")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Demonstrate the Exponential Memoryless Property
lambda_param = 0.5
t_survive = 2.0  # Already survived 2 units
t_additional = 1.0 # Probability of surviving 1 more unit

# P(X > t_survive + t_additional | X > t_survive)
p_cond = (1 - stats.expon.cdf(t_survive + t_additional, scale=1/lambda_param)) / \
         (1 - stats.expon.cdf(t_survive, scale=1/lambda_param))
# P(X > t_additional)
p_orig = 1 - stats.expon.cdf(t_additional, scale=1/lambda_param)

print(f"Conditional Prob: {p_cond:.5f}")
print(f"Unconditional Prob: {p_orig:.5f}")
print(f"Difference: {abs(p_cond - p_orig):.10f}")

Conceptual Check

Consider the transformation $U = X + Y$ and $V = X - Y$, where $X$ and $Y$ are independent random variables. To find the joint density $f_{U,V}(u, v)$, we calculate the Jacobian determinant of the inverse transformation. What is the value of $|\det(J)|$?

Conceptual Check

A system's time-to-failure follows an Exponential distribution. If the system has already operated without failure for $T$ hours, what is the probability it fails within the next $h$ hours?

Conceptual Check

In the context of the Multivariate Normal distribution $\mathcal{N}(\mu, \Sigma)$, what property is guaranteed if the Covariance Matrix $\Sigma$ is diagonal?