Gaussian distributions

Univariate Gaussian
Multivariate Gaussian
Conditional distributions
Marginal distribution
Conjugate prior for Gaussian distributions with different unknown parameters
Maximum-likelihood estimates
Moment-Generating Function (MGF)
Sample statistics

Put univariate and multivariate Gaussians together for comparison:

Univariate Gaussian

\[\begin{align} p(x) &= \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left \{- \frac{1}{2} \frac{(x - \mu)^2}{\sigma^2} \right \} \\ &= \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left \{- \frac{1}{2} (x - \mu)\frac{1}{\sigma^2}(x - \mu) \right \} \\ \end{align}\]

Multivariate Gaussian

\[\begin{equation} p(\mathbf{x}) = \frac{1}{\sqrt{(2 \pi)^D |\Sigma|}} \exp \left \{ - \frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right \} \end{equation}\]

If in terms of precision ($\Lambda$):

\[\begin{equation} p(\mathbf{x}) = (2\pi)^{-\frac{D}{2}} |\Lambda|^{\frac{1}{2}} \exp \left \{ - \frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \Lambda(\mathbf{x} - \boldsymbol{\mu}) \right \} \end{equation}\]

Conditional distributions

Given

\[\begin{align} \mathbf{x} &= \begin{bmatrix} \mathbf{x}_a \\ \mathbf{x}_b \end{bmatrix} \\ \boldsymbol{\mu} &= \begin{bmatrix} \boldsymbol{\mu}_a \\ \boldsymbol{\mu}_b \end{bmatrix} \\ \Sigma &= \begin{bmatrix} \Sigma_{aa} & \Sigma_{ab} \\ \Sigma_{ba} & \Sigma_{bb} \end{bmatrix} \\ \Lambda &= \begin{bmatrix} \Lambda_{aa} & \Lambda_{ab} \\ \Lambda_{ba} & \Lambda_{bb} \end{bmatrix} = \Sigma^{-1} \\ \end{align}\]

Note, $\Sigma$ is the covariance matrix and $\Lambda$ is its inverse, called the precision matrix. Also note that the partitioned matrices of $\Sigma$ and $\Lambda$ are note inverse of each other, i.e. $\Sigma_{ij} \ne \Lambda_{ij}^{-1}$.

After some algebra, the conditional mean and variance expressed in terms of partitioned precision matrices:

\[\begin{align} \boldsymbol{\mu}_{a|b} &= \boldsymbol{\mu}_a - \Lambda_{aa}^{-1} \Lambda_{ab} (\mathbf{x}_b - \boldsymbol{\mu}_b) \\ \Sigma_{a|b} &= \Lambda_{aa}^{-1} \\ \end{align}\]

Alternatively, these can be also expressed in terms of partitioned covariance matrices, which is a bit more complex for $\Sigma_{a|b}$:

\[\begin{align} \boldsymbol{\mu}_{a|b} &= \boldsymbol{\mu}_a + \Sigma_{ab} \Sigma_{bb}^{-1} (\mathbf{x}_b - \boldsymbol{\mu}_b) \\ \Sigma_{a|b} &= \Sigma_{aa} - \Sigma_{ab} \Sigma_{aa}^{-1} \Sigma_{ba} \end{align}\]

Note $\boldsymbol{\mu}_{a|b}$ is a linear function of $\mathbf{x}_b$.

so $p(\mathbf{x}_a|\mathbf{x}_b) \sim \mathcal{N}(\boldsymbol{\mu}_{a|b},\Sigma_{a|b})$.

Marginal distribution

\[\begin{align} \mathbb{E}[\mathbf{x}_a] &= \boldsymbol{\mu}_a \\ \text{cov}[\mathbf{x}_a] &= \Sigma_{aa} \\ \end{align}\]

i.e. $p(\mathbf{x}_a) \sim \mathcal{N}(\mathbf{x}_a | \boldsymbol{\mu}_a, \Sigma_{aa})$, which is straightforward.

Conjugate prior for Gaussian distributions with different unknown parameters

mean	variance/precision	dimension	conjugate prior
unknown	known	univariate	univarite Gaussian distribution
unknown	known	multivariate	multivariate Gaussian distribution
known	unknown	univariate	Gamma distribution
known	unknown	multivariate	Wishart distribution (Multivariate gamma distribution)
unknown	unknown	univariate	Gaussian-gamma distribution
unknown	unknown	multivariate	Gaussian-Wishart distribution

Maximum-likelihood estimates

See this notebook.

Moment-Generating Function (MGF)

We derive the MGF of $\mathcal{N}(\mu, \sigma^2)$ as shown below.

\[\begin{align} M_X(\lambda) &= \mathbb{E}[e^{\lambda X}] \\ &= \int \exp(\lambda x) \frac{1}{\sqrt{2\pi \sigma^2}} \exp \left[ -\frac{(x - \mu)^2}{2\sigma^2}\right ] dx \\ &= \frac{1}{\sqrt{2\pi \sigma^2}} \int \exp\left[ -\frac{1}{2\sigma^2} \left(x^2 - 2\mu x + \mu^2 - 2\sigma^2 \lambda x \right ) \right ] dx \\ &= \frac{1}{\sqrt{2\pi \sigma^2}} \int \exp\left[ -\frac{1}{2\sigma^2} \left( \left(x - \mu - \sigma^2 \lambda \right )^2 - \left( \mu + \sigma^2 \lambda \right )^2 + \mu^2 \right ) \right ] dx \\ &= \exp\left(\frac{\sigma^4 \lambda^2 + 2\mu \sigma^2 \lambda}{2\sigma^2} \right) \frac{1}{\sqrt{2\pi \sigma^2}} \int \exp\left[ -\frac{1}{2\sigma^2} \left(x - \mu - \sigma^2 \lambda \right )^2 \right ] dx \label{eq:factored} \\ &= \exp\left( \frac{\sigma^2 \lambda^2}{2} + \mu\lambda \right) \end{align}\]

Note, in Eq. \eqref{eq:factored}, $\frac{1}{\sqrt{2\pi \sigma^2}} \int \exp\left[ -\frac{1}{2\sigma^2} \left(x - \sigma^2 \lambda \right )^2 \right ]$ is the integration of the distribution of $\mathcal{N}(\mu + \sigma^2 \lambda, \sigma^2)$, so it equals 1.

Therefore,

for $\mathcal{N}(0, 1)$, $M_X(\lambda) = \exp \frac{\lambda^2}{2}$.
for $\mathcal{N}(0, \sigma^2)$, $M_X(\lambda) = \exp \frac{\sigma^2 \lambda^2}{2}$.

Sample statistics

Suppose $X \sim \mathcal{N}(\mu, \sigma^2)$, then given i.i.d. $X_1, \cdots, X_n$, the sample statistics have the following distribution:

Sample mean $\bar{X}_n \sim \mathcal{N}(\mu, \frac{\sigma^2}{n})$, and
Standardize sample mean: $\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim \mathcal{N}(0, 1)$
Sample variance: $\frac{(n - 1) S_n^2}{\sigma^2} \sim \chi_{n-1}^2$, where $S_n^2 = \frac{1}{n - 1} \sum_{i=1}^n (X_i - \bar{X})^2$.