In probability theory and statistics, a multivariate normal distribution, also sometimes called a multivariate Gaussian distribution (in honor of Carl Friedrich Gauss, who was not the first to write about the normal distribution), is a specific probability distribution that generalizes the one-dimensional normal distribution to random vectors.
General case
A random vector X = (X1, ..., Xn) follows a multivariate normal distribution if it satisfies the following equivalent conditions:
- there is a random vector Z=(Z1, ..., Zm), whose components are independent standard normal random variables, a vector μ = (μ1, ..., μn) and an n×m matrix A such that X = A Z + μ.
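- there is a vector μ and a symmetric, positive semi-definite matrix Σ such that the characteristic function of X is <math>\varphi_X({\mathbf u})=\exp\left(i{\mathbf\mu}^T{\mathbf u}-\frac{1}{2}{\mathbf u}^T{\mathbf\Sigma}{\mathbf u}\right)<math>.
The following condition is not quite equivalent to those above, since it does not allow the covariance matrix to be singular: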
- there is a vector μ=(μ1, ..., μn) and a symmetric, positive definite matrix <math>\Sigma<math> such that X has density
- <math>
f_X(x_1,\ldots,x_n)\, dx_1\ldots dx_n=
\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}}
\exp\left(-\frac{1}{2}({\mathbf x}-{\mathbf\mu})^T{\mathbf\Sigma}^{-1}({\mathbf x}-{\mathbf\mu})
\right)dx_1\ldots dx_n
<math>
where <math>\left|\Sigma\right|<math> is the determinant of <math>\Sigma<math>. Note how the equation above reduces to that of the univariate normal distribution if <math>\Sigma<math> is a <math>1\times 1<math> matrix (i.e. a real number).
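The reduction to the univariate case can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names `mvn_pdf` and `normal_pdf` and the test values are illustrative choices, not part of the definition above.

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at x; sigma must be symmetric positive definite."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    sigma = np.atleast_2d(np.asarray(sigma, dtype=float))
    n = mu.shape[0]
    diff = x - mu
    # Solve sigma @ y = diff instead of forming the inverse explicitly.
    quad = diff @ np.linalg.solve(sigma, diff)
    norm_const = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * quad) / norm_const)

def normal_pdf(x, mean, var):
    """Univariate normal density, used only for comparison."""
    return float(np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var))

# With a 1x1 covariance matrix the two formulas should agree.
p_multi = mvn_pdf([0.3], [1.0], [[2.0]])
p_uni = normal_pdf(0.3, 1.0, 2.0)
```

Here `p_multi` and `p_uni` should agree up to floating-point rounding.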
The vector μ in these conditions is the expected value of X and the matrix <math>{\mathbf\Sigma}={\mathbf A}{\mathbf A}^T<math> is the covariance matrix of the components Xi.
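The construction X = A Z + μ also gives a direct way to simulate such a vector. The sketch below assumes NumPy; the particular matrix `A`, vector `mu`, seed, and sample size are arbitrary illustrative values. It draws independent standard normal components and compares the empirical mean and covariance with μ and A Aᵀ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative n x m matrix (n = 3, m = 2) and mean vector.
A = np.array([[1.0, 0.0],
              [0.5, 2.0],
              [1.0, 1.0]])
mu = np.array([1.0, -2.0, 0.5])

num_samples = 200_000
# Each row of Z holds m independent standard normal components.
Z = rng.standard_normal((num_samples, A.shape[1]))
# Row-wise version of X = A Z + mu.
X = Z @ A.T + mu

sigma = A @ A.T
empirical_mean = X.mean(axis=0)
empirical_cov = np.cov(X, rowvar=False)
```

With this many draws, `empirical_mean` and `empirical_cov` should be close to `mu` and `sigma`, though not exactly equal because of sampling error.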
It is important to realize that the covariance matrix must be allowed to be singular. That case arises frequently in statistics; for example, in the distribution of the vector of residuals in ordinary linear regression problems.
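The regression example can be made concrete. The sketch below assumes NumPy and the usual linear model with independent normal errors of equal variance σ² (an assumption not spelled out above); under it the residual vector has covariance σ²(I − H), where H is the projection onto the column space of the design matrix. The design matrix used here is a made-up example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs = 6

# Hypothetical design matrix: an intercept column and one regressor.
design = np.column_stack([np.ones(n_obs), rng.standard_normal(n_obs)])
n_params = design.shape[1]

# Hat matrix H projects onto the column space of the design matrix.
hat = design @ np.linalg.solve(design.T @ design, design.T)
# Residuals are (I - H) y, so their covariance is sigma^2 (I - H).
residual_maker = np.eye(n_obs) - hat

rank = np.linalg.matrix_rank(residual_maker)
det = np.linalg.det(residual_maker)
```

The rank is `n_obs - n_params` rather than `n_obs`, and the determinant is zero up to rounding, so the residual vector has no density of the form given earlier.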
Note also that the Xi are in general not independent; they can be seen as the result of applying the linear transformation A to a collection of independent Gaussian variables Z.
Bivariate case
In the 2-dimensional nonsingular case with both means equal to zero, the probability density function is
- <math>f(x,y) = \frac{1}{2 \pi \sigma_x \sigma_y \sqrt{1-\rho^2}} \exp\left(-\frac{1}{2(1-\rho^2)}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2} - \frac{2 \rho x y}{\sigma_x\sigma_y}\right)\right),<math>
where ρ is the correlation between X and Y, and σx and σy are their standard deviations.
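As a consistency check, the bivariate formula can be compared with the general density. The formula contains no mean terms, so the sketch below takes both means to be zero and builds the covariance matrix from σx, σy and ρ. It assumes NumPy; the function names and numerical values are illustrative.

```python
import numpy as np

def bivariate_pdf(x, y, sigma_x, sigma_y, rho):
    """Zero-mean bivariate normal density written in terms of rho."""
    q = (x**2 / sigma_x**2 + y**2 / sigma_y**2
         - 2.0 * rho * x * y / (sigma_x * sigma_y))
    norm_const = 2.0 * np.pi * sigma_x * sigma_y * np.sqrt(1.0 - rho**2)
    return float(np.exp(-q / (2.0 * (1.0 - rho**2))) / norm_const)

def general_pdf(point, sigma):
    """Zero-mean general density computed from the covariance matrix."""
    point = np.asarray(point, dtype=float)
    n = point.shape[0]
    quad = point @ np.linalg.solve(sigma, point)
    norm_const = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * quad) / norm_const)

sigma_x, sigma_y, rho = 1.5, 0.7, -0.4
# Covariance matrix with variances on the diagonal and rho*sx*sy off it.
sigma = np.array([[sigma_x**2, rho * sigma_x * sigma_y],
                  [rho * sigma_x * sigma_y, sigma_y**2]])

p_bivariate = bivariate_pdf(0.3, -1.1, sigma_x, sigma_y, rho)
p_general = general_pdf([0.3, -1.1], sigma)
```

The two values should coincide up to rounding, since the determinant of this covariance matrix is σx²σy²(1 − ρ²).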
Linear transformation
If Y = BX is a linear transformation of X, where B is an m×n matrix, then Y has a multivariate normal distribution with expected value Bμ and covariance matrix <math>{\mathbf B}{\mathbf\Sigma}{\mathbf B}^T<math>.
Corollary: any subset of the Xi has a marginal distribution that is also multivariate normal. To see this, consider the following example: to extract the subset <math>(X_1, X_2, X_4)^T<math>, use
- <math>
{\mathbf B}=
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & \ldots & 0\\
0 & 1 & 0 & 0 & 0 & \ldots & 0\\
0 & 0 & 0 & 1 & 0 & \ldots & 0
\end{bmatrix}
<math>
which extracts the desired elements directly.
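The effect of this selection matrix on the parameters can be sketched in a few lines, assuming NumPy. The five-dimensional mean vector and covariance matrix below are made-up values used only for illustration.

```python
import numpy as np

mu = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
# Build an arbitrary symmetric positive semi-definite covariance matrix.
root = np.arange(25, dtype=float).reshape(5, 5) / 10.0 + np.eye(5)
sigma = root @ root.T

keep = [0, 1, 3]  # zero-based positions of X1, X2 and X4
B = np.zeros((len(keep), mu.shape[0]))
B[np.arange(len(keep)), keep] = 1.0  # one 1 per row, as in the matrix above

marginal_mean = B @ mu
marginal_cov = B @ sigma @ B.T
```

The results equal the corresponding entries of `mu` and the corresponding rows and columns of `sigma`, which is why marginals can be read off directly from the parameters.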
Conditional distributions
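If <math>{\mathbf x}<math> is partitioned into <math>{\mathbf x}_1<math>, containing its first q components, and <math>{\mathbf x}_2<math>, containing the remaining n − q components, and <math>{\mathbf\mu}<math> and <math>{\mathbf\Sigma}<math> are partitioned conformably as follows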
- <math>
{\mathbf\mu}=\left(\begin{matrix}
{\mathbf\mu}_1\\
{\mathbf\mu}_2
\end{matrix}
\right)
\qquad
{\mathbf\Sigma}=
\begin{bmatrix}
{\mathbf\Sigma}_{11} &
{\mathbf\Sigma}_{12} \\
{\mathbf\Sigma}_{21} &
{\mathbf\Sigma}_{22}
\end{bmatrix}
<math>
then the distribution of <math>{\mathbf x}_1<math> conditional on <math>{\mathbf x}_2={\mathbf a}<math> is multivariate normal with mean
- <math>
{\mathbf\mu}_1+{\mathbf\Sigma}_{12}{\mathbf\Sigma}_{22}^{-1}\left({\mathbf a}-{\mathbf\mu}_2\right)<math>
and covariance matrix
- <math>
{\mathbf\Sigma}_{11}-
{\mathbf\Sigma}_{12}
{\mathbf\Sigma}_{22}^{-1}
{\mathbf\Sigma}_{21}.
<math>
This matrix is the Schur complement of <math>{\mathbf\Sigma_{22}}<math> in <math>{\mathbf\Sigma}<math>.
Note that knowing the value of <math>{\mathbf x}_2<math> to be <math>{\mathbf a}<math> alters the variance; perhaps more surprisingly, the mean is shifted by <math>{\mathbf\Sigma}_{12}{\mathbf\Sigma}_{22}^{-1}\left({\mathbf a}-{\mathbf\mu}_2\right)<math>; compare this with the situation of not knowing the value of <math>{\mathbf x}_2<math>, in which case <math>{\mathbf x}_1<math> would have distribution
<math>N_q\left({\mathbf\mu}_1,{\mathbf\Sigma}_{11}\right)<math>.
The matrix <math>{\mathbf\Sigma}_{12}{\mathbf\Sigma}_{22}^{-1}<math> is known as the matrix of regression coefficients.
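The conditional mean and covariance formulas translate directly into code. The following is a minimal sketch, assuming NumPy, a symmetric covariance matrix, and an invertible lower-right block; the helper name `condition` and the example numbers are hypothetical.

```python
import numpy as np

def condition(mu, sigma, q, a):
    """Mean and covariance of x1 (first q components) given x2 = a."""
    mu1, mu2 = mu[:q], mu[q:]
    s11 = sigma[:q, :q]
    s21, s22 = sigma[q:, :q], sigma[q:, q:]
    # Regression coefficients Sigma12 Sigma22^{-1}; since sigma is symmetric,
    # this equals the transpose of Sigma22^{-1} Sigma21.
    coeffs = np.linalg.solve(s22, s21).T
    cond_mean = mu1 + coeffs @ (a - mu2)
    cond_cov = s11 - coeffs @ s21  # Schur complement of Sigma22
    return cond_mean, cond_cov, coeffs

mu = np.array([0.0, 1.0, -1.0])
sigma = np.array([[2.0, 0.6, 0.4],
                  [0.6, 1.0, 0.3],
                  [0.4, 0.3, 1.5]])
observed = np.array([2.0, 0.0])

cond_mean, cond_cov, coeffs = condition(mu, sigma, q=1, a=observed)
```

In this example the conditional variance of the first component is smaller than its unconditional variance of 2.0, and the conditional mean moves away from 0.0 because the observed values differ from their means.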
Estimation of parameters
The derivation of the maximum-likelihood estimator of the covariance matrix of a multivariate normal distribution is perhaps surprisingly subtle and elegant. See estimation of covariance matrices.
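For orientation, the result of that derivation in the nonsingular case is the standard one: the sample mean, and the average of the outer products of the centered observations with divisor n rather than n − 1. The sketch below only computes these estimates and does not reproduce the derivation; it assumes NumPy, and the true parameters, seed, and sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
true_mu = np.array([1.0, -1.0])
true_sigma = np.array([[2.0, 0.8],
                       [0.8, 1.0]])
# Simulated data; each row is one observation.
samples = rng.multivariate_normal(true_mu, true_sigma, size=50_000)

num_samples = samples.shape[0]
mu_hat = samples.mean(axis=0)
centered = samples - mu_hat
# Maximum-likelihood covariance estimate: divisor n, not n - 1.
sigma_hat = centered.T @ centered / num_samples
```

The same covariance estimate is returned by `np.cov(samples, rowvar=False, bias=True)`.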