## Resources

Vandenberghe Lectures

## General derivation

The Gaussian probability density function of $\mathbf{y}_i$, with mean $\mathbf{w}^{T}\mathbf{x}$ and covariance $\boldsymbol{\Sigma}$, is: $p(\mathbf{y}_i|\mathbf{x};\boldsymbol{\theta}) = \frac{1}{(2\pi)^{D/2}|\boldsymbol{\Sigma}|^{1/2}} \exp\Bigg(-\frac{1}{2} (\mathbf{y}_i - \mathbf{w}^{T}\mathbf{x})^{T}\boldsymbol{\Sigma}^{-1}(\mathbf{y}_i - \mathbf{w}^{T}\mathbf{x}) \Bigg)$

which is the same as the likelihood of a single data point.

For a data set of points weighted by values $\boldsymbol{\alpha}$ the log-likelihood is defined as follows:
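
A common way to write this, sketched here under assumptions the text does not fix: there are $N$ independent points, $\mathbf{x}_i$ denotes the input paired with $\mathbf{y}_i$ (the density above leaves $\mathbf{x}$ unindexed), and each $\alpha_i \geq 0$ scales the contribution of point $i$:

$$
\ell(\boldsymbol{\theta}) = \sum_{i=1}^{N} \alpha_i \log p(\mathbf{y}_i|\mathbf{x}_i;\boldsymbol{\theta})
$$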

Simplifying first for the case with no $\boldsymbol{\alpha}$:
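
As a sketch, assuming $N$ independent points and writing $\mathbf{x}_i$ for the input paired with $\mathbf{y}_i$ (both assumptions, since the density above leaves $\mathbf{x}$ unindexed), taking the log of the Gaussian density and summing over the points gives:

$$
\ell(\boldsymbol{\theta}) = \sum_{i=1}^{N} \log p(\mathbf{y}_i|\mathbf{x}_i;\boldsymbol{\theta}) = -\frac{ND}{2}\log(2\pi) - \frac{N}{2}\log|\boldsymbol{\Sigma}| - \frac{1}{2}\sum_{i=1}^{N} (\mathbf{y}_i - \mathbf{w}^{T}\mathbf{x}_i)^{T}\boldsymbol{\Sigma}^{-1}(\mathbf{y}_i - \mathbf{w}^{T}\mathbf{x}_i)
$$

The first two terms come from the normalising constant $(2\pi)^{-D/2}|\boldsymbol{\Sigma}|^{-1/2}$, which is the same for every point; only the quadratic term depends on $\mathbf{w}$.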

Simplifying next for the case with $\boldsymbol{\alpha}$:
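
A sketch of the weighted form, assuming $N$ independent points, per-point inputs $\mathbf{x}_i$, and weights $\alpha_i \geq 0$ that multiply each point's log-density (these indices and the weighting convention are assumptions, not stated in the text):

$$
\ell(\boldsymbol{\theta}) = \sum_{i=1}^{N} \alpha_i \log p(\mathbf{y}_i|\mathbf{x}_i;\boldsymbol{\theta}) = -\frac{1}{2}\Bigg(\sum_{i=1}^{N}\alpha_i\Bigg)\Big(D\log(2\pi) + \log|\boldsymbol{\Sigma}|\Big) - \frac{1}{2}\sum_{i=1}^{N} \alpha_i\,(\mathbf{y}_i - \mathbf{w}^{T}\mathbf{x}_i)^{T}\boldsymbol{\Sigma}^{-1}(\mathbf{y}_i - \mathbf{w}^{T}\mathbf{x}_i)
$$

Under this convention, setting every $\alpha_i = 1$ gives $\sum_i \alpha_i = N$, which recovers the case with no $\boldsymbol{\alpha}$.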