Archive for October, 2010

## Variational Bayesian Gaussian Mixture Model (VBGMM)

The EM algorithm for the Gaussian mixture model (EMGMM) has been very popular in statistics and machine learning. In EMGMM, the model parameters are estimated by maximum likelihood (ML); recently, however, there has been a need to place prior probabilities on the model parameters. The GMM then becomes a hierarchical Bayesian model whose layers, from root to leaf, are the parameters, the mixture proportions, and the observations, respectively. Exact inference in this hierarchical model requires either challenging integration techniques or stochastic sampling techniques (e.g. MCMC); the latter takes a lot of computational time to sample from the distribution.

Fortunately, there are approximation techniques, such as the mean-field variational approximation, that make inference fast and give a good approximate solution. Variational approximation is explained very well in Chapter 10 of the classic machine learning textbook [1] by Bishop, which includes a very good worked example: the variational Bayesian Gaussian mixture model. Bishop does a great job of explaining and deriving VBGMM; for a beginner, however, the algebra of the derivation can be challenging. Since the derivation contains many interesting techniques that carry over to other variational approximations, and the text skips some details, I decided to “fill in” the missing parts and make a derivation tutorial out of it, which is available here [pdf]. I have also made available the derivation details of some examples preceding the VBGMM section in the text [1] [pdf]. VBGMM originally appeared in an excellent paper [2]. For an introduction and for more detail on the interpretation of the model, please refer to the original paper or Bishop’s textbook.
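For readers new to the technique, here is a sketch of the key step that the derivation relies on (stated in the book’s notation). The posterior over the latent assignments $Z$ and the parameters $\pi,\mu,\Lambda$ is approximated by a factorized distribution,

$$q(Z,\pi,\mu,\Lambda)=q(Z)\,q(\pi,\mu,\Lambda),$$

and each factor is updated in turn by taking the expectation of the log joint under the remaining factors, e.g.

$$\ln q^{\star}(Z)=\mathbb{E}_{\pi,\mu,\Lambda}\!\left[\ln p(X,Z,\pi,\mu,\Lambda)\right]+\text{const}.$$

Iterating these coupled updates monotonically increases the variational lower bound on $\ln p(X)$, which is what makes the scheme a fast, deterministic alternative to sampling.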

An implementation of VBGMM in MATLAB can be found at Prof. Kevin Murphy’s group at UBC [link] or [link]. The code requires the Netlab toolbox (a collection of very good MATLAB code for machine learning).

[1] “Pattern Recognition and Machine Learning” (2006) by Christopher Bishop [link]
[2] “A Variational Bayesian Framework for Graphical Models” (NIPS 1999) by Hagai Attias [link]

## Gaussian Mean Shift Clustering

I have heard a lot about Gaussian mean shift clustering, as it is very popular, easy to implement, and one of the best methods for object tracking in video. Recently I have been working on structure learning, and one of my recent claims (in fact, boasts, wahahahaha…just kidding) is that structure learning can be seen as a generalization of classic clustering algorithms like k-means, LBG vector quantization, mean shift, and the EM algorithm. So I have to find some algorithms to compare with mine. The algorithm I compare against has to come up with the number of clusters automatically once the covariance matrix or kernel width is known, so I considered LBG-VQ and Gaussian mean shift (GMS). When deriving GMS, I found that it is very similar to the mean update equation of the EM algorithm, except that in EM we estimate $\mu$ and therefore take the derivative w.r.t. $\mu$, whereas in GMS we take the derivative w.r.t. each sample point in the space. However, the Gaussian density contains the term $(x_i-\mu)$, whose derivatives $\frac{\partial}{\partial\mu}$ and $\frac{\partial}{\partial x_{i}}$ have opposite signs; therefore, when equating the derivative to zero, we get a similar update equation.
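To make the sign argument concrete, here is the short derivation in my notation. Taking the kernel density estimate $\hat{p}(x)=\frac{1}{N}\sum_{i=1}^{N}\mathcal{N}(x\mid x_{i},\Lambda^{-1})$ and differentiating w.r.t. the point $x$,

$$\frac{\partial\hat{p}}{\partial x}=-\frac{1}{N}\sum_{i=1}^{N}\mathcal{N}(x\mid x_{i},\Lambda^{-1})\,\Lambda\,(x-x_{i}),$$

and setting this gradient to zero gives

$$x=\frac{\sum_{i=1}^{N}\mathcal{N}(x\mid x_{i},\Lambda^{-1})\,x_{i}}{\sum_{i=1}^{N}\mathcal{N}(x\mid x_{i},\Lambda^{-1})},$$

which, read as a fixed-point iteration, is precisely the mean-shift update.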

The update equation is

$$x^{(t+1)}=\frac{\sum_{i=1}^{N}\mathcal{N}(x^{(t)}\mid x_{i},\Lambda^{-1})\,x_{i}}{\sum_{i'=1}^{N}\mathcal{N}(x^{(t)}\mid x_{i'},\Lambda^{-1})},$$

which is very similar to the mean update in the EM algorithm for a mixture of Gaussians.
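Since my code is in MATLAB, here is a minimal NumPy sketch of the same update for readers without MATLAB. It assumes an isotropic kernel, i.e. $\Lambda^{-1}=h^{2}I$ with bandwidth $h$; the function name and default parameters are mine, not from the report.

```python
import numpy as np

def gaussian_mean_shift(X, bandwidth=1.0, max_iter=100, tol=1e-6):
    """Iterate the Gaussian mean-shift update on every sample point.

    Uses an isotropic kernel N(x | x_i, h^2 I); the normalizing constant
    of the Gaussian cancels between numerator and denominator, so only
    the exponential weights are needed.
    """
    modes = X.astype(float).copy()
    for _ in range(max_iter):
        # squared distances between current points (rows) and data (columns)
        d2 = ((modes[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
        w = np.exp(-0.5 * d2 / bandwidth**2)          # Gaussian weights
        new = w @ X / w.sum(axis=1, keepdims=True)    # the mean-shift update
        if np.abs(new - modes).max() < tol:           # all points stationary
            modes = new
            break
        modes = new
    return modes
```

Points initialized at the samples themselves climb the kernel density estimate and collect at its modes, so the number of distinct converged locations gives the number of clusters automatically once the bandwidth is fixed.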

Here is a short report [pdf] on GMS, which shows the preliminary results and how to derive GMS. MATLAB code for GMS is available here.