GMM-BIC: A simple way to determine the number of Gaussian components

Gaussian Mixture Models (GMMs) are very popular across a broad range of applications because of their performance and simplicity. However, how to determine the number of Gaussian components in a GMM is still an open problem. One simple solution is to use the Bayesian Information Criterion (BIC) to penalize the complexity of the GMM. That is, the cost function of GMM-BIC is composed of two parts: 1) the log-likelihood and 2) a complexity penalty term that grows with the number of free parameters. Consequently, the final GMM is a model that fits the data well without overfitting in the BIC sense. There are plenty of tutorials on this on the internet. Here I would like to share my MATLAB code as a demo.
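
To make the two-part cost concrete, here is a minimal sketch in Python. It is not the MATLAB demo mentioned above, just an illustration under stated assumptions: the data are 1-D, the GMM is fitted with plain EM, the means are initialised at evenly spaced quantiles, and a small variance floor keeps components from collapsing. The function names (`fit_gmm_1d`, `bic`, `n_free_params_1d`, `select_k`) are made up for this example. It uses the common convention BIC = -2 ln L + p ln N, where p is the number of free parameters and N the number of samples, so a lower score is better.

```python
import math
import random

def log_likelihood(x, weights, means, variances):
    """Total log-likelihood of 1-D data x under a Gaussian mixture."""
    total = 0.0
    for v in x:
        p = sum(w * math.exp(-0.5 * (v - m) ** 2 / s) / math.sqrt(2 * math.pi * s)
                for w, m, s in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))  # guard against log(0) on underflow
    return total

def fit_gmm_1d(x, k, n_iter=100):
    """Fit a k-component 1-D GMM with plain EM; return its log-likelihood."""
    n = len(x)
    xs = sorted(x)
    mean_all = sum(x) / n
    var_all = sum((v - mean_all) ** 2 for v in x) / n
    floor = max(1e-6 * var_all, 1e-12)  # assumed variance floor
    # Assumed initialisation: means at evenly spaced quantiles of the data.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [max(var_all, floor)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for v in x:
            dens = [w * math.exp(-0.5 * (v - m) ** 2 / s) / math.sqrt(2 * math.pi * s)
                    for w, m, s in zip(weights, means, variances)]
            z = max(sum(dens), 1e-300)
            resp.append([d / z for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var_j = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, x)) / nj
            variances[j] = max(var_j, floor)
    return log_likelihood(x, weights, means, variances)

def bic(log_lik, n_params, n_samples):
    """Part 1 (fit): -2 * log-likelihood. Part 2 (penalty): n_params * ln(N)."""
    return -2.0 * log_lik + n_params * math.log(n_samples)

def n_free_params_1d(k):
    """(k - 1) mixing weights + k means + k variances."""
    return 3 * k - 1

def select_k(x, k_max):
    """Fit GMMs with 1..k_max components and return the k with the lowest BIC."""
    scores = {k: bic(fit_gmm_1d(x, k), n_free_params_1d(k), len(x))
              for k in range(1, k_max + 1)}
    return min(scores, key=scores.get), scores

# Toy data: two well-separated 1-D clusters (an assumption for this demo).
rng = random.Random(0)
data = ([rng.gauss(0.0, 1.0) for _ in range(150)]
        + [rng.gauss(10.0, 1.0) for _ in range(150)])
best_k, scores = select_k(data, k_max=4)
for k, s in scores.items():
    print(f"k={k}  BIC={s:.1f}")
print("selected k =", best_k)
```

The script prints one BIC value per candidate k and then the k with the lowest score. For d-dimensional data with full covariance matrices, the parameter count would instead be (k - 1) + k d + k d (d + 1) / 2, so the penalty grows much faster with k in higher dimensions.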

Note that the Variational Bayes GMM (VBGMM) can also solve this problem in a different way, and it is worth studying and comparing with GMM-BIC. I have also provided some details of the derivation of VBGMM here.

  1. Leo
    October 13, 2011 at 2:26 pm

    Hi Bot, in this paragraph, the link for ‘I also provided some details of the derivations here’ doesn’t work. I would appreciate it if you could add the link for the details of the derivations. Thank you very much.
    Leo

    • October 13, 2011 at 3:50 pm

      Thanks Leo, I just added the missing URL. –Bot

  2. Ghada
    November 2, 2011 at 11:50 pm

    Hi, I’d like to know whether this algorithm works for 8-dimensional data. Also, is there a limit on the number of components to be tested?
    One more thing: can you give me the name of the paper that explains this method, so I can refer to it in my thesis?

    Ghada

    • November 7, 2011 at 12:48 am

      Hi Ghada,

      The code should work for 8-D data. I don’t think the number of components is a problem for BIC; instead, it might be a problem for the GMM itself. That’s because when the number of components gets very high, some components may not be stable. Try it and let me know.

      Sorry, I cannot help with the name of the paper; I am busy finishing my own thesis too :-P. I think there are a lot of papers out there that discuss using BIC to determine the number of segments. Good luck with your thesis, Ghada!
