GMM-BIC: A simple way to determine the number of Gaussian components
Gaussian Mixture Models (GMMs) are popular across a broad range of applications because of their performance and simplicity. However, how to determine the number of Gaussian components in a GMM is still an open problem. One simple solution is to use the Bayesian Information Criterion (BIC) to penalize the complexity of the GMM. That is, the GMM-BIC cost function is composed of two parts: 1) the log-likelihood and 2) a complexity penalty term. Consequently, the selected GMM is a model that fits the data well without overfitting in the BIC sense. There are tons of tutorials on the internet. Here I would like to share my MATLAB code as a demo.
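To make the two-part cost concrete, here is a minimal sketch in Python (not the MATLAB demo mentioned above). It assumes the standard form BIC = -2 ln L + p ln N, where L is the maximized likelihood, p the number of free parameters and N the number of samples, and it picks the component count with the lowest BIC. For simplicity it assumes 1-D data, for which a k-component GMM has p = 3k - 1 free parameters. The helper names (`fit_gmm_1d`, `e_step`, `bic`), the quantile initialization, the variance floor and the synthetic data are all illustrative choices, not part of any library.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def e_step(data, weights, means, variances):
    """Return per-point responsibilities and the total log-likelihood."""
    resp, log_lik = [], 0.0
    for x in data:
        logs = [math.log(w) + log_normal_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)
        # log-sum-exp keeps the mixture density numerically stable
        lse = top + math.log(sum(math.exp(l - top) for l in logs))
        log_lik += lse
        resp.append([math.exp(l - lse) for l in logs])
    return resp, log_lik


def fit_gmm_1d(data, k, n_iter=100):
    """Fit a k-component 1-D GMM with plain EM; return the final log-likelihood."""
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    ordered = sorted(data)
    # Assumption: initialize the means at evenly spaced quantiles of the data
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    var_floor = 1e-3 * grand_var  # assumption: floor to avoid collapsing components
    for _ in range(n_iter):
        resp, _ = e_step(data, weights, means, variances)
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # tiny offset avoids division by zero
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)
    _, log_lik = e_step(data, weights, means, variances)
    return log_lik


def bic(log_lik, k, n):
    """BIC = -2 ln L + p ln N, with p = 3k - 1 for a 1-D GMM."""
    n_params = 3 * k - 1  # (k - 1) weights + k means + k variances
    return -2.0 * log_lik + n_params * math.log(n)


# Synthetic data (illustrative only): two well-separated 1-D clusters
rng = random.Random(0)
data = ([rng.gauss(-4.0, 1.0) for _ in range(200)]
        + [rng.gauss(4.0, 1.0) for _ in range(200)])

# Fit k = 1..5 and keep the k with the lowest BIC
bics = {k: bic(fit_gmm_1d(data, k), k, len(data)) for k in range(1, 6)}
best_k = min(bics, key=bics.get)
print("BIC per k:", bics)
print("selected number of components:", best_k)
```

Adding components can only raise the log-likelihood (up to EM convergence), so the first term alone would always favor the largest k. The p ln N term is what makes the extra components pay for themselves.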
Note that the Variational Bayes GMM (VBGMM) can also solve this problem in a different flavor, and it is worth studying and comparing with GMM-BIC. I have also provided some details of the derivation of VBGMM here.