What makes a covariance matrix NOT positive definite in the EM algorithm?

There are many possible reasons. A common one is that at least one Gaussian component has no data points close to it. This tends to happen when the clusters are very tight relative to the distance between them; in other words, when the intra-cluster distance is much smaller than the inter-cluster distance.

Let’s assume we have 3 data clusters A, B and C, where A and B are almost merged with each other and very far away from C. We want to cluster the data into 3 components using the EM algorithm. Suppose the 3 centroids are initialized in the middle of the space between the clusters, and it happens that one centroid ends up with no “nearest” members. (This also suggests that 2 components would be enough to model the whole data set, rather than 3.) Let’s say the deserted centroid has the label ‘2’. In that case, the posterior marginal of each data sample will be large for either label 1 or label 3, but no sample will give a large value for label 2. To be more precise, the posterior marginal for label 2 will be virtually zero for all the data samples.

Unfortunately, the update equation for a covariance matrix weights each term (i.e., $(x_i-\mu_2)(x_i-\mu_2)^{\top}$) of the updated covariance matrix by its corresponding class posterior marginal $p(x_i=c_2|evidence)$. Since every one of those weights is virtually zero, the update gives a (numerically) zero matrix for the covariance of class label 2. A zero matrix is singular, so it is not positive definite. So, as you can see, using EM to cluster very widely separated data is not always easy.
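The collapse described above can be reproduced with a few lines of plain Python. This is only a minimal sketch under assumptions that are not in the original post: the nine 2-D points, the three initial centroids, the equal mixing weights, and the identity initial covariances are all made up for illustration, and the helper names `gauss_pdf_isotropic` and `is_positive_definite_2x2` are hypothetical. It runs one E-step and then accumulates the posterior-weighted scatter for the deserted component (label 2, index 1 in the code).

```python
import math

def gauss_pdf_isotropic(x, mu, var=1.0):
    """Density of a 2-D Gaussian with covariance var * I (assumed initial covariance)."""
    d2 = (x[0] - mu[0]) ** 2 + (x[1] - mu[1]) ** 2
    return math.exp(-0.5 * d2 / var) / (2.0 * math.pi * var)

def is_positive_definite_2x2(m):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return m[0][0] > 0.0 and det > 0.0

# Illustrative data: A and B nearly merged near the origin, C far away.
data = [
    (0.0, 0.0), (0.1, 0.2), (-0.1, 0.1),        # cluster A
    (0.5, 0.0), (0.6, 0.1), (0.4, -0.1),        # cluster B
    (100.0, 0.0), (100.1, 0.2), (99.9, -0.1),   # cluster C
]

# Illustrative initial centroids: label 2 sits in the empty gap.
means = [(0.0, 0.0), (50.0, 0.0), (100.0, 0.0)]
weights = [1.0 / 3.0] * 3

# E-step: posterior marginal (responsibility) of each label for each sample.
resp = []
for x in data:
    joint = [w * gauss_pdf_isotropic(x, mu) for w, mu in zip(weights, means)]
    total = sum(joint)
    resp.append([j / total for j in joint])

# M-step pieces for label 2: the normalizer and the posterior-weighted scatter.
k = 1
n_k = sum(r[k] for r in resp)
scatter = [[0.0, 0.0], [0.0, 0.0]]
for x, r in zip(data, resp):
    dx = (x[0] - means[k][0], x[1] - means[k][1])
    for a in range(2):
        for b in range(2):
            scatter[a][b] += r[k] * dx[a] * dx[b]

print("sum of posteriors for label 2:", n_k)
print("weighted scatter for label 2:", scatter)
print("positive definite?", is_positive_definite_2x2(scatter))
```

With these particular numbers every sample is roughly 50 units away from centroid 2, so its density underflows in double precision and the posteriors for label 2 come out as zero. The weighted scatter is then the zero matrix and the check reports that it is not positive definite. Note that the normalizer `n_k` is zero as well, so a full implementation that divides by it would additionally hit a division by zero at this step.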