## Local Binary Patterns (LBP)

I came across with an interesting algorithm for extracting feature descriptors from an image or a video file. The LBP looks very simple and easy to program, but I haven’t had a chance to try it myself. A lot of people claim that it’s pretty good.

Scholarpedia gives very good and short overview of this method:

<a href="http://www.scholarpedia.org/ar

## Derivation of Inference and Parameter Estimation Algorithm for Latent Dirichlet Allocation (LDA)

As you may know that Latent Dirichlet Allocation (LDA) [1] is like a backbone in text/image annotation these days. As of today (June 16, 2010) there were 2045 papers cited the LDA paper. That might be a good indicator of how important to understand this LDA paper thoroughly. In my personal opinion, I found that this paper contains a lot of interesting things, for example, modeling using graphical models, inference and parameter learning using variational approximation like mean-field variational approximation which can be very useful when reading other papers extended from the original LDA paper.

The original paper explains the concept and how to set up the framework clearly, so I wold suggest the reader to read the part from the original paper. However, for a new kid in town for this topic, there might be some difficulties to understand how to derive the formula in the paper. Therefore my tutorial paper solely focuses on how to mathematically derive the algorithm in the paper. Hence, the best way to use this tutorial paper is to accompany with the original paper. Hope my paper can be useful and enjoyable.

You can download the pdf file here.

[1] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent dirichlet allocation. J. Mach. Learn.

Res., 3:9931022, 2003.

## Dimensionality reduction using graphical models

— This idea popped up when I was writing my project proposal “Automatic Algorithm for Finding Dynamic Trees Bayesian Networks structure using ITL.” I think this idea is pretty trivial but good to know in order to inspire some new ideas! 🙂

Using graphical models can reduce the dimensionality because assigning nodes with some relationships means you guess or assume the structure for those dimension on the nodes already. For example, if we would like to make image segmentation on an RGB image, we can have two approaches in comparison:

- 5D-approach: Here we extract the important features from the image which are R, G, B, x-coordinate and y-coordinate totally 5 features. That means we will have to do clustering of point(vector) in 5-D feature space! One important drawback for using high-dimensional space is that you might get very sparse points in the feature space which sometimes is not adequate to provide a good result.
- 3D-approach: Here we will use nodes encode the positions X and Y of the pixels in the image. Consequently each node will take only 3-D distribution (not 5-D). However, what we will have to pay for having 2 fewer dimensions than before is that we will have to assume the relationships among the nodes which implies the relationship in the dimension of X and Y (in disguise). Therefore, if the assumption is good, the result is OK, but if the assumption is bad, then the result will be bad too. However, in the graphical models the complexity is determined by the size of the biggest table. So the dimensionality can be huge if the causality is very complicated. But that is the different story since the graphical models is probabilistic method which, somehow, cannot avoid having these kinds of huge joint pdf.