## Derivation of Inference and Parameter Estimation Algorithm for Latent Dirichlet Allocation (LDA)

As you may know, Latent Dirichlet Allocation (LDA) [1] has become a backbone of text and image annotation. As of today (June 16, 2010), 2045 papers have cited the LDA paper, which is a good indicator of how important it is to understand this paper thoroughly. In my opinion, the paper contains a lot of interesting ideas, for example modeling with graphical models, and inference and parameter learning with variational methods such as the mean-field approximation. These are very useful when reading other papers that extend the original LDA paper.

The original paper explains the concept and the setup of the framework clearly, so I would suggest reading those parts in the original paper. However, for a new kid in town, it can be difficult to see how the formulas in the paper are derived. Therefore, my tutorial paper focuses solely on the mathematical derivation of the algorithms in the paper. The best way to use this tutorial is alongside the original paper. I hope you find it useful and enjoyable.
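To give a feel for where the derivation ends up, here is a minimal sketch of the per-document mean-field updates from [1]: each word's topic responsibilities are updated as φ_ni ∝ β_{i,w_n} exp(Ψ(γ_i)), and then γ_i = α_i + Σ_n φ_ni. The sketch assumes the topic-word matrix `beta` and the Dirichlet parameter `alpha` are already known and fixed. The names `digamma` and `variational_e_step`, and the toy numbers, are mine for illustration only; the `digamma` helper is a hand-rolled approximation, not a library routine.

```python
import math


def digamma(x):
    """Approximate the digamma function psi(x) for x > 0 (illustrative helper)."""
    result = 0.0
    # Shift x upward using psi(x) = psi(x + 1) - 1/x until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    # Asymptotic expansion: ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    return result + math.log(x) - 0.5 / x - f * (1.0 / 12 - f * (1.0 / 120 - f / 252))


def variational_e_step(doc, alpha, beta, iters=50):
    """Mean-field updates for one document.

    doc   : list of word indices
    alpha : list of k Dirichlet parameters
    beta  : k x V list of lists, beta[i][w] = p(word w | topic i)
    Returns (gamma, phi): variational Dirichlet and per-word topic weights.
    """
    k, n_words = len(alpha), len(doc)
    # Initialisation: gamma_i = alpha_i + N/k, phi_ni = 1/k.
    gamma = [alpha[i] + n_words / k for i in range(k)]
    phi = [[1.0 / k] * k for _ in doc]
    for _ in range(iters):
        for n, w in enumerate(doc):
            # phi_ni is proportional to beta_{i,w} * exp(psi(gamma_i)); the
            # psi(sum_j gamma_j) term is the same for every i and cancels
            # when the row is normalised.
            weights = [beta[i][w] * math.exp(digamma(gamma[i])) for i in range(k)]
            total = sum(weights)
            phi[n] = [x / total for x in weights]
        # gamma_i = alpha_i + sum over words of phi_ni
        gamma = [alpha[i] + sum(phi[n][i] for n in range(n_words)) for i in range(k)]
    return gamma, phi


# Toy setup (made-up numbers): 2 topics, 2-word vocabulary.
alpha = [1.0, 1.0]
beta = [[0.9, 0.1], [0.1, 0.9]]
doc = [0, 0, 0, 1]  # mostly word 0, which topic 0 favours
gamma, phi = variational_e_step(doc, alpha, beta)
```

Normalising `gamma` gives an estimate of the document's topic proportions. A fixed iteration count is used here for simplicity; a convergence check on `gamma` would be the more usual stopping rule.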

You can download the PDF file here.

[1] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet allocation. J. Mach. Learn. Res., 3:993-1022, 2003.

Thanks Bot! I look forward to trudging through this derivation and reading the paper to hopefully understand these powerful techniques. Any thoughts on the applicability of LDA techniques to terrain modeling or surface information extraction?

Mike

Thanks Mike! That’s a good question, but I’m not sure either how LDA would fit the applications you mentioned. Actually, this algorithm would fit any hierarchical Bayesian model, so I’m thinking about, again, surface interpolation (well, kriging is great already). However, we can also see LDA as a dimensionality reduction algorithm, so perhaps we can use it to extract features from a terrain as you mentioned. If we can assign a notion of “topic” to a terrain, for example agricultural, rural, beach, or forest, then LDA could be a good tool for “terrain tagging”. Since you are the expert in the area, I think you may find an awesome application for it! ^_^

–Bot

Great idea, Bot! Terrain labeling applications seem promising... hmm

Thanks a lot. I’ve been searching for this for a long time. I hope your paper will be helpful to me. Thanks again. Good day!

“However, for a new kid in town for this topic, there might be some difficulties to understand how to derive the formula in the paper. Therefore my tutorial paper solely focuses on how to mathematically derive the algorithm in the paper. Hence, the best way to use this tutorial paper is to accompany with the original paper. Hope my paper can be useful and enjoyable.”

What I can not believe!

Thanks for your tutorial paper. I hope it will help me understand the original paper, which I have spent lots of time on and still haven’t conquered. LOL! But I cannot download the tutorial through the link given above. Would you please send me a copy or check the link? Thanks again!

Hi Craig, sorry for the missing link, and thanks for letting me know. The tutorial file is posted there now. Hope this helps. ^_^

Hey Bot, I still cannot open the link, sigh… Would you please check it again? Thank you.

Craig, the link worked for me just a few seconds ago. Perhaps you might want to try “save as”?

Well, honestly speaking, I cannot download it from my country, so I’ve asked one of my friends in the USA to download it for me. Thanks again Bot!

Wow, thanks for your effort. I hope the paper helps with your research. Good luck, Craig!