## Hand posture recognition using minimum divergence classifier

A reviewer suggested that my colleague and I apply our accepted work to some real-world application. “Bro, we’ve got less than 4 days to apply our work to a real-world problem…what should we do?” We spent 10 minutes discussing several possible problems, such as automatic video segmentation, CD cover searching, human gesture recognition and some other funny-crazy ideas. Finally, given our curiosity and the time constraint, we ended up with static hand posture recognition. Fortunately, the data set is not too difficult to find on the internet. A million thanks to Triesch and Von Der Malsburg for the wonderful hand posture database; it saved our lives.

Originally we found that the divergence between 2 Gaussian mixture models (GMMs) can be calculated efficiently using the Cauchy-Schwarz divergence (Dcs), as it gives a closed-form expression for any pair of GMMs. Of course, we can’t get this awesome property with the Kullback-Leibler divergence (Dkl)…why? Read our paper [1] ^_^ Yay! In short, the Dkl formulation keeps the logarithm of a mixture inside the integral, which does not allow the Gaussian integral trick, hence a closed-form expression is not possible.
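For readers who want the gist before opening the paper, the argument can be sketched as follows. It uses the standard definition of the Cauchy-Schwarz divergence and the well-known Gaussian product integral. The symbols ($\pi_m, \mu_m, \Sigma_m$ for $p$ and $\tau_k, \nu_k, \Omega_k$ for $q$) are notation picked for this sketch and may not match the notation in [1].

$$
D_{CS}(p, q) = -\log \frac{\int p(x)\, q(x)\, dx}{\sqrt{\int p(x)^2\, dx \int q(x)^2\, dx}}
$$

$$
p(x) = \sum_{m=1}^{M} \pi_m\, \mathcal{N}(x; \mu_m, \Sigma_m), \qquad
q(x) = \sum_{k=1}^{K} \tau_k\, \mathcal{N}(x; \nu_k, \Omega_k)
$$

$$
\int \mathcal{N}(x; \mu_m, \Sigma_m)\, \mathcal{N}(x; \nu_k, \Omega_k)\, dx = \mathcal{N}(\mu_m; \nu_k, \Sigma_m + \Omega_k)
$$

$$
\int p(x)\, q(x)\, dx = \sum_{m=1}^{M} \sum_{k=1}^{K} \pi_m \tau_k\, \mathcal{N}(\mu_m; \nu_k, \Sigma_m + \Omega_k)
$$

The two integrals under the square root expand in the same way, so every piece of $D_{CS}$ is a finite sum of Gaussian evaluations. By contrast, $D_{KL}(p \,\|\, q) = \int p(x) \log \frac{p(x)}{q(x)}\, dx$ keeps the logarithm of a sum of Gaussians inside the integral, and the product identity above cannot be applied to it.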

In this work, we use a minimum divergence classifier to recognize the hand postures. Please see our paper for more details. We finished our experiment on the second day, so we had some time left to make a fancy plot summarizing our work, which we would like to share with you below. The classification accuracies using Dcs and Dkl are 95% and 92% respectively, and the former also runs about 10 times faster. The figures below also suggest that our proposed method outperforms Dkl when it comes to clustering, as the proposed method gives more discriminative power.
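To make the idea concrete, here is a minimal Python sketch of a minimum divergence classifier built on the closed-form Cauchy-Schwarz divergence. It is not the released code and may differ from the exact expression in [1]: it follows the standard definition of the divergence, assumes each class and each query has already been summarized as a GMM, and stores a GMM as a hypothetical `(weights, means, covariances)` tuple. The names `make_gmm`, `cross_term`, `dcs` and `classify`, and the toy "posture" models, are made up for illustration.

```python
import numpy as np


def make_gmm(weights, means, covs):
    """Pack a GMM as (weights, means, covariances) NumPy arrays (assumed layout)."""
    return (np.asarray(weights, float),
            np.asarray(means, float),
            np.asarray(covs, float))


def gauss_pdf(x, mean, cov):
    """Density of the multivariate normal N(x; mean, cov) at a single point x."""
    d = mean.shape[0]
    diff = x - mean
    # Solve a linear system instead of inverting the covariance explicitly.
    maha = diff @ np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)
    return np.exp(-0.5 * (d * np.log(2.0 * np.pi) + logdet + maha))


def cross_term(gmm_a, gmm_b):
    """Closed-form integral of a(x) * b(x) over x for two GMMs."""
    wa, ma, ca = gmm_a
    wb, mb, cb = gmm_b
    total = 0.0
    for i in range(len(wa)):
        for j in range(len(wb)):
            # Gaussian product integral: N(mu_i; nu_j, Sigma_i + Omega_j).
            total += wa[i] * wb[j] * gauss_pdf(ma[i], mb[j], ca[i] + cb[j])
    return total


def dcs(gmm_p, gmm_q):
    """Cauchy-Schwarz divergence; the square root becomes two 0.5 * log terms."""
    pq = cross_term(gmm_p, gmm_q)
    pp = cross_term(gmm_p, gmm_p)
    qq = cross_term(gmm_q, gmm_q)
    return -np.log(pq) + 0.5 * np.log(pp) + 0.5 * np.log(qq)


def classify(gmm_query, class_gmms, divergence=dcs):
    """Return the label whose class GMM has the smallest divergence to the query."""
    return min(class_gmms, key=lambda label: divergence(gmm_query, class_gmms[label]))


# Toy 2-D example with made-up class models (not the hand posture data).
eye = np.eye(2)
models = {
    "posture_a": make_gmm([0.5, 0.5], [[0, 0], [1, 1]], [eye, eye]),
    "posture_b": make_gmm([0.5, 0.5], [[5, 5], [6, 4]], [eye, eye]),
}
query = make_gmm([0.6, 0.4], [[0.2, -0.1], [1.1, 0.9]], [eye, 0.8 * eye])
print(classify(query, models))
```

With this definition the divergence of a GMM with itself is zero up to floating-point rounding, which is a handy sanity check for any implementation. How the GMMs are fitted to image features is outside the scope of this sketch.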

[1] K. Kampa, E. Hasanbelliu and J. C. Principe, “Closed-form Cauchy-Schwarz pdf Divergence for Mixture of Gaussians,” *Proc. of the International Joint Conference on Neural Networks* (IJCNN 2011). [pdf] [BibTex]

We make our code available to anyone under a Creative Commons license [.zip]

We also collected some interesting links to hand posture/gesture databases here:

http://www.datehookup.com/content-analyzing-body-language-gesture-recognition.htm

http://www-prima.inrialpes.fr/FGnet/data/03-Pointing/index.html#Gesture%20Vocabulary

http://www.idiap.ch/resource/gestures/

http://www.iis.ee.ic.ac.uk/~tkkim/ges_db.htm

ftp://mi.eng.cam.ac.uk/pub/CamGesData/

http://www.csc.kth.se/~danik/gesture_database/

The following papers and documents can be helpful:

A Bimodal Face and Body Gesture Database for Automatic Analysis of Human Nonverbal Affective Behavior. Hatice Gunes and Massimo Piccardi, Computer Vision Research Group, University of Technology, Sydney (UTS).

A Color Hand Gesture Database for Evaluating and Improving Algorithms on Hand Gesture and Posture Recognition. Farhad Dadgostar, Andre L. C. Barczak, Abdolhossein Sarrafzadeh.

Hand Detection and Gesture Recognition using ASL Gestures. Student: Dakuan Cui. Supervisor: Andre L. C. Barczak. Massey University.

Mike, when I moved to this new site, somehow your comments on this post did not come along -_-". Your comments raise very good and important issues that I would like to discuss too, so please allow me to manually re-post them here.

————————————————————–

Bot,

Nice work and congrats on the paper – can’t wait to read it! What is the paper title and what journal can I expect to see it in?

Also, I find your results very interesting. In some contexts, such as density bandwidth estimation, the minimization of Dkl can be shown to be equivalent to maximizing the log likelihood of the data given your parametric or non-parametric density form. Although I’m not aware of the finer details of your work, it appears that you have a true and an estimated GMM and you are attempting to fit the estimated one by minimizing Dkl or Dcs – correct? Therefore, I was curious if there are any interesting parallels that can be drawn between your Dcs minimization method and maximum likelihood? – or is attempting to draw such a comparison even appropriate in this case might be the better question.

Furthermore, I’ve seen some work where people use the Jensen-Renyi divergence to get closed-form solutions for Gaussian-based pdfs. How does Dcs minimization differ from JR divergence approaches, or is it equivalent? Thank you.

Mike, yes, you are very right about that. I think I remember your work where you estimated the optimal non-parametric pdf bandwidth using Dkl and used a Taylor series to approximate it. The approach you mentioned has a very strong connection to a class of approximate inference called “variational approximation inference”, where Dkl is used to measure the difference between your proposed parametric distribution and the true distribution, which is complex.

In this work, I just show that Dcs can give a closed-form expression for the divergence of GMMs. Essentially I got this from very simple observations: 1) I know why Dkl does not give a closed-form expression, 2) so I know what kind of functional form would give a closed-form expression, and 3) the Gaussian integral is so well-known.

“I was curious if there are any interesting parallels that can be drawn between your Dcs minimization method and maximum likelihood?” This is a great question to ask. To be honest, I’m not 100% sure myself, but I will give it a try. I thought about this when I worked on variational inference, and I found that Dcs is similar but not equivalent to the log-likelihood of the model. If we think Dkl can be interpreted as the expectation of the log-likelihood plus some constraints, then Dcs can be interpreted as something similar but not the same as the expectation. That is because in Dcs all the pdfs stay within the logarithm, so it’s hard to interpret it as the expectation of a log-likelihood. Hence, minimizing Dcs will lead you to maximize some criterion different from, but similar to, the log-likelihood.

JR divergence, if I remember right, can calculate the divergence between multiple pdfs, whereas traditional Dcs can handle only 2 pdfs at a time. The functional forms of the two divergences share some similarity, yet they differ significantly, so the minimization is different. One problem with Dcs I can envision is the square root in the denominator, which will give a hard time when taking derivatives.

I apologize in advance as this is probably a dumb question. The closed form equation, Equation 3 in your paper, looked simple enough to implement so I did. However, I had an issue where the divergence of a GMM with itself was not 0. I looked at your code to see what I did wrong and it looks to me like your code differs from the equation in the paper. Can you explain why your implementation differs? By the way, thank you very much for providing code.

Thanks for your question, William. I will look into the code when I have time.