Archive for the ‘Research’ Category

A good introduction to MapReduce

MapReduce is a framework for efficiently processing parallelizable tasks on a cluster or grid. A good introduction can be found in the link below.

In a sense, the MapReduce framework is very similar to message-passing algorithms in graphical models, where Map and Reduce are comparable to building the (tree) structure and marginalizing the messages, respectively. So I think MapReduce could make inference feasible for large-scale graphical models.
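To make the Map and Reduce roles concrete, here is a minimal single-machine sketch in Python. The word-count task and the function names (map_phase, shuffle, reduce_phase) are my own illustrative assumptions, not part of any particular MapReduce implementation; a real framework would run the map and reduce calls on different machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit one (key, value) pair per word in a document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values that share a key into one result."""
    return key, sum(values)


def word_count(documents):
    # Each map_phase call is independent, so it could run on a separate node.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    # Each reduce_phase call only needs the values for its own key.
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["map reduce map", "reduce the map"])
print(counts)
```

The parallelism comes from the fact that no map call depends on another and no reduce call needs values for any key but its own.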


Neuroscience talks

April 3, 2012

Information theory, pattern recognition, and neural networks
Draft videos (not yet edited):

Categories: Academics, Research

Awesome seminars at UW

April 3, 2012

There are some fascinating seminars sponsored by UW, and most of them are recorded:

CSE Colloquia:
Every Tuesday 3:30 pm

Yahoo! Machine Learning Seminar
Every Tuesday from 12 – 1 pm

UWTV: Research/Technology/Discovery Channel
Broadcasts new findings, research, and technology for free!




Install MATLAB R2010a on Ubuntu 10.04

January 13, 2012

Here is a step-by-step guide to installing MATLAB on Ubuntu.

  1. Mount the MATLAB ISO file. Let’s say the MATLAB installation files are in the directory /tmp/mat2010a.
  2. First, run the installer:
    $ sudo sh install
    However, you might get an error that looks like this:
    ——————————————————————-
        An error status was returned by the program ‘xsetup’,
        the X Window System version of ‘install’. The following
        messages were written to standard error:

            /home/bot/tmp/matu20Xa/update/install/ 178: /home/bot/tmp/mat2010a/update/bin/glnxa64/xsetup: Permission denied

        Attempt to fix the problem and try again. If X is not available
        or ‘xsetup’ cannot be made to work then try the terminal
        version of ‘install’ using the command:

                install* -t    or    INSTALL* -t

    The problem occurs because the permissions on the file …/xsetup are not set properly. The easy fix is to go to that directory and change the permissions with the command

    ../glnxa64$ chmod 777 xsetup

    Now you can go back to the normal installation.

  3. Next, create a MATLAB root folder. It is suggested that you create it at /usr/local/matlabR2010a

    using the command

    sudo mkdir /usr/local/matlabR2010a
  4. The rest you can do yourself.

Additional reading:

Categories: Research, Tutorials

Cluster Evaluation using Adjusted Rand Index (ARI)

August 17, 2011

Here are the two partitions mentioned in Example 1 of the tutorial paper “Details of the Adjusted Rand index and Clustering algorithms: Supplement to the paper ‘An empirical study on Principal Component Analysis for clustering gene expression data’ (to appear in Bioinformatics)” (pdf).

Partition U (ground truth) and V (predicted)

I think what they did in the example is exactly the same as the following:

a = |(4,5); (7,8) (7,9) (7,10) (8,9) (8,10) (9,10)| = (2 choose 2) + (4 choose 2) = 7

b = |(1,2) (3,4) (3,5) (3,6) (4,6) (5,6)| = 6

c = |(1,3) (2,4) (2,5) (6,7) … (6,10)| = 7

d = |(1,4) … (1,10) (2,3) (2,6) … (2,10) (3,7) … (3,10) (4,7) … (4,10) (5,7) … (5,10)| = 25

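where (i,j) denotes the pair (or edge) between node i and node j. Here a counts the pairs placed together in both U and V, b the pairs together in U but not in V, c the pairs together in V but not in U, and d the pairs separated in both. They then use a, b, c and d to evaluate the Rand index and the adjusted Rand index.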
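The counting above can be checked with a short Python sketch. The partitions U and V written out below are an assumption: I inferred them from the pair lists above (pairs in a or b share a cluster in U, pairs in a or c share a cluster in V). The adjusted Rand index is computed with the standard pair-count form, 2(ad - bc) / ((a+b)(b+d) + (a+c)(c+d)), which may differ in presentation from the formula used in the paper.

```python
from itertools import combinations

# Assumed partitions over nodes 1..10, inferred from the pair lists above.
U = [{1, 2}, {3, 4, 5, 6}, {7, 8, 9, 10}]   # ground truth
V = [{1, 3}, {2, 4, 5}, {6, 7, 8, 9, 10}]   # predicted


def labels(partition):
    """Map each node to the index of the cluster that contains it."""
    return {node: k for k, cluster in enumerate(partition) for node in cluster}


def pair_counts(part_u, part_v):
    """Count node pairs: a (together in both), b (together in U only),
    c (together in V only), d (separated in both)."""
    lu, lv = labels(part_u), labels(part_v)
    a = b = c = d = 0
    for i, j in combinations(sorted(lu), 2):
        same_u = lu[i] == lu[j]
        same_v = lv[i] == lv[j]
        if same_u and same_v:
            a += 1
        elif same_u:
            b += 1
        elif same_v:
            c += 1
        else:
            d += 1
    return a, b, c, d


def rand_index(a, b, c, d):
    """Fraction of pairs on which the two partitions agree."""
    return (a + d) / (a + b + c + d)


def adjusted_rand_index(a, b, c, d):
    """Pair-count form of the adjusted Rand index."""
    return 2 * (a * d - b * c) / ((a + b) * (b + d) + (a + c) * (c + d))


a, b, c, d = pair_counts(U, V)
print(a, b, c, d)
print(rand_index(a, b, c, d), adjusted_rand_index(a, b, c, d))
```

With these partitions the counts come out as a = 7, b = 6, c = 7 and d = 25, giving a Rand index of 32/45 (about 0.711) and an adjusted Rand index of 266/851 (about 0.313).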

How to remove the white border from a figure?

August 5, 2011

When adding a figure to your publication, you might want to remove the undesired white border around it. I believe the best way is to create figures without the border in the first place, if possible; in MATLAB, I think you can do so. However, if you already have the figures, you might want a program that removes the borders automatically, intelligently and controllably. I developed a MATLAB toolbox for this purpose. Please refer to the URL below.

Overview of the white-border removal toolbox
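For readers who just want the basic idea, here is a minimal Python sketch of border cropping. It is not the toolbox's implementation: the function name crop_white_border, the white and margin parameters, and the representation of an image as a list of rows of grayscale values are all assumptions made for illustration.

```python
def crop_white_border(image, white=255, margin=0):
    """Crop edge rows and columns that contain only `white` pixels.

    `image` is a 2-D grayscale image given as a list of equal-length rows.
    `margin` keeps that many rows/columns of border around the content.
    """
    # Indices of rows and columns holding at least one non-white pixel.
    rows = [r for r, row in enumerate(image) if any(p != white for p in row)]
    if not rows:
        # Entirely white image: there is no content to crop around.
        return [row[:] for row in image]
    width = len(image[0])
    cols = [c for c in range(width) if any(row[c] != white for row in image)]

    # Bounding box of the content, widened by the margin and clamped.
    top = max(rows[0] - margin, 0)
    bottom = min(rows[-1] + margin, len(image) - 1)
    left = max(cols[0] - margin, 0)
    right = min(cols[-1] + margin, width - 1)
    return [row[left:right + 1] for row in image[top:bottom + 1]]


figure = [
    [255, 255, 255, 255],
    [255,   0,  10, 255],
    [255, 255, 255, 255],
]
cropped = crop_white_border(figure)
print(cropped)
```

The margin parameter is one simple way to make the removal controllable: a value of 0 trims the border completely, while a larger value leaves a thin frame.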

Effects of adding loading factors to a covariance matrix

July 29, 2011

From my previous post, we know that the update equation for the covariance matrix might not be numerically stable because the matrix may fail to be positive definite. An easy way to stabilize the algorithm is to add a relatively small positive number, a.k.a. a loading factor, to the diagonal entries of the covariance matrix. But does the loading factor affect the likelihood or the convergence of the EM algorithm?

Apparently, adding the loading factor to the covariance matrix does affect the log-likelihood value. I ran some experiments on the issue; the results are shown in the learning curve (log-likelihood curve) of ITSBN with the EM algorithm below. The factor is applied to the matrix only when the determinant of the covariance matrix is smaller than 10^{-6}. Five different factors were used in this experiment: 10^{-8}, 10^{-6}, 10^{-4}, 10^{-3} and 10^{-2}. The results show that the learning curves are still monotonically increasing* and level off near the end. Furthermore, the level-off value is strongly associated with the value of the factor: the bigger the factor, the smaller the level-off value. This suggests that we should pick the smallest factor possible in order to stay as close as possible to the ideal learning curve. Note that the loading factor is not added to the covariance matrix until the second iteration.

log-likelihood curve with different loading factors

* I don’t think this is always the case, though: the factor is not added to the matrix consistently, so when it is added it might pull the log-likelihood to a lower value. Empirically, however, the log-likelihood is still monotonically increasing even when the factor is big.
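The conditional loading rule described in this post can be sketched in a few lines of Python. This is not the ITSBN code: the function names determinant and add_loading, the det_threshold parameter and the example matrix are assumptions made for illustration, and the sketch uses plain lists so that it runs without any numerical library.

```python
def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            ratio = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= ratio * m[col][c]
    return det


def add_loading(cov, factor, det_threshold=1e-6):
    """Add `factor` to the diagonal only when det(cov) < det_threshold."""
    if determinant(cov) >= det_threshold:
        return [row[:] for row in cov]
    return [
        [value + (factor if i == j else 0.0) for j, value in enumerate(row)]
        for i, row in enumerate(cov)
    ]


# Hypothetical, nearly singular 2x2 covariance matrix.
cov = [[1.0, 1.0], [1.0, 1.0 + 1e-9]]
for factor in (1e-8, 1e-6, 1e-4, 1e-3, 1e-2):
    loaded = add_loading(cov, factor)
    print(factor, determinant(loaded))
```

A larger factor moves the matrix further from singularity but also further from the estimated covariance, which is consistent with the lower level-off values seen for larger factors.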