Retrieve URLs and Tags from Del.icio.us (Delicious.com)
Del.icio.us is a bookmarking website. For people who want to work on topic modeling, it can be a good data set to try. In this post, I would like to show how to retrieve information (e.g. URLs, titles, tags, users, comments, timestamps) from Del.icio.us. There are many ways to do the job, but I think the Python API is a good and easy one.
The whole API is available for free and is described very well on the Del.icio.us Python API web page by Michael G. Noll. However, absolute beginners may not know how to install and use the API, so I would like to elaborate on Michael’s guide in more detail. There are 4 big steps, here we go!
- Install Python: First of all, you need to have the Python interpreter installed on your machine.
- Install Easy Install: Easy Install is a Python package that can save us a lot of time when installing other Python packages.
- Install Michael’s Del.icio.us Python API on your machine
- Run the API, and have fun!
1. Install Python on your machine (Windows 7 64-bit)
- Download Python from the Python download page. Pick the installer that matches your machine and OS. I have Windows 7 64-bit, so I downloaded “Python 2.6.5 Windows X86-64 installer (Windows AMD64 / Intel 64 / X86-64 binary -- does not include source)”.
- Run the installer. It should not take long to download and install. I found a good video tutorial on Python installation and testing.
- Now you can play with Python IDLE to see if Python was installed properly.
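- Add the Python folder to the path
- Run the command line as an administrator (please refer to this blog).
- Add the folder by typing “set path=C:\Program Files\yourPythonFolder;%path%”. Note that yourPythonFolder MUST contain the file python.exe. Also note that “set” only changes the path for the current command line window, so keep that window open for the following steps.
- You can check that the folder was added properly by typing “path”.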
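Once “python” starts from the command line, a tiny script can confirm which interpreter is running and whether its folder is really on the path. This is only a minimal sketch: the function names describe_interpreter and python_dir_on_path are my own, and the file can be saved under any name (for example check_python.py) and run with “python check_python.py”.

```python
import os
import sys


def describe_interpreter():
    """Return the version and location of the Python that runs this script."""
    version = "%d.%d.%d" % tuple(sys.version_info[:3])
    return {"executable": sys.executable, "version": version}


def python_dir_on_path():
    """Check whether the folder holding the interpreter is listed in PATH."""
    python_dir = os.path.normcase(
        os.path.dirname(os.path.abspath(sys.executable)))
    entries = os.environ.get("PATH", "").split(os.pathsep)
    normalized = [os.path.normcase(os.path.abspath(e)) for e in entries if e]
    return python_dir in normalized


info = describe_interpreter()
print("Python %s at %s" % (info["version"], info["executable"]))
print("Interpreter folder on PATH: %s" % python_dir_on_path())
```

If the last line prints False, the “set path” command probably pointed to a folder that does not contain python.exe.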
2. Install Easy Install
For simplicity, we will next install a Python package called “Easy Install”, which can save us a lot of time when installing additional Python packages. Easy Install manages the installation and can automatically download and install any other packages that are needed. This way we don’t have to check manually which packages to download and install.
- Please go to the Easy Install web page and download the proper installation file. Note that if you use Windows 7 64-bit, the only option you may be able to use is the “Source” file (setuptools-0.6c11.tar.gz). One good thing about the source file is that it does not depend on the kind of machine or OS you are using, so I will go this way.
- Download the source and extract it to a folder, say C:\Users\bot\Downloads\PythonFiles\setuptools-0.6c11
- Run command line as an administrator, and go to the folder C:\Users\bot\Downloads\PythonFiles\setuptools-0.6c11
- Now install the Easy Install package by running the command “python setup.py install”. You will see a lot of output scrolling by in the command line.
- When it finishes, Easy Install is stored in the folder “C:\Program Files\Python 2.6.5 64-bit\Scripts”
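If you are not sure where the Scripts folder ended up on your machine, a small script can look for it. This is only a sketch under a few assumptions: Easy Install was installed into the same Python that runs the script, the folder is called “Scripts” on Windows (and “bin” elsewhere), and scripts_dir and find_easy_install are names I made up for illustration.

```python
import os
import sys


def scripts_dir():
    """Guess the folder where this Python keeps its command line tools."""
    # Assumption: "Scripts" on Windows, "bin" on other systems.
    if os.name == "nt":
        name = "Scripts"
    else:
        name = "bin"
    return os.path.join(sys.prefix, name)


def find_easy_install():
    """List the files in the scripts folder whose name starts with easy_install."""
    folder = scripts_dir()
    if not os.path.isdir(folder):
        return []
    return sorted(f for f in os.listdir(folder)
                  if f.lower().startswith("easy_install"))


print("Scripts folder: %s" % scripts_dir())
print("Easy Install files: %s" % find_easy_install())
```

An empty list means no easy_install file was found there, in which case it is worth rerunning “python setup.py install” and reading its output for errors.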
3. Install Michael’s Del.icio.us Python API
We will use Easy Install to do the job.
- Stay in the command line and go to the folder “C:\Program Files\Python 2.6.5 64-bit\Scripts” by typing “cd C:\Program Files\Python 2.6.5 64-bit\Scripts”
- Run the command “easy_install DeliciousAPI”. You will see Python downloading the packages necessary for DeliciousAPI.
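Before moving on, it can help to check that Python can actually import what we just installed. The sketch below assumes the packages are importable under the module names “setuptools” and “deliciousapi” (my assumption, please check Michael’s page if the second one is reported missing), and is_installed is a helper name of my own.

```python
def is_installed(module_name):
    """Return True if the named module can be imported by this Python."""
    try:
        __import__(module_name)
    except Exception:
        # Any failure to import counts as "not usable" for our purposes.
        return False
    return True


# Module names below are assumptions, not taken from the installer output.
for name in ("setuptools", "deliciousapi"):
    if is_installed(name):
        status = "found"
    else:
        status = "missing"
    print("%s: %s" % (name, status))
```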
4. Run DeliciousAPI
Now that the API is installed, let’s run it.
- Go to Michael’s page, copy the demo code, save it as “test.py” and put it in any folder you want.
- Run test.py from the command line by typing “python test.py” in that folder. You should see the URLs, tags and everything else pop up in your command line window!
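Once the bookmarks are printed, a natural next step for topic modeling is to count how often each tag appears. The sketch below does not call the API at all. It assumes each bookmark is a tuple of (user, tags, comment, timestamp) where tags is a list of strings, which is my assumption about the data layout and should be checked against Michael’s documentation. The sample bookmarks are invented for illustration, and count_tags and top_tags are hypothetical helper names.

```python
def count_tags(bookmarks):
    """Count how many bookmarks use each tag.

    Assumed layout of each bookmark: (user, tags, comment, timestamp).
    """
    counts = {}
    for user, tags, comment, timestamp in bookmarks:
        for tag in tags:
            counts[tag] = counts.get(tag, 0) + 1
    return counts


def top_tags(counts, n):
    """Return the n most frequent tags, ties broken alphabetically."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Invented sample data, standing in for what the API would return.
SAMPLE_BOOKMARKS = [
    ("alice", ["python", "api"], "", None),
    ("bob", ["python"], "handy", None),
]

counts = count_tags(SAMPLE_BOOKMARKS)
print(top_tags(counts, 2))
```

A plain dictionary is used instead of collections.Counter so that the script also runs on Python 2.6, the version installed in this post.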
I would like to thank my friend Rohit Manokaran for helping me with this DeliciousAPI.