Thursday, 26 June 2008

Matlab, Python, and a Video

I've been using Matlab extensively for probably almost 10 years. I have written more lines of code in Matlab than in any other language. I always have at least one Matlab application window open. I've probably generated at least a few million Matlab figures (one of my favorite Matlab functions is close all). I've written three small toolboxes in Matlab (and all of them have actually been used by people other than me). I've told anyone who was willing to listen that I couldn't have gotten even a fraction of my work done without Matlab. In fact, three times in a row I convinced the places I've been working at that I needed a (non-academic) license for Matlab and several of its toolboxes. I even had a Matlab sticker on my old laptop for a long time. I frequently visited the Matlab newsgroup and I'm subscribed to several Matlab-related blogs. If I had needed to take a single tool with me to a remote island, it would have been Matlab. I guess it's fair to say I was in love with Matlab.

However, I always felt that it wasn't a perfect relationship. Matlab is expensive. Matlab is not pre-installed on the Linux machines I remotely connect to. In fact, installing Matlab on Linux is a pain (compared to how easy it is to install it on Windows). Furthermore, not everyone has access to Matlab, which makes it harder to share code. Finally, Matlab can be rather useless when it comes to things that are not best described in terms of matrices that fit into memory, and I can't easily run Matlab code on Hadoop.

I had been playing with Python out of curiosity (wondering why everyone liked it so much) but I guess I was too happy with Matlab to seriously consider alternatives. But then Klaas showed me how to use Python with Hadoop. Within a very short time I started to use Python more and more for things I usually would have done in Matlab. Now I write more Python code a day than Matlab code. I still use Matlab on a daily basis, but if I had to choose between Matlab and Python, it would be a very easy choice. SciPy and related modules are wonderful. If I were to redo my PhD thesis, it wouldn't include a single line of Matlab code and instead lots of Python code :-)
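
For readers who haven't seen the map/reduce style that Hadoop jobs follow, here is a minimal sketch in plain Python. It is not Klaas's code and it doesn't use the Hadoop API. The function names (mapper, reducer, run_local) and the word-count task are my own assumptions for illustration, and everything runs locally in memory.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    for word in line.split():
        yield word.lower(), 1


def reducer(word, counts):
    """Sum all the counts that were emitted for a single word."""
    yield word, sum(counts)


def run_local(lines):
    """Hypothetical local driver: map, sort by key, then reduce per key.

    On a real cluster the framework does the sorting and grouping between
    the map and reduce steps; here it is imitated with sort() and groupby().
    """
    pairs = [pair for line in lines for pair in mapper(line)]
    pairs.sort(key=itemgetter(0))
    results = {}
    for word, group in groupby(pairs, key=itemgetter(0)):
        for key, total in reducer(word, (count for _, count in group)):
            results[key] = total
    return results


if __name__ == "__main__":
    sample = ["Matlab and Python", "Python and SciPy"]
    print(run_local(sample))
```

The point is that neither function ever needs the whole data set as one matrix in memory: the mapper sees one line at a time and the reducer sees one key at a time.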

Btw, James pointed me to the following visualization showing activities and shared code of Python developers over time. This is by far the best information visualization I have seen in a very long time. I really like the idea and implementation. I wonder if something similar could be done for a piece of music where the coders are replaced with instruments, and the files are replaced with sounds.

code_swarm - Python from Michael Ogawa on Vimeo.


Jeremy said...

I took a class this past semester where we used a C++ machine learning toolbox called FastLIB (developed by Alex Gray's group in the CS department at Georgia Tech) that is supposed to "go live" at NIPS this year. For development, it's incredibly easy - on par with MATLAB. I find that this removes the overhead issues with MATLAB (speed and memory especially).

Elias said...

That does sound like a very nice toolbox. I wonder how long it will take until it has proper Python bindings? :-)

ben said...

python + scipy + matplotlib = awesome.

I've done all the code for my two most recent publications in the python context. The only time I fire up Matlab anymore is to look at old code (your MA toolkit for instance) though this is happening less and less as I've been porting things over (GMMs and EMD for instance...)

Anonymous said...

Have you tried Sage? What are your thoughts on it? I tried it a year ago and it seemed really cool but I never used it regularly.

Anonymous said...

could you blog about your python development environment? what IDE you use, which shell and why. what rarely known features are most important to you, etc. I am sure it would attract many of your visitors and ok, ok, I'm just way too curious myself ;)

Elias said...

No, I haven't worked with Sage, but it looks very nice. My favorite Python module is dumbo (developed by Klaas Bosteels) but I'm also a big fan of scipy and its submodules (stats etc) and of course numpy. The "IDE" I currently use is vim with Python syntax highlighting (wooah!) :-)

However, I'm far from being a Python expert and most of the opinions I have on Python are just a reflection of what Klaas has taught me.

Foafing your music said...

> python + scipy + matplotlib = awesome.


BTW, for anyone who wants to give Python a try, here's an interesting book:
Python Scripting for Computational Science

Although, I don't know why some examples use the old-fashioned '{', '}' and ';' !!!

Cheers, Oscar

Martin Gasser said...

Yeah, Python is great. It can be used in interactive sessions (ipython!), almost like Matlab, and then the created code can be directly reused in production-ready systems. Our FM4 Soundpark application is coded entirely in Python (except for the feature extraction and similarity stuff, which was done in C++ for efficiency reasons).

Regarding Python bindings to C++ libraries: There is the Boost.Python library (included with the standard Boost distribution), which makes this a breeze :-).

Bayle said...

Unrelated to Python, but Octave is an open-source numerical computation package that is almost compatible with Matlab.