
I’ve been planning to do this for over a year: The
MA (Music Analysis) Toolbox for Matlab now finally includes the G1C implementation which I described in my
thesis (btw, the code probably also runs in the freely available
Scilab in case you don’t have Matlab). The code is packaged as required for the
MIREX’06 evaluation, where the implementation was the fastest overall and scored highest (though not significantly better than other submissions).
The code might be useful for those who are new to the field and just want a quick start. Btw, last October I held a
presentation on music similarity which might also be helpful for starters. The best documentation and explanation of the code I can offer is my thesis.
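To give a rough idea of what working with the code looks like, here is a minimal quick-start sketch. The function names below are only illustrative, not necessarily the ones in the release; please check the toolbox documentation for the exact interfaces:

    % minimal quick-start sketch (function names are illustrative;
    % see the toolbox documentation for the exact interfaces)
    [wav_a, fs] = wavread('song_a.wav');   % load two mono PCM files
    [wav_b, fs] = wavread('song_b.wav');

    g1c_a = ma_g1c(wav_a, fs);             % extract G1C features (single Gaussian
    g1c_b = ma_g1c(wav_b, fs);             % of MFCCs plus fluctuation pattern summaries)

    d = ma_g1c_dist(g1c_a, g1c_b);         % distance between songs: smaller = more similar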
I also hope the implementation is useful in some way for those interested in comparing their work on computational models of music similarity to the work of others. I believe the best option for doing so is to conduct perceptual tests similar to those I conducted for my thesis and those done for MIREX’06 (btw, I wrote some comments about the MIREX’06 evaluation
here).
A much easier approach to evaluating many different algorithms is to use a genre classification scenario (assuming that pieces from the same genre are generally more similar to each other than pieces from different genres). However, this doesn’t replace perceptual tests; it just helps pre-select the algorithms (and their parameters). Btw, I think it would even be interesting for those working directly on genre classification to compare G1C (combined with a nearest-neighbor classifier) against their genre classification algorithms.
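To make this concrete, here is a minimal sketch of such a leave-one-out nearest-neighbor evaluation, assuming a precomputed n-by-n distance matrix D (e.g. G1C distances) and a vector genre of numeric genre IDs:

    % leave-one-out nearest-neighbor genre classification
    % D: n-by-n distance matrix, genre: n-by-1 vector of numeric genre IDs
    n = size(D, 1);
    D(logical(eye(n))) = inf;        % a song must not be its own nearest neighbor
    correct = 0;
    for i = 1:n
        [tmp, nn] = min(D(i, :));    % index of the nearest neighbor
        correct = correct + (genre(nn) == genre(i));
    end
    accuracy = correct / n;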
There are lots of things to be careful about when running evaluations based on genre classes (or other tags associated with music). Most of all, I think everyone should be using an artist filter: the test set and the training set shouldn’t contain music from the same artists. Some previous work reported accuracies of up to 80% for genre classification. I wouldn’t be surprised to see some of those numbers drop to 30% if an artist filter were applied.
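Adding an artist filter to the leave-one-out sketch above only takes one extra line; the assumption here is a vector artist of numeric artist IDs:

    % the same leave-one-out evaluation with an artist filter
    % artist: n-by-1 vector of numeric artist IDs
    n = size(D, 1);
    correct = 0;
    for i = 1:n
        d = D(i, :);
        d(artist == artist(i)) = inf;   % exclude all songs by the same artist
                                        % (this also excludes the query itself)
        [tmp, nn] = min(d);
        correct = correct + (genre(nn) == genre(i));
    end
    accuracy_filtered = correct / n;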
I first noticed the impact of an artist filter when I was doing some work on
playlist generation. In particular, I noticed that songs from the same artist appeared very frequently in the top 20 most similar lists for each song (a quick way to check this is sketched below), which makes sense: pieces by the same artist are usually somewhat similar to each other. However, algorithms that were better than others at identifying songs from the same artist did not necessarily perform better at finding similar songs by other artists. I reported the differences in the evaluation at
ISMIR’05, discussed them again in my
MIREX'05 submission, and later in my thesis. An artist filter was also used for the MIREX’06 evaluation. Btw, I’m thankful to
Jean-Julien Aucouturier (who was one of the reviewers of that ISMIR’05 paper) for some very useful comments on that. His
thesis is highly relevant for anyone working on computational models of music similarity.
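Checking for this effect in your own collection is easy. A minimal sketch, again assuming a distance matrix D and a vector artist of numeric artist IDs:

    % fraction of same-artist songs in the top 20 most similar lists
    n = size(D, 1);
    D(logical(eye(n))) = inf;            % ignore self-similarity
    same = 0;
    for i = 1:n
        [tmp, idx] = sort(D(i, :));      % ascending: most similar first
        same = same + sum(artist(idx(1:20)) == artist(i));
    end
    same_artist_ratio = same / (n * 20); % e.g. 0.25 = a quarter are same-artist hits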
Another thing to consider when running evaluations based on genre classes is to use different music collections with different taxonomies to measure overfitting. For example, one collection could be the
Magnatune ISMIR 2004 training set and another could be the researcher’s private collection. It can easily happen that a similarity algorithm is overfitted to a specific music collection (I demonstrated this in my thesis using a very small collection). Although I was careful to avoid overfitting, G1C is slightly overfitted to the Magnatune collection. Thus, even if G1C outperforms another algorithm on Magnatune, that algorithm might still be much better in general.
There’s some room for improvement in this G1C implementation in terms of numerical issues, and some parts could be coded a lot more efficiently. However, I’d recommend trying something very different instead. Btw, I recently noticed how much easier it is to find something that works much better when you have lots of great data. I highly recommend using Last.fm’s tag data for evaluations; there’s even an
API.
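For example, getting an artist’s top tags is just one HTTP request away. A minimal sketch (the URL scheme is from memory, so please double-check it against the API documentation):

    % sketch: fetch an artist's top tags via the Audioscrobbler web services
    % (URL scheme from memory; check the API documentation before relying on it)
    name = 'Metallica';
    url = ['http://ws.audioscrobbler.com/1.0/artist/' name '/toptags.xml'];
    xml = urlread(url);                         % download the XML as one string
    tags = regexp(xml, '<name>([^<]*)</name>', 'tokens');
    for k = 1:min(5, length(tags))
        disp(tags{k}{1});                       % print the top 5 tag names
    end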