MIR Research: September 2007

Tuesday, 25 September 2007

ISMIR Highlights

ISMIR is only halfway through and I can’t believe how many interesting things I’ve already missed. I guess it’s unavoidable given parallel sessions and so many poster presentations in limited time. Nevertheless, my brain is already overflowing and I feel burnt out. In fact, there were so many interesting presentations that I don’t have enough time to write all of them down (although it would be a great way to remember them).

One thing that might have a very high impact is that IMIRSEL is just about to launch their online evaluation system. Researchers will be able to submit their newest algorithms and find out how well they do compared to others. I think having an evaluation system like that is actually worth a lot to the whole community, and I can well imagine that (once all issues are solved) research labs will have something like a paid subscription which allows them to use a certain amount of CPU time on IMIRSEL’s clusters. However, to be truly successful they’d need to be 100% neutral and transparent. (Which I think means they shouldn’t have IMIRSEL show up in their rankings, and they should clarify how One Llama is linked to IMIRSEL.)

I also liked the poster Skowronek, McKinney, and Van de Par presented (“A Demonstrator for Automatic Music Mood Estimation”). They allowed me to test their system with one of my own MP3s which I had on my USB drive (I used a song by Roberta Sá) and it did really well. Another demo I liked a lot was the system Peter Knees presented (“Search & Select – Intuitively Retrieving Music from Large Collections”). Unfortunately I was asked to leave after I had been playing around with the demo for a bit too long, I guess. Ohishi, Goto, Itou, and Takeda (“A Stochastic Representation of the Dynamics of Sung Melody”) showed me some videos which I thought were simply amazing. Apparently it isn’t hard to compute them (once you know how to extract the F0 curve), but I’ve never seen the characteristics of a singing voice visualized that way. The demo of Eck, Bertin-Mahieux, and Lamere (“Autotagging Music Using Supervised Machine Learning”) was really impressive too… and it was interesting to learn that Ellis (“Classifying Music Audio with Timbral and Chroma Features”) found ways to use chroma information to increase artist identification performance. (And his Matlab source code is available!!) I once worked on a similar problem, but never got that far. Btw, it seems that chroma is everywhere now :-)

I was also happy to see that Flexer’s poster (“A Closer Look on Artist Filters for Musical Genre Classification”) was receiving a lot of attention. I liked his conclusions. There were also lots of interesting papers in the last two days. For example, I liked the paper presented by Cunningham, Bainbridge, and McKay (“Finding New Music: A Diary Study of Everyday Encounters with Novel Songs”). I particularly liked their discussion on how nice it would be to have a “scrobble everywhere” device that keeps track of everything I ever hear (including ring tones).

Beatles Chord Transcriptions

Chris Harte from the C4DM announced last night that he completed his amazing effort of transcribing the chords for all songs on the 12 studio albums of the Beatles. The transcriptions are extremely high quality. Anyone who wants a copy just needs to contact him. This will definitely boost research in any direction related to chords (chord recognition, chord progressions, harmony analysis...). It's also a good excuse for any research lab to buy the complete Beatles collection. Btw, don't forget to cite his work when you use his annotations! ;-)

Below are excerpts from the two emails he sent to the music-ir mailing list, so that Google can index them (afaik he hasn’t set up a website for this yet).

(Chris' email is christopher dot harte at elec dot qmul dot ac dot uk).

Chris Harte wrote in his first email (Sep 24, 2007, 9pm):

[...] I have just completed work on the full set of chord transcriptions for the beatles songs from all 12 studio albums.

The verification was done by synthesizing the transcriptions in MIDI then putting that back together with the original audio (with correct timing and tuning) so that people could spot any errors by listening through to them.

Hopefully, after the verification process that we have just completed, these transcriptions should now be accurate enough to serve as a ground truth for various kinds of chord and harmony work in the MIR field.

If you would like a copy of the new version of the collection then please let me know and I will send them to you. [...]

Chris Harte wrote in his second email (Sep 25, 2007, 4am):

[...] To clear up a few things:

The transcription files are in wavesurfer ".lab" format which is just flat text arranged like this:

Start-time end-time label
Start-time end-time label
Start-time end-time label
...

Times are in seconds.

".lab" files can be opened as a transcription pane in wavesurfer (I have made a wavesurfer conf file set up for showing these transcriptions nicely if people need one) and also in Sonic Visualiser as an annotation layer (use "A point in time" for the "each row specifies" option when loading in sonic visualiser).

The chord symbols used in the transcriptions basically conform to the syntax described in our ISMIR 2005 paper "Symbolic Representation of Musical Chords: A Proposed Syntax for Text Annotations" available here:
http://ismir2005.ismir.net/proceedings/1080.pdf

There has been one slight change to the syntax described in this paper which is that now a chord symbol, which is defined as a root and a list of component degrees, should not automatically be assumed to include the given root note unless the '1' degree is explicitly included in the list
- e.g. C major can be written C or C:maj which are both equivalent to writing C:(1,3,5) so the "major" pattern should be (1,3,5) instead of just (3,5). This makes it possible to annotate a chord where it is obvious that the intended harmony is C major even though only the notes E and G are present by using C:(3,5). I hope that makes sense...

For those who do not already know, I have written a set of tools for manipulating these chord symbols in matlab (they don't use any toolkits so I guess they should also work fine in Octave) - if you would like a copy of those then let me know. There will be an updated version of these tools available soon as well.

For more information on the chord symbols, chord tools and transcription process, my long awaited (long awaited by me at any rate...) PhD thesis will include a whole chapter about it all. I hope to submit the thesis sometime around christmas this year. [...]
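Since the format is so simple, here is a minimal Python sketch of how one might read such a file, based only on what the emails above say. It is not Chris' code (his tools are in Matlab). The whitespace-separated layout, the function names `parse_lab` and `chord_degrees`, and the example timings are my own assumptions, and only the chord forms mentioned in the email (`C`, `C:maj`, `C:(1,3,5)`, `C:(3,5)`) are handled.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    label: str    # chord symbol, kept verbatim


def parse_lab(text: str) -> List[Segment]:
    """Parse flat 'start-time end-time label' lines into segments.

    Assumes the three fields are separated by spaces or tabs.
    """
    segments = []
    for line in text.splitlines():
        parts = line.split(None, 2)  # at most three fields
        if len(parts) < 3:
            continue  # skip blank or incomplete lines
        segments.append(Segment(float(parts[0]), float(parts[1]), parts[2].strip()))
    return segments


# Matches an explicit degree list such as "(1,3,5)".
_DEGREE_LIST = re.compile(r"^\(([^()]*)\)$")


def chord_degrees(label: str) -> Tuple[str, Optional[Tuple[str, ...]]]:
    """Return (root, degrees); degrees is None for forms not covered here.

    A bare label or ':maj' is treated as the major pattern (1,3,5), which
    now includes the root explicitly. Degrees are kept as strings. Any
    special non-chord labels a file may contain are NOT handled.
    """
    root, _, rest = label.partition(":")
    if rest in ("", "maj"):
        return root, ("1", "3", "5")
    match = _DEGREE_LIST.match(rest)
    if match:
        degrees = tuple(d.strip() for d in match.group(1).split(",") if d.strip())
        return root, degrees
    return root, None


# Made-up example data, not taken from the actual transcriptions.
example = """0.0 1.5 C
1.5 3.0 C:(3,5)
3.0 4.2 G:maj
"""
for seg in parse_lab(example):
    print(seg.start, seg.end, chord_degrees(seg.label))
```

The point of keeping `C:(3,5)` distinct from `C` is exactly the syntax change described in the second email: the first says the root is implied but not sounding, the second says it is part of the chord. For anything beyond these simple cases, the ISMIR 2005 paper and Chris' own tools are the reference.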

Sunday, 23 September 2007

One Llama, IMIRSEL, MIREX

One of the interesting things I learned in the recommendation tutorial today is that IMIRSEL launched a startup called One Llama. Seems like they have some ideas on how to make money with MIR technologies. I wonder how many of the MIREX participants were aware of this before submitting their latest implementations to IMIRSEL.

ISMIR Highlight: Recommendation Tutorial

Paul Lamere and Oscar Celma did a wonderful job presenting the recommendation tutorial. I wouldn't be surprised if this turns out to be my personal highlight of ISMIR 2007. They presented an overview of all the standard techniques used for recommendations, they talked about the typical (and unsolved) problems recommenders face, they had plenty of examples, and they also presented results from an evaluation of recommenders. The parts I personally liked best were the in-depth analysis of tags and folksonomies, the part they called "novelty and relevance" (with interesting ideas on how to reach deeper into the long tail), the analysis of artist similarity networks, and the evaluation of recommenders. They also made an interesting point about how nice it would be to have something like a Netflix competition for music recommendation. I'm guessing the slides of the tutorial will be online soon. I highly recommend having a look at them ;-)

I only attended the recommendation tutorial, but I've been told the other tutorials were also really well done. Seems like this year's ISMIR is not only the best in terms of number of papers submitted, number of people attending, and location, but also the best in terms of content! ;-)

Btw, Paul's blogging about ISMIR in case you haven't noticed yet. And a number of pictures have already been uploaded to flickr tagged ismir2007.

Saturday, 22 September 2007

Fun things to do with fingerprinting

Erik Frey just posted some fun things he did using Last.fm's fingerprinter. For example, it's really easy to find out if an artist released "live" versions that are identical to the studio version except that some cheering has been added. In terms of false positives and fingerprinting he raises some interesting questions.

Friday, 21 September 2007

The most frequently cited ISMIR paper

I just did a quick Google Scholar search to find the most frequently cited ISMIR paper. I'm not sure if I missed any, but it seems the most frequently cited paper is "Mel Frequency Cepstral Coefficients for Music Modeling" (PDF) presented in 2000 by Beth Logan. According to Google Scholar it has been cited 127 times as of today. My coauthors and I have cited that paper several times :-)

MFCCs were originally developed in the speech processing community. Back in 2000 it wasn't obvious if the same techniques could just be "copied and pasted" to music information retrieval. MFCCs are now a very standard technique that is being used to compute music similarity, classify genres, identify instruments, segment music, ... In fact, today MFCCs are so common that they are often mentioned in ISMIR papers without citing a source.

Thursday, 20 September 2007

One evening and no testing

It’s been a long and busy day, and it’s taken me a while to go through the flood of emails that landed in my inbox today. Several of those were related to a singing microwave which seems to be at the height of its career.

Another interesting email I found in my inbox explains how the system that scored highest in several MIREX 2007 tasks was built:

“The system was not tuned - in fact it was not tested on any dataset (from the competition or otherwise) beyond making sure it was outputting feature values into its feature files and was in fact cobbled together in one evening.”

Something that has never been tested before, and sounds like a preliminary prototype, outperformed them all. Since it hasn't been tweaked yet, the system probably has very good potential to generalize, and can probably be tweaked easily to add at least another 1 or 2 percentage points of accuracy to the genre classification results. That's pretty impressive.

Talking about MIREX, I would like to add the following to clarify things I have written in a previous blog post:

I highly value MIREX; it's a driving force behind advances in MIR. I've personally learned a lot from it.

I understand that IMIRSEL has sacrificed a lot to make MIREX happen. It's been an amazing effort organized by Stephen Downie and his team.

I'm sorry my comments on the conflicts of interest issues have been perceived as personal attacks. That was not my intention.

I realize that my previous blog post on the topic should have clearly stated that I'm fully convinced (and always have been) that no one at IMIRSEL had the intention to cheat. I have absolutely no doubts about that.

However, I'm still fully convinced that IMIRSEL submissions should not be listed in the same ranking as the submissions of others.

Tuesday, 18 September 2007

Marsyas & Music Classification

One of the most interesting things I found in the MIREX results so far has been the good performance of George Tzanetakis in different categories. He scored highest in mood classification, and did well in the other classification tasks.

Since George is well known to have published some of the most frequently cited papers on music classification this isn't really interesting news. However, what's really interesting is that George did so well despite using Marsyas. Marsyas is open source and has been around in the MIR community for as long as I can remember. At the ISMIR 2004 and MIREX 2005 evaluations Marsyas didn't do too well (although, afaik it's always been by far the fastest implementation). Perhaps as a result, I've recently been seeing fewer papers on genre classification using Marsyas as a baseline. But given the excellent performance this year, I think it's fair to say it has re-established itself as the baseline for any new music classification algorithm. In fact, it has done so well that I doubt we will see any papers in the near future which can report significant gains compared to this solid baseline. (Btw, never forget to use an artist filter when evaluating genre classification performance!)
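For anyone wondering what an artist filter looks like in practice: the idea is simply that all tracks by the same artist end up on the same side of the train/test split, so a classifier can't score well just by recognizing the artist (or the album production) instead of the genre. Here is a rough Python sketch of such a split. The `(track_id, artist, genre)` tuple layout, the function name `artist_filtered_split`, and the default parameters are assumptions of mine for illustration; this is not how MIREX or Marsyas implement it.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Track = Tuple[str, str, str]  # (track_id, artist, genre), hypothetical layout


def artist_filtered_split(
    tracks: Sequence[Track], test_fraction: float = 0.3, seed: int = 0
) -> Tuple[List[Track], List[Track]]:
    """Split tracks so that no artist appears in both train and test."""
    by_artist: Dict[str, List[Track]] = defaultdict(list)
    for track in tracks:
        by_artist[track[1]].append(track)

    # Sort before shuffling so the result depends only on the seed.
    artists = sorted(by_artist)
    random.Random(seed).shuffle(artists)

    # Hold out whole artists, never individual tracks.
    n_test = max(1, round(len(artists) * test_fraction))
    test_artists = set(artists[:n_test])

    train = [t for a in artists if a not in test_artists for t in by_artist[a]]
    test = [t for a in artists if a in test_artists for t in by_artist[a]]
    return train, test


# Made-up toy collection.
tracks = [
    ("t1", "A", "rock"), ("t2", "A", "rock"), ("t3", "B", "jazz"),
    ("t4", "C", "rock"), ("t5", "C", "jazz"), ("t6", "D", "jazz"),
]
train, test = artist_filtered_split(tracks, test_fraction=0.5)
print(len(train), len(test))
```

Note that this sketch splits by artist only and makes no attempt to keep the genre distribution balanced between the two sets. With a small collection that can matter a lot, so a real evaluation would probably want to stratify by genre as well.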

Overfitting and MIREX

IMIRSEL (the organizer of MIREX) hasn't officially responded yet to the conflicts of interest of organizing a non-transparent evaluation and at the same time participating in it. What I've heard from others is that they don't see any problems with it.

Btw, has anyone else noticed that they won in every classification category where overfitting is a big issue? However, in a very related category (mood classification) where overfitting isn't an issue (thanks to a human component in the evaluation) they were outperformed by several others.

Furthermore, IMIRSEL never had their name put down on the list of potential candidates. Given the lack of transparency of the respective MIREX tasks I think this is something every participant should have known before submitting their work. Btw, so far it isn't even known who the researchers are who actually did the work. AFAIK, no entry so far in the history of ISMIR evaluations has been submitted without mentioning who the authors are.

Btw, as of now, IMIRSEL are the only ones in the genre classification task who haven't published an abstract (describing what their algorithm does and how it was optimized) yet. (They also haven't submitted one yet for the other tasks they won in.)

UPDATE:
Regarding anonymous MIREX submissions, I just remembered that the ISMIR 2004 evaluation hosted by MTG allowed anonymous submissions... and some authors did choose to do so. (However, as I already mentioned in the comments of this post: MTG clearly stated that they did not participate in the tasks they organized to avoid any conflict of interest.)

Monday, 17 September 2007

Vocaloid 2 is a big hit in Japan

Vocaloid 2 is Yamaha software that sings. The $100 software seems to be a big hit now in Japan. Read more about it here and here.

Congratulations to MTG who contributed largely to the development of Vocaloid! Seems like we're a lot closer now to having 5 billion new (anime) songs per week flood the MIR universe.

(Thanks Norman for the pointer!)

MIREX Results Online!

The MIREX results just got posted by Stephen Downie. Interestingly the organizers scored highest in a number of categories. To be honest, if I were a participant in a task like genre classification I’d be a bit suspicious. (Knowing the distribution of the genres beforehand can be a huge advantage when designing an algorithm.)

Congratulations to Tim Pohle and Dominik Schnitzer (two very clever PhD students I once worked together with in Vienna) who scored highest in the audio similarity task. I wouldn’t be surprised if they also had one of the fastest implementations. Tim also scored second highest last year in the same task. And Dominik recently made the results of his Master’s thesis available (open source playlist generation).

Congratulations also to Joan Serrà and Emilia Gomez (a former SIMAC colleague) who scored highest in the cover song identification task.

And congratulations to everyone who participated and the organizers for managing to complete all the tasks before ISMIR!

Listening

The best thing about working in MIR research is that it’s part of the job to spend lots of time listening to music. Which makes me realize that I've been working very hard this weekend ;-)

I spent the last few hours listening to music I found while playing around with a recommender. It's one of those recommenders which takes one of my favorite tracks as input and returns a list of similar tracks. Seeing how amazing some of the recommendations are makes me wonder if I’ll ever again bother to browse lists of similar artists to find new music.

620, 10, 5

Michael Fingerhut announced today on the music-ir list that his complete list of ISMIR papers now contains 620 entries (including this year’s papers). That’s an impressive pile of papers that the ISMIR community has produced since 2000... Btw, there have been only 10 papers so far which contained “recommend” in the title. 5 of those will be presented this year... Given that there will also be a tutorial on recommendation, and that I've mostly been blogging about recommendations recently, I wonder if recommendation is about to establish itself as one of the core topics of MIR?

Monday, 10 September 2007

Good Recommendations (2)

Inspired by Paul’s ongoing evaluation I tried my own tiny little evaluation.

As seed I used Le Volume Courbe. I recently stumbled upon Charlotte while browsing the Last.fm music profiles of friends.

I wanted to find more of the same (unfortunately she only recorded one album), and the most obvious place to start was the Last.fm similar artist list. There’s lots of good music there, but nothing that I enjoyed as much. I also browsed the top listener profiles, and profiles of people who commented on the Last.fm page for Le Volume Courbe. Again I found lots of great music, but not really more of the same.

Next obvious stop was Pandora, but they had never heard of Le Volume Courbe before. So I tried iLike but they didn’t know about any similar artists and ZuKool couldn’t help either. MyStrands had a long list, but after sampling the first two on the list I had the impression that they were pointing me in the wrong direction (too much towards electronic music). Amazon had some interesting recommendations (first time I heard about shoegaze) but not really more of the same. And finally my flatmate recommended some great and related music, but also not really more of the same.

So my preliminary verdict is: either there isn't more of the same out there, or the music recommendation services I tried need to be improved.

UPDATE: I just had a look at the AMG similar artist list. There are some interesting recommendations there, some of which I had already stumbled upon while browsing similar artists on Last.fm, but still nothing that's truly more of the same.

UPDATE Part 2: I just tried the AMG Tapestry Demo suggested by Zac in the comments. It's a lot more convenient than browsing the AMG pages, and it's similar to Last.fm's similar artist radio stations (except that it's only 30-second previews). Nevertheless, there were some recommendations on the list that I appreciated (and hadn't found in the AMG list of similar artists). However, somehow the recommendations seem to be missing some of the "darkness" I like about Le Volume Courbe. Anyway, it's great to have so many nice ways of exploring similar artists.

UPDATE Part 3: I just sampled some of the artists on the list Paul posted in the comments. Whatever system he's using, it's doing a great job in surfacing very unknown artists, some of which are even hardly present on Last.fm, and most of which seem to be present on MySpace (which makes me wonder if his recommendation machine is gathering information from there?). Again, I failed to find more of the same. However, a number of the recommendations were related (in particular, some were related with respect to the lo-fi, singer-songwriter, DIY aspects I like about Le Volume Courbe), and since none of the recommendations in his list had shown up in any of the previous recommendations I had seen (at least as far as I can remember) it was rather refreshing to hear them. It's really nice to see a recommendation machine that has such a strong emphasis on surfacing rather unknown artists.

UPDATE Part 4: Oscar suggested in the comments that I try the Hype Machine, which I did. I found some interesting comments about Le Volume Courbe there, but didn't really find more of the same. I also tried the much hyped SeeqPod, but they not only failed to find music related to Le Volume Courbe, but also gave some not so trustworthy recommendations for Mozart. Others I tried and that didn't have any results were musicmatch and musicplasma.

UPDATE Part 5: Ian mentioned that ZuKool now has Le Volume Courbe in their catalog. I gave it a quick try by adding all the songs from the album into the list (because I didn't like how when I'd only choose one track all other tracks from the same album would show up in the recommendation list). The results were as refreshing as those from Paul's list (none of them I had previously seen in a recommendation list) and they were interesting to listen to. However, I couldn't find more of the same. Btw, getting Celeste Zepponi's "Jesus Is Here" recommended when searching for similar music to Le Volume Courbe suggests that ZuKool completely ignores socio-cultural information, which makes it an interesting alternative to all other music recommenders I use.

Saturday, 8 September 2007

Good Recommendations

Paul launched a very interesting survey on music recommendations. The results will be presented at their tutorial in 2 weeks at ISMIR and I'm sure presentation slides will be available online after that. I highly recommend participating :-)

Trying to answer the questions I realized how difficult it can be to recommend music given just one artist. Would it be good to recommend someone a rather unknown (= not so popular) artist when they are looking for something similar to an extremely popular artist (like The Beatles)? Or would it be better to recommend similar artists which are also very popular? Btw, in the case of The Beatles, would it really make sense to recommend John Lennon and other members of the group? And if someone is looking for music similar to a not so well known artist, would it make sense to recommend similar but popular artists? Or is it safe to assume that this person already knows these?

Evaluating recommendations is another very interesting topic… and I’m very curious what the outcomes of Paul’s survey will be.
