Tuesday, 10 April 2007


Last week I was lucky enough to hear invited talks by Sebastian Streich and Bee Suan Ong (both currently working at Yamaha in Japan). Bee Suan talked about some of her work on the structural analysis of music. I wouldn't be surprised if her approach would easily have scored highest in last year's MIREX cover song identification task. Unfortunately she was too busy to participate (finishing her PhD, getting married, moving to Japan, ...). Sebastian's talk was also very interesting. Complexity in music is a fascinating topic in general, and he wrote a nice discussion of it in chapter 2 of his thesis. What I found most interesting in his talk was the following figure (page 90 in his thesis):

Basically, it shows how the danceability feature he extracts from audio correlates with descriptors you find on AMG (All Music Guide). I find the correlations rather impressive. The curves are very smooth (he does not display any measure of the variance, but the smoothness suggests that it is low enough), and the correlations are meaningful. For example, party music is more danceable than romantic music. I think his graph is a very intuitive way to communicate what his extracted feature does and how it relates to the concepts we use when talking about music. Personally, I find it orders of magnitude more useful than an evaluation showing that if you include a certain feature in a black-box genre/mood/style/whatever classifier, the performance increases by 5%.
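
To make the idea concrete, here is a minimal sketch of how a curve like this could be computed. It is not Sebastian's actual procedure (I have not checked how exactly he built the figure); it simply assumes you sort tracks by a feature value, split them into equal-sized bins, and look at the fraction of tracks per bin that carry a given descriptor. The function name `tag_fraction_by_bin` and the synthetic "party" and "romantic" labels are made up for illustration.

```python
import random


def tag_fraction_by_bin(values, has_tag, n_bins=5):
    """Sort tracks by feature value, split them into n_bins equal-sized
    bins, and return the fraction of tagged tracks in each bin
    (lowest feature values first)."""
    if len(values) != len(has_tag):
        raise ValueError("values and has_tag must have the same length")
    if len(values) < n_bins:
        raise ValueError("need at least one track per bin")
    order = sorted(range(len(values)), key=lambda i: values[i])
    fractions = []
    for b in range(n_bins):
        start = b * len(order) // n_bins
        stop = (b + 1) * len(order) // n_bins
        members = order[start:stop]
        fractions.append(sum(has_tag[i] for i in members) / len(members))
    return fractions


# Synthetic data (an assumption, not real AMG labels): the chance of a
# "party" tag grows with the feature, the chance of "romantic" shrinks.
rng = random.Random(0)
danceability = [rng.random() for _ in range(2000)]
party = [rng.random() < d for d in danceability]
romantic = [rng.random() < 1.0 - d for d in danceability]

party_curve = tag_fraction_by_bin(danceability, party)
romantic_curve = tag_fraction_by_bin(danceability, romantic)
print("party:   ", [round(f, 2) for f in party_curve])
print("romantic:", [round(f, 2) for f in romantic_curve])
```

Plotting one such curve per descriptor gives a picture in the spirit of the figure: a rising curve means the descriptor becomes more common as the feature increases. With real data you would also want some measure of spread per bin, which is exactly what the smoothness of the curves only hints at.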
