Tuesday, 6 May 2008

Applying audio-based similarity

Applications of audio-based similarity still seem rather rare nowadays. Once in a while a startup claims to be using it for recommendations, but a look at their results might suggest that they are just using metadata instead (see, for instance, this example which Paul recently blogged about).

Anyway, here is an example where 100% pure audio similarity is being used: FM4 Soundpark. The feature launched today. For anyone who doesn't read German: Soundpark is the number one place for new Austrian artists to present their music online. It has been around for ages, long before Myspace or similar sites became popular. Soundpark has a devoted community and is well integrated into one of the most popular radio stations in Austria (FM4, which targets a younger audience that favors alternative/indie music).

However, attention is usually paid only to new releases, and older ones can get buried quickly. To help users navigate and find content, researchers from the legendary OFAI (including Martin Gasser and Arthur Flexer), in collaboration with Gerhard Widmer's department of computational perception (including Dominik Schnitzer), have helped FM4 integrate audio similarity into their system. There is a bit more information on OFAI's page here, and some more information in German can be found on orf.at.

For each song on Soundpark, users get three acoustically similar songs, and there seems to be a feature for creating a playlist by choosing a start and an end song (but I haven't found that feature yet).

Soundpark hosts about 5000 artists, each with a few tracks on average. (If I'm not mistaken, they have about 8800 tracks.) They are constantly growing. According to the press releases, they plan to add further features that make it easier to navigate their content and discover interesting artists. Great news! It makes me especially happy to know that it's former colleagues who are building this.


jeremy said...

Very cool! Any sense of how well it works?

And any idea what kind of acoustical similarities they are using? In my experience, there tend to be two different types of acoustically-motivated similarity: Raw audio acoustics, such as timbre and vocal signatures, and musicological acoustics, such as metric structure, rhythm, harmonies, etc.

Any idea which kind, or both, is being used?

Elias said...


I've only tried a few examples and the results seemed reasonable.

I don't know exactly what they use, but Martin said it's purely timbre-based and ignores tempo and rhythm information. I'm guessing it's similar in nature to last year's MIREX submission by Tim Pohle and Dominik Schnitzer for the similarity contest. They mention that it finds similarities beyond genre borders.

At this url you can see 3 similar tracks it found to one of the songs from Wasabi Enterprise (an underrated Austrian guitar rock band).

The first one is from the same group, the other two (Seelenwaermer and Das Hawaii) are somewhat similar in sound.

Martin Gasser said...


Currently we use timbre similarity based on MFCC statistics. However, we are thinking about improving the similarity measure by incorporating (a) other aspects of the audio signal (rhythm/tempo seems to be quite important for many Soundpark users) and (b) metadata and/or listening behavior.
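For readers unfamiliar with the idea, here is a minimal sketch of one common way to turn MFCC statistics into a timbre distance. It is not necessarily what Soundpark runs. The single-Gaussian model per track, the symmetric Kullback-Leibler divergence, the function names, and the synthetic "MFCC" frames are all assumptions made for illustration; real MFCC frames would come from an audio feature extractor.

```python
import numpy as np

def gaussian_stats(mfcc_frames):
    """Summarize a track by the mean and covariance of its MFCC frames.

    mfcc_frames has shape (n_frames, n_coeffs).
    """
    mean = mfcc_frames.mean(axis=0)
    cov = np.cov(mfcc_frames, rowvar=False)
    # A small ridge keeps the covariance matrix invertible.
    cov = cov + 1e-6 * np.eye(cov.shape[0])
    return mean, cov

def symmetric_kl(stats_a, stats_b):
    """KL(a||b) + KL(b||a) between two Gaussians (log-det terms cancel)."""
    mean_a, cov_a = stats_a
    mean_b, cov_b = stats_b
    inv_a = np.linalg.inv(cov_a)
    inv_b = np.linalg.inv(cov_b)
    diff = mean_a - mean_b
    dim = mean_a.shape[0]
    return 0.5 * (np.trace(inv_b @ cov_a) + np.trace(inv_a @ cov_b)
                  + diff @ (inv_a + inv_b) @ diff) - dim

def most_similar(query_index, all_stats, k=3):
    """Indices of the k tracks closest to the query, excluding the query."""
    distances = [symmetric_kl(all_stats[query_index], s) for s in all_stats]
    order = np.argsort(distances)
    return [int(i) for i in order if i != query_index][:k]

# Synthetic stand-in for MFCC frames: tracks 0-1 share one "timbre"
# (frames centered at 0), tracks 2-4 another (frames centered at 5).
rng = np.random.default_rng(0)
centers = [0.0, 0.0, 5.0, 5.0, 5.0]
stats = [gaussian_stats(rng.normal(c, 1.0, size=(500, 13))) for c in centers]

print(most_similar(0, stats, k=3))
```

Note that a distance like this looks only at the overall distribution of spectral shapes in a track, so it carries no tempo or rhythm information.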

It's well known that pure timbre-based similarity corresponds quite well to the notion of genre for some genres (e.g. HipHop, Metal), while it is more likely to fail for others (Jazz, ...). We are curious how Soundpark users will receive this and how much importance they attach to the possibility of finding new music beyond genre borders (we assume that they will love it!). Results will be published.


P.S.: Elias, I think you mistyped the link to the underrated Austrian guitar rock band ;-)

jeremy said...

Many thanks for the updates, Martin!