Information Organization: A case study in music recommendations

Summary

I introduce “information organization”, an approach that I have been exploring for several years. As a case study, I look at music recommendations, which would benefit from being organized but which existing applications organize poorly. I discuss issues with current applications, and features that address these issues.

Background

Information organization is a family of patterns for collecting, presenting, and navigating information that is structured and/or textual. These patterns draw upon ideas from NLP, ML, IR, UX, and visualization, and in general they can be implemented as loosely-coupled components. Combining the patterns leads to potent cross-interactions that add up to more than the sum of the individual patterns. This will become clearer when we dive into the case study. These patterns are non-trivial to implement, and require some savvy in NLP or IR. For people knowledgeable in these arts, there is an opportunity to differentiate your offering and create unique value by implementing these patterns.

As a family of patterns, information organization is not tied to any particular application. Rather, it admits a variety of related applications that can build upon implemented patterns. In this case study, I’ll be talking about one application: organizing music recommendations.

I have been developing this concept of information organization for several years. Some of it has been implemented, but much of it has been designed and not yet implemented.

The problem

People enjoy sharing music recommendations, but lack an effective platform for sharing them. Here I discuss sharing music preferences in numerical form (scores) and textual form (reviews). I ignore the question of sharing the audio itself.

Here are some current approaches:

Directly make a recommendation to a friend, online or off. Problem: Inherently transient and non-archival form of sharing. Also, there is no way for me to get recommendations from new people.
Listen to the radio, online or off. Problem: Not social. Inherently transient and non-archival form of sharing.
Write a blog article, for my own blog or for a larger publisher (e.g. Pitchfork), with my review of the music. Problem: Social features are limited. Discussion of particular songs is fragmented across publishers, which means that recommendations are not being effectively shared.
Share a YouTube link on Facebook, and discussion ensues. Problem: The discussion is confined to my social circle, and I don’t have a mechanism for connecting with people outside my circle with whom I would nonetheless like to share music recommendations. Also, the historical archive is not accessible, and previous recommendations are not searchable.

So what we’re getting at is a music recommendation system that has social, archival, and recommendation features.

Approach

Here’s an application that could solve these problems. Click on the mockup image for a larger view of it.

Note that I am going to talk about many possible features for this application. If you are going to implement something, you should focus on core features. I talk about a variety of possible features in order to illustrate information organization patterns that are technically feasible but not yet commonplace.

At its core, there are two kinds of user activity:

Navigating recommendations.
Adding recommendations.

I’ll focus on navigation, since navigation suggests many of the most important features.

As an example of navigation, consider viewing the recommendations for a particular song. This page will contain different recommendations for the song. These recommendations are summarized as expandable text snippets, and are presented in ranked order. For example, if you have reviewed this song, your recommendation ranks at the top. Recommendations from your friends rank higher, as do recommendations from people with similar taste to you. (Some users care more about the taste of their friends, and some care more about the taste of people with similar tastes. In each case the ranking should reflect that.) Less important, but nonetheless useful, is the objective “authority” of the source. For example, if there is not enough social or personal information to rank the recommendations, then recommendations by well-respected publications like Pitchfork rank higher than recommendations by unknown critics.
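The ranking described above could be sketched roughly as a weighted score. This is only an illustration under my own assumptions: the field names, the default weights, and the linear scoring function are all hypothetical, and a real system would need to estimate taste similarity and authority from data.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    author: str
    text: str
    is_own: bool = False           # written by the viewing user
    is_friend: bool = False        # author is in the viewer's social graph
    taste_similarity: float = 0.0  # 0..1, assumed to be estimated elsewhere
    authority: float = 0.0         # 0..1, viewer-independent reputation


def rank_score(rec, friend_weight=1.0, similarity_weight=1.0,
               authority_weight=0.25):
    """Score one recommendation for one viewer (hypothetical linear model)."""
    if rec.is_own:
        # The viewer's own review always ranks first.
        return float("inf")
    return (friend_weight * (1.0 if rec.is_friend else 0.0)
            + similarity_weight * rec.taste_similarity
            + authority_weight * rec.authority)


def rank_recommendations(recs, **weights):
    """Return recommendations sorted from highest to lowest score."""
    return sorted(recs, key=lambda r: rank_score(r, **weights), reverse=True)
```

The per-user weights are how the ranking can reflect whether someone cares more about friends or about people with similar taste: raising similarity_weight relative to friend_weight moves like-minded strangers above friends. The small authority_weight means authority mostly matters when the social and personal signals are absent.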

Another aspect of navigating is search. Search should implement auto-complete and auto-suggest. Content should be auto-tagged based upon existing music meta-data as well as reviews, so that searching for “bounce” will find all bounce tracks, even if the term “bounce” is not explicitly mentioned in any review of a particular track. Auto-tagging can be smoothed across different tracks by the same artist, as well as different tracks that have reviews that contain the same keywords.
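As a rough sketch of what smoothing auto-tags across tracks by the same artist might look like, the code below mixes each track's own tag weights with the average tag weights of its artist. The function names, the mix parameter, and the threshold are hypothetical, and this is only one of many ways the smoothing could be done.

```python
from collections import defaultdict


def smooth_tags_by_artist(track_tags, track_artist, mix=0.5):
    """Blend each track's tag weights with its artist's average tag weights.

    track_tags: {track: {tag: weight}}, track_artist: {track: artist}.
    mix=0 keeps the original tags; mix=1 uses only the artist average.
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for track, tags in track_tags.items():
        artist = track_artist[track]
        counts[artist] += 1
        for tag, weight in tags.items():
            totals[artist][tag] += weight

    smoothed = {}
    for track, tags in track_tags.items():
        artist = track_artist[track]
        artist_avg = {t: w / counts[artist] for t, w in totals[artist].items()}
        smoothed[track] = {
            t: (1 - mix) * tags.get(t, 0.0) + mix * artist_avg.get(t, 0.0)
            for t in set(tags) | set(artist_avg)
        }
    return smoothed


def search_by_tag(smoothed, tag, threshold=0.2):
    """Return tracks whose smoothed weight for `tag` meets the threshold."""
    hits = [(tags.get(tag, 0.0), track) for track, tags in smoothed.items()]
    return [track for weight, track in sorted(hits, reverse=True)
            if weight >= threshold]
```

With this kind of smoothing, a track that nobody has explicitly tagged or reviewed as “bounce” can still surface in a search for “bounce” if the artist's other tracks carry that tag. The same idea could be applied across tracks whose reviews share keywords, which is not shown here.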

Another aspect of navigating is finding related entities (entity = song, musician, genre, tag, etc.). Besides seeing popular songs by the same musician, it is also useful to see popular songs in the same genre, related tags, etc. Auto-tagging helps again here to figure out how related two entities are.
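One simple way to use tags to estimate how related two entities are is the Jaccard overlap of their tag sets. The sketch below assumes that every entity (song, musician, genre, etc.) already has a set of tags; the function names and the top_n parameter are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_entities(target, entity_tags, top_n=3):
    """Return up to top_n entities sharing tags with `target`, best first.

    entity_tags: {entity_name: set_of_tags}. Entities with no tag overlap
    are left out.
    """
    scores = [(jaccard(entity_tags[target], tags), name)
              for name, tags in entity_tags.items() if name != target]
    # Highest similarity first; ties are broken alphabetically by name.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [name for score, name in scores[:top_n] if score > 0]
```

Because the comparison only looks at tags, it works across entity types: a song can be related to a musician or a genre as long as they share tags.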

A separate issue is that, as far as I know, there is no portable open data format for recording numerical preferences about an entity. Simply formalizing the exchange of preference information (not just for music, but for any type of entity) would be a big deal.
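To make the idea concrete, here is a sketch of what a single record in such a format might contain, serialized as JSON. This is purely a hypothetical schema of my own, not an existing standard: every field name is an assumption.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PreferenceRecord:
    user: str               # identifier of the person expressing the preference
    entity_type: str        # e.g. "song", "musician", "genre"
    entity_id: str          # identifier of the entity in some catalog
    score: float            # the numerical preference
    scale_min: float = 0.0  # lowest possible score on the source's scale
    scale_max: float = 1.0  # highest possible score on the source's scale
    review: str = ""        # optional textual review

    def normalized(self):
        """Map the score onto 0..1 so different rating scales are comparable."""
        return (self.score - self.scale_min) / (self.scale_max - self.scale_min)


def dumps(record):
    """Serialize a record to a JSON string."""
    return json.dumps(asdict(record), sort_keys=True)


def loads(text):
    """Parse a JSON string back into a record."""
    return PreferenceRecord(**json.loads(text))
```

Recording the scale alongside the score is the main design choice here: a 4 out of 5 from one site and an 8.0 out of 10 from another can then be exchanged and compared without guessing.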

Recap

We have discussed a handful of different components (ranking based upon the social graph, navigating based upon related music, etc.). These features are non-trivial to implement, and require some NLP or IR savvy. The challenge in implementing these features poses an opportunity to those who can. The more features are implemented, the more value is created by their interactions, so the application can shift to a higher echelon of quality. But there is clearly value in a music recommendation system that has only a partial feature set. Which features are the easiest to implement and create the most value upfront? I believe that social features, and integration with Facebook and/or Twitter, can add a lot of value in terms of creating engagement.

What is the minimum viable product for this task?
Possible answers:

You log in with Twitter, and type a review of a song or a band. This will be auto-tweeted, but also added to the recommendation page for this song or band.
Scrape web reviews and recommendations. Extract a summary text snippet for each. (Note that this is a non-trivial feature.) Aggregate these snippets on the site.

A benefit of the first approach is that it is inherently social. A benefit of the second approach is that it combats the “cold start” problem, i.e. it immediately populates the database with useful information.
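For the second approach, a deliberately naive baseline for extracting a summary snippet is to pick the sentence of a review that mentions the most query terms (for example, words from the song title). The function name and parameters below are hypothetical, and a production-quality snippet extractor would need to be considerably more sophisticated than this.

```python
import re


def summary_snippet(review_text, query_terms, max_chars=160):
    """Return the sentence mentioning the most query terms, truncated."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", review_text)
                 if s.strip()]
    if not sentences:
        return ""
    terms = {t.lower() for t in query_terms}

    def score(sentence):
        words = re.findall(r"[a-z0-9']+", sentence.lower())
        return sum(1 for w in words if w in terms)

    # max() keeps the earliest sentence when several have the same score.
    best = max(sentences, key=score)
    if len(best) > max_chars:
        best = best[:max_chars - 3].rstrip() + "..."
    return best
```

Even a baseline like this is enough to populate a page with snippets from scraped reviews, which can then be refined as needed.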

I am curious what you think is a good minimum viable product. Is personalization of the view part of the core feature set?

About the author

I consult on data strategy (NLP, ML, business intelligence, etc.).
If you are interested in building out any of these ideas, get in touch with me and I can help. In particular, I can advise on how to build the ML + NLP components. I’ll help you get a practical prototype up and running quickly, and show you how to refine and improve the components as necessary. Part of the art in implementing information organization is identifying the components that add the most immediate value, and quickly implementing a solid baseline.