SoundAnchoring

An interface for organizing, visualizing and exploring music collections that groups tracks according to acoustic similarity.

The challenge

How do you organize and browse your music collection? Do you use applications such as iTunes, Windows Media Player or Winamp? These text-based applications let you interact with your collection through metadata (artist, album or track name, duration, release year, genre, etc.). While text-based interfaces work well if you know exactly what you want to listen to, e.g., “Eine Kleine Nachtmusik” by Mozart, they do not help much if you have no specific track in mind, want to listen to tracks similar to a given one, or want to explore a music collection serendipitously. Building a playlist for a certain occasion, e.g., commuting, working or studying, with a text-based interface can be both tedious and time-consuming.

A solution

To address these issues, I developed SoundAnchoring, a content-based interface for iPads that maps a music collection onto a two-dimensional grid. Similar tracks are close to each other on the grid, whereas dissimilar ones are farther apart.

SoundAnchoring interface
SoundAnchoring maps the music collection to a grid. Users personalize the grid by choosing where clusters of similar music are located, setting the colour-genre mappings, and using genres to filter which tracks are displayed and available for interaction.

In this interface, users can build playlists by tapping on one of the squares of the grid, which brings up a list of the tracks mapped to that square, and then selecting the desired track. Alternatively, users can move one finger over the grid to add multiple tracks to the playlist (sketching gesture).
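To make the sketching gesture more concrete, here is a rough Python sketch of how a finger path could be turned into playlist entries. It is only an illustration under stated assumptions, not the app's actual code: the function names, the view and grid sizes, and the rule of taking one track per visited square (`per_cell`) are all hypothetical.

```python
def cells_along_sketch(points, grid_size, view_size):
    """Map touch points (x, y) to the distinct grid squares they cross, in order."""
    rows, cols = grid_size
    width, height = view_size
    seen, cells = set(), []
    for x, y in points:
        # Scale view coordinates to grid indices and clamp to the grid bounds.
        col = min(cols - 1, max(0, int(x / width * cols)))
        row = min(rows - 1, max(0, int(y / height * rows)))
        if (row, col) not in seen:
            seen.add((row, col))
            cells.append((row, col))
    return cells


def tracks_for_sketch(points, grid_size, view_size, tracks_by_cell, per_cell=1):
    """Collect up to `per_cell` tracks from every square the finger passed over.

    `per_cell` is an assumption made for this example only.
    """
    playlist = []
    for cell in cells_along_sketch(points, grid_size, view_size):
        playlist.extend(tracks_by_cell.get(cell, [])[:per_cell])
    return playlist
```

Deduplicating the squares keeps a slow finger movement, which produces many touch points inside the same square, from adding the same tracks repeatedly.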

The Self-Organizing Map (SOM) algorithm is commonly used in content-based interfaces to group tracks according to acoustic similarity. The original algorithm, however, does not allow users to determine where clusters of similar tracks end up on the grid. I devised a variation of the SOM algorithm in which users choose the locations of anchor tracks on the grid. The novel algorithm then positions acoustically similar tracks in the neighbourhood of each anchor.
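The general idea of anchoring can be illustrated with a small Python sketch. This is not the published algorithm. It is a plain SOM in which, as an assumption made for this example, the grid squares chosen as anchors are simply reset to the feature vector of their anchor track after every update. The function names, the learning-rate and neighbourhood schedules, and the toy feature vectors are all hypothetical.

```python
import math
import random


def best_cell(weights, x):
    """Return the (row, col) of the square whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_anchored_som(features, anchors, rows=4, cols=4, epochs=20, seed=0):
    """Train a small SOM; `anchors` maps (row, col) -> index of the anchor track."""
    rng = random.Random(seed)
    dims = len(features[0])
    weights = [[[rng.random() for _ in range(dims)] for _ in range(cols)]
               for _ in range(rows)]

    def pin():
        # Assumption: anchor squares always hold their anchor track's features.
        for (r, c), track in anchors.items():
            weights[r][c] = list(features[track])

    pin()
    total = epochs * len(features)
    step = 0
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            frac = step / total
            lr = 0.5 * (1.0 - frac)                                 # decaying learning rate
            sigma = max(0.5, max(rows, cols) / 2.0 * (1.0 - frac))  # shrinking neighbourhood
            br, bc = best_cell(weights, features[i])
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood centred on the best-matching square.
                    h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for k in range(dims):
                        w[k] += lr * h * (features[i][k] - w[k])
            pin()
            step += 1
    return weights


# Toy usage: four 2-D "tracks", two of them anchored to opposite corners.
features = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_anchored_som(features, anchors={(0, 0): 0, (3, 3): 2})
print(best_cell(som, features[0]), best_cell(som, features[1]))
```

Because the pinned squares still take part in the neighbourhood updates of the squares around them, tracks that resemble an anchor tend to be pulled toward the region the user chose for it.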

In order to determine which algorithm could evoke better user experiences, I designed and conducted an evaluation with user participation. Please keep reading to learn more.

Evaluation

In the evaluation, twenty-one participants performed tasks in two systems loaded on two iPads: the control system (CS), which features the original SOM algorithm, and the proposed system (PS), which is based on the novel algorithm. The visual elements in both systems were identical, as the goal was to evaluate the impact of the algorithms on the user experience.

Participants were recruited from the university community through invitations posted throughout the campus buildings. Although no financial compensation was offered, individuals were enthusiastic about interacting with a new interface for music exploration.

Evaluation outline

The evaluation took place in a prepared office room. Even though headphones were available, participants could bring and use their own equipment.

After reading and signing the user research consent form, each individual answered a questionnaire regarding their background, music habits, and experience with applications for audio collection exploration and touch-based devices.

Next, participants were randomly assigned to start working with either PS or CS to compensate for order effects. Participants had to perform tasks on each system. After interacting with each system, subjects rated eighteen statements concerning control, sensory and distraction factors using a 6-point scale. Two versions of the set of statements were used for each subject to minimize acquiescence bias. Subjects wrote about their impressions of each system as well. All user interactions were logged.

Tasks

The first task was conceived to raise awareness of the mapping of acoustically similar tracks to the same square or neighbouring squares of the grid. Moreover, this task helped participants get acquainted with the 700-track dataset used in the study. Participants were required to tap on one square of the grid, then listen to the tracks of that square and to tracks from adjacent squares. These steps had to be repeated for two other squares, distant from the first square and from each other.

The second task entailed the creation of a playlist. Slips of paper, each containing one of the scenarios listed below, were placed face-down:

  • working in the office
  • jogging
  • romantic dinner
  • working out at the gym
  • celebrating the end of the term
  • driving home after work
  • cleaning the house
  • car trip to favourite destination
  • riding the bus to school to take the final exam
  • relaxing at home

A scenario sets the stage for the task and motivates the participant. Participants were asked to pick one slip of paper and build a playlist of at least 30 minutes containing a minimum of 3 genres that would be suitable for the scenario described.

Data collection and analysis

Data comprised subjective and objective measures. Subjective measures correspond to statement ratings, whereas objective measures refer to logged user interactions.

I analyzed metrics related to the use of the sketching gesture to build playlists: the number of tracks added to the playlist via sketching, the number of those tracks later removed, and the ratio between the two. These metrics suggest that tracks added to the playlist via sketching on PS were more suitable than tracks added through the same gesture on CS. In other words, the proposed system helped individuals create playlists more effectively when the sketching gesture was used.
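As an illustration of how such metrics can be derived from interaction logs, here is a short Python sketch. The log format (chronological `(action, track_id, method)` tuples) and the function name are assumptions made for this example. The logs collected in the study are not described at this level of detail here.

```python
def sketch_metrics(events):
    """Count sketch-added tracks, how many were later removed, and their ratio.

    `events` is a chronological list of (action, track_id, method) tuples,
    e.g. ("add", "t1", "sketch") or ("remove", "t1", None). Hypothetical format.
    """
    sketched = set()  # tracks currently in the playlist that came from sketching
    added = removed = 0
    for action, track, method in events:
        if action == "add" and method == "sketch":
            added += 1
            sketched.add(track)
        elif action == "remove" and track in sketched:
            removed += 1
            sketched.discard(track)
    ratio = removed / added if added else 0.0
    return {"added": added, "removed": removed, "removal_ratio": ratio}
```

Under this reading, a lower removal ratio on one system indicates that participants kept more of the tracks that sketching gave them, i.e. that those tracks suited the playlist better.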

PS outperformed CS in the issues addressed by the following statements:

  • It was possible to get involved in the experiment to the extent of losing track of time
  • I felt proficient in interacting with the interface at the end of the experiment
  • Getting the system to do what I wanted was easy

These ratings suggest that PS was perceived as more engrossing and easier to control than CS.

For further information, please refer to “Personalizing self-organizing music spaces with anchors: design and evaluation”, published in Multimedia Tools and Applications (Springer).