The role of diversity in recommender systems for public broadcasters


by Veronika EICKHOFF, BR on 27 Apr 2017

Based on:

  1. "Understanding the role of latent feature diversification on choice difficulty and satisfaction" by Martijn C. Willemsen, Mark P. Graus and Bart P. Knijnenburg.
  2. "Recommender Systems for Self-Actualization" by Bart P. Knijnenburg, Saadhika Sivakumar and Daricia Wilkinson.
  3. "Matrix Factorization Techniques for Recommender Systems" by Yehuda Koren, Robert Bell and Chris Volinsky.
  4. "Recommender Systems Handbook", Section 8.3.8, by Guy Shani and Asela Gunawardana.

Why diversify recommendations

People like to choose from large sets, but large sets often contain similar items, causing choice overload. Increasing diversity can lead to more attractive and satisfactory results even with smaller sets, thereby reducing choice difficulty.

To be precise, the goal is to diversify a set of recommended items while controlling for the overall quality of the set.

Tests of the diversification algorithm against traditional Top-N recommenders conducted in [1] show that diverse, small item sets are just as satisfying as, and less effortful to choose from than, Top-N recommendations. While diversification might reduce the average predicted quality of recommendation lists, the increased diversity might still result in higher satisfaction because of the reduced choice difficulty. In addition, relying only on the highest predicted relevance can mean ignoring other factors that influence user satisfaction. Experiments with user surveys show that, despite the lower precision and recall of the diversified recommendations, diversification has a positive effect on users’ perception of the quality of the item sets produced by the recommender algorithm.

Why diversity is especially important in recommendations for public broadcasters

As public broadcasters, our mission is to educate our audiences, to extend their potential areas of interest and to inform in a balanced way. Recommending only the content that best matches a user's existing taste might constrain them into a filter bubble, thereby failing on all three goals (an extreme example would be recommending only content that confirms a person's existing political leaning).

Stated positively, diversifying recommendations would be a good first step towards living up to these values, a step whose effect we aim to verify with user surveys.

In addition, while commercial organisations depend on user retention and might thus be reluctant to offer content that does not appear to be the best possible match, we can afford to do so and are even expected to.

While recommendation techniques are being employed in more and more online services, most research concentrates on developing top-N-style algorithms, with very little emphasis put on goals beyond accuracy. The authors of [2] propose the development of Recommender Systems for Self-Actualisation: “personalised systems that have the explicit goal to not just present users with the best possible items, but to support users in developing, exploring, and understanding their own unique tastes and preferences”. In particular, the authors give a sociological motivation for our goals as public broadcasters: “a deep understanding of one’s own tastes is important for cultural diversity—we want people to make lifestyle choices (e.g., music, movies and fashion) based on carefully developed personal tastes, rather than blindly followed recommendations”.

Diversification of matrix-factorisation-based collaborative filtering

The idea is to diversify recommendations along the items' latent features*. Because latent feature diversification provides maximum control over item quality and item set variety on an individual level, it can increase the diversity (and thus reduce the choice difficulty) of an item set while maintaining perceived attractiveness and satisfaction.

*Latent features are the per-item output of matrix-factorisation-based collaborative filtering. They make it possible to relate items to each other along several abstract (latent) dimensions, which are intended to approximate true characteristics of the items. See Figure 2 in [3] for a simplified illustration of the latent factor approach.
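As a minimal illustration (the vectors and numbers below are made up, not taken from [3]), a predicted rating in such a model is simply the dot product of a user's and an item's latent feature vectors:

    import numpy as np

    # Hypothetical 3-dimensional latent feature vectors, as a matrix
    # factorisation model might learn them (numbers are made up).
    user_features = np.array([0.8, -0.2, 0.5])
    item_features = np.array([0.6, 0.1, -0.3])

    # The predicted rating/relevance is the dot product of the two vectors.
    predicted_rating = float(np.dot(user_features, item_features))
    print(predicted_rating)  # 0.8*0.6 + (-0.2)*0.1 + 0.5*(-0.3) = 0.31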

Like classical collaborative filtering, the diversification algorithm gives higher scores to content liked by users similar to the subject user (in the case of implicit feedback, "watched the longest" implies "liked"). The difference is that the diverse algorithm selects a subset of the most different or distant items (see the next section) among the highest-scored ones, so the resulting set is diverse while still feeling relevant.

The diversification algorithm can reuse the same model that matrix-factorisation-based collaborative filtering learned. Upon a recommendation request, the latent features of the items and the user are retrieved from the model and a predicted rating of each item for the user is calculated. Then, in contrast to the classical collaborative algorithm, which returns random or top-N recommendations, the diversification algorithm finds the N most diverse items to recommend. This is done by first selecting the top item from the initial recommendation set and then iteratively adding items to the selection based on their distance to the rest of the current selection, as sketched below.
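A minimal sketch of this greedy selection, assuming the Manhattan distance defined in the next section and using the average distance to the items already selected (the function and variable names are our own, not the PEACH API):

    import numpy as np

    def manhattan(a, b):
        """Manhattan (L1) distance between two latent feature vectors."""
        return float(np.sum(np.abs(a - b)))

    def diversify(candidates, item_features, n):
        """Greedy diversification sketch.

        candidates: item ids sorted by predicted rating, highest first.
        item_features: dict mapping item id -> latent feature vector (np.array).
        n: number of items to recommend.
        """
        # Start with the highest-scored item.
        selection = [candidates[0]]
        remaining = list(candidates[1:])

        while len(selection) < n and remaining:
            # Add the remaining item that is, on average, furthest away
            # from the items already selected.
            best = max(remaining,
                       key=lambda item: sum(manhattan(item_features[item],
                                                      item_features[s])
                                            for s in selection) / len(selection))
            selection.append(best)
            remaining.remove(best)
        return selection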

The mathematical notion of diversity

The distance between items is defined using Manhattan distance:
d(a, b) = Sum(k=1,D)(|a_k - b_k|), with a, b as items and k iterating over the D latent features.
As argued in [1], this distance metric is the most suitable, as it ensures that differences along different latent features are considered in an additive way and that large distances along one feature cannot be compensated by shorter distances along other features. This means that two items differing by one unit along two dimensions are considered as different as two items differing by two units along one dimension. According to [1], this is more in line with how people perceive differences between choice alternatives with real attribute dimensions.
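A short numerical illustration of that point (the coordinates are made up), comparing Manhattan and Euclidean distances on the two cases just described:

    import numpy as np

    a = np.array([0.0, 0.0])
    b = np.array([1.0, 1.0])   # differs by one unit along two features
    c = np.array([2.0, 0.0])   # differs by two units along one feature

    print(np.sum(np.abs(a - b)), np.sum(np.abs(a - c)))   # Manhattan: 2.0, 2.0 -> equally different
    print(np.linalg.norm(a - b), np.linalg.norm(a - c))   # Euclidean: ~1.41, 2.0 -> not equally different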

The diversity of a recommendation set X is defined as the average, over all latent features, of the range between the highest and the lowest score within the set along that feature. With i ∈ X, i_k the score of item i on feature k, and D the number of latent factors (dimensions):
Diversity(X) = Sum(k=1,D)((max(i_k) - min(i_k)) / D), i ∈ X.

This notion of diversity is called AFSR (Average Feature Score Range).
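As a sketch, AFSR can be computed directly from the latent feature scores of a recommendation set (the array layout below is an assumption on our part):

    import numpy as np

    def afsr(item_features):
        """Average Feature Score Range of a recommendation set.

        item_features: array of shape (num_items, num_latent_factors),
        one row of latent feature scores per recommended item.
        """
        # Per-feature range (max - min) across the set, averaged over features.
        ranges = item_features.max(axis=0) - item_features.min(axis=0)
        return float(ranges.mean())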

Experimental evaluation

The authors of [1] conducted several studies to understand how latent feature diversification affects users' perception of recommendations, how diversity relates to choice difficulty and choice satisfaction, and to measure perceived diversity and the effort required to choose items.

As mentioned in the first section, the authors obtained promising results, which motivates us to repeat the evaluation in our own context.

Diversification algorithm as part of PEACH

The PEACH (Personalisation for EACH) project is a software solution empowering broadcasters and editorial teams to deliver personalised media services and experiences. The solution includes not only the architecture required to collect, store and query data, but also a set of custom data science tools and recommendation algorithms. The diversification algorithm is implemented and offered as part of the platform.

To learn more about PEACH, visit peach.ebu.io and stay tuned as PEACH will be the subject of an upcoming article.


