# Content-based recommendation
Main idea: Recommend to customer $u$ items similar to the previous items $\mathcal{F}_{u}$ that $u$ rated highly
Movie recommendation: Recommend movies with the same actor(s), director, genre, ...
Websites, blogs, news: Recommend other sites with "similar" content
## Item profiles
For each item, create an item profile $x_{i}$
Profile is a set (vector) of features
- Movies: author, title, actor, director,...
- Text: set of "important" words in document
How to pick important features?
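- The usual heuristic from text mining is the [[TF-IDF]] weight (term frequency × inverse document frequency): words that are frequent in a document but rare across the collection score highest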
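
As a rough illustration of how TF-IDF can turn documents into item profiles, here is a minimal sketch in plain Python. It assumes one common variant of the weight (term frequency normalized by the most frequent term in the document, and a base-2 log for the inverse document frequency); the function name `tf_idf` and the toy documents are made up for the example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / max count in the document,
    idf = log2(N / number of documents containing the term).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    profiles = []
    for doc in docs:
        counts = Counter(doc)
        if not counts:  # empty document -> empty profile
            profiles.append({})
            continue
        max_count = max(counts.values())
        profiles.append({
            term: (count / max_count) * math.log2(n_docs / df[term])
            for term, count in counts.items()
        })
    return profiles

# Toy item descriptions (hypothetical data).
docs = [
    "space war hero space".split(),
    "space romance comedy".split(),
    "war drama hero".split(),
]
profiles = tf_idf(docs)
```

Terms that occur in only one document (such as `romance`) receive the largest weights, while a term present in every document would get weight 0. An item profile can then keep only the top-weighted terms.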
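## User profiles
For each user, build a profile $x_{u}$ from the profiles of the items they rated
Simple: rating-weighted sum of the profiles of (positively) rated items; dividing by $\sum_{i \in \mathcal{F}_{u}} r_{u i}$ turns it into a weighted average
$$
x_{u}=\sum_{i \in \mathcal{F}_{u}} r_{u i} x_{i}
$$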
Variant: normalize weights using average rating of user
More sophisticated aggregations possible
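
The aggregation above can be sketched in a few lines of Python. This is only an illustration: the helper `user_profile`, the two-feature item vectors and the ratings are hypothetical, and the `center=True` option reflects one possible reading of the "normalize by average rating" variant (subtracting the user's mean rating from each weight).

```python
def user_profile(ratings, item_profiles, center=False):
    """Aggregate item profiles into x_u = sum_i w_ui * x_i.

    ratings:       {item_id: rating} for the items in F_u
    item_profiles: {item_id: list of floats}, all of the same length
    center:        if True, use w_ui = r_ui - mean rating of the user
                   (assumed interpretation of the variant)
    """
    mean = sum(ratings.values()) / len(ratings) if center else 0.0
    dim = len(next(iter(item_profiles.values())))
    profile = [0.0] * dim
    for item, rating in ratings.items():
        weight = rating - mean
        for k, value in enumerate(item_profiles[item]):
            profile[k] += weight * value
    return profile

# Hypothetical features: [action, romance]
item_profiles = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
ratings = {"A": 5, "B": 1}

plain = user_profile(ratings, item_profiles)                  # [5.0, 1.0]
centered = user_profile(ratings, item_profiles, center=True)  # [2.0, -2.0]
```

With mean-centering, a low rating pushes the profile away from the features of the disliked item instead of weakly towards them.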
Can also build classifiers/regressors to predict if a user likes an item.
Suggest items whose feature vector $x_{i}$ is most similar to the profile vector $x_{u}$
Possible similarity measures: cosine similarity $\frac{x_{u} \cdot x_{i}}{\|x_{u}\| \, \|x_{i}\|}$, Minimum Description Length
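
A minimal sketch of the matching step, assuming cosine similarity and dense feature vectors stored as Python lists. The names `cosine` and `recommend`, the item vectors and the user vector are illustrative assumptions, not part of any particular library.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend(user_vec, item_profiles, seen, k=2):
    """Return the k unseen items whose profiles are most similar to user_vec."""
    scores = {
        item: cosine(user_vec, vec)
        for item, vec in item_profiles.items()
        if item not in seen
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical features: [action, romance]
item_profiles = {
    "A": [1.0, 0.0],
    "B": [0.0, 1.0],
    "C": [1.0, 1.0],
    "D": [1.0, 0.2],
}
user_vec = [5.0, 1.0]  # e.g. a rating-weighted sum of the profiles of A and B
top = recommend(user_vec, item_profiles, seen={"A", "B"})
```

Here `D` ranks above `C` because its profile points in the same direction as the user vector. Since cosine similarity ignores vector length, it makes no difference whether $x_{u}$ is a weighted sum or a weighted average.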
## Advantages
- User independence - does not need information from other users (Collaborative Filtering requires this)
- Can handle unique tastes of users
- Unpopular items are also recommended
- Transparency - Explanations are straightforward
- Cold start (for items) is a non-issue, as items are compared based on their content, not on ratings
## Drawbacks
- Feature Engineering / Domain knowledge is often needed
- Often, content is not the only factor for users interacting with items
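- Overspecialisation - never recommends items outside the user's existing profile
- No serendipitous recommendations (unexpected items)
- Cold start (for users) - a new user has no ratings, so there is no profile $x_{u}$ to match against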
---
## References