Recommender Systems

used in LINE Timeline

shared posts
like the Instagram feed

two systems

collaborative filtering
content filtering

collaborative filtering

to find users with similar tastes

a model

requirements

given a user and a post in a context
return a recommended post

ratings are considered explicit feedback

user likes and clicks are considered implicit feedback

from the user's view history log, check whether the user clicked the post after viewing it. if the user clicked, it is positive feedback. if the user viewed the post but didn't click, it is negative feedback.
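a minimal sketch of that labeling rule. the `ViewEvent` record and its field names are assumptions made for illustration, not the actual log schema; treating one click on any view as enough to make the pair positive is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewEvent:
    # Hypothetical log record: one row per time a post was shown to a user.
    user_id: str
    post_id: str
    clicked: bool  # True if the user clicked the post after viewing it


def label_feedback(events):
    """Map each (user, post) pair to 1 (positive) or 0 (negative)."""
    labels = {}
    for e in events:
        key = (e.user_id, e.post_id)
        # Assumption: a click on any view of the post makes the pair positive.
        labels[key] = max(labels.get(key, 0), int(e.clicked))
    return labels


log = [
    ViewEvent("u1", "p1", clicked=True),
    ViewEvent("u1", "p2", clicked=False),
    ViewEvent("u2", "p1", clicked=False),
    ViewEvent("u2", "p1", clicked=True),
]
labels = label_feedback(log)
```

posts the user never viewed get no label at all, which keeps "not shown" separate from "shown but ignored".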

content filtering

...

embeddings

also called feature matrices (columns and rows).

you can theoretically take feature matrices to train the model, but too many features make it really time consuming to train.

therefore you make embeddings. embeddings group things semantically so that they improve training performance.

overly sparse features give really useless embeddings, because there is too little information to capture (0.0001% of all data has information)

user embeddings vs post embeddings (if you train both separately, they are too sparse and therefore do not capture enough meaning in the embedding)

the solution used was to embed users as an embedding of user post interactions (user = function of that user's post interactions)

this embedding model worked much better
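a small sketch of "user = function of that user's post interactions". the notes do not say which function was used; a plain average of post embeddings is assumed here, and `user_embedding` and the toy vectors are made up for illustration.

```python
import numpy as np


def user_embedding(post_embeddings, interacted_post_ids):
    """Build a user vector from the embeddings of posts the user interacted with.

    Assumption: the combining function is a plain mean.
    """
    vectors = [post_embeddings[p] for p in interacted_post_ids if p in post_embeddings]
    if not vectors:
        return None  # no usable history, so no embedding for this user
    return np.mean(vectors, axis=0)


# Toy 2-dimensional post embeddings (made-up values).
post_embeddings = {
    "p1": np.array([1.0, 0.0]),
    "p2": np.array([0.0, 1.0]),
    "p3": np.array([1.0, 1.0]),
}
u = user_embedding(post_embeddings, ["p1", "p3"])
```

only the post embeddings have to be learned, so the user side adds no extra sparse parameters.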

feedback loop problem

train model from raw user interaction
give recommendations to user
collect new user interaction
user interactions that are caused by model recommendations will bias the next model's recommendations toward the ones the previous model has given

previous model bias

when a model is trained and used for user recommendations

solution

exclude all interactions with posts that were recommended by the previous model
exclude interactions that come from user subscribed posts (users already follow them)

therefore you only train the new model on raw user interactions that the previous model did not cause
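the two exclusion rules can be sketched as a filter over the interaction log. the `Interaction` record, the `recommended` set, and the `subscriptions` mapping are hypothetical names; modeling "subscribed posts" as posts whose author the user follows is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    # Hypothetical record of one user/post interaction.
    user_id: str
    post_id: str
    author_id: str


def filter_training_interactions(interactions, recommended, subscriptions):
    """Keep only interactions the previous model did not cause.

    recommended:   set of (user_id, post_id) pairs served by the previous model
    subscriptions: dict mapping user_id -> set of followed author ids
    """
    kept = []
    for it in interactions:
        if (it.user_id, it.post_id) in recommended:
            continue  # rule 1: came from a previous recommendation
        if it.author_id in subscriptions.get(it.user_id, set()):
            continue  # rule 2: the user already follows this author
        kept.append(it)
    return kept


interactions = [
    Interaction("u1", "p1", "a1"),  # recommended by the previous model
    Interaction("u1", "p2", "a2"),  # author already followed
    Interaction("u1", "p3", "a3"),  # organic interaction
]
recommended = {("u1", "p1")}
subscriptions = {"u1": {"a2"}}
training = filter_training_interactions(interactions, recommended, subscriptions)
```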

problems with a sole ranking model

computationally expensive
concentrated post distribution

recommendations are too generalized (everybody gets the same recommendation)

solution

candidate generation model
- two phases
- phase 1 - co-occurrence matrix to get post embeddings
- phase 2 - candidate generation training phases
- interaction history -> linear combination of history -> user vector -> nearest neighbor search -> candidates
ranking model
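a rough sketch of the candidate generation pipeline above, under several assumptions not stated in the notes: co-occurrence is counted per user history (with per-post counts on the diagonal), post embeddings come from a truncated SVD of that matrix, the linear combination uses equal weights by default, and nearest neighbor search is exact cosine similarity rather than an approximate index. all function names are made up.

```python
import numpy as np


def cooccurrence_matrix(histories, n_posts):
    """Entry (i, j) counts users whose history contains both post i and post j."""
    x = np.zeros((len(histories), n_posts))
    for row, history in enumerate(histories):
        x[row, list(set(history))] = 1.0
    return x.T @ x  # the diagonal holds per-post interaction counts


def post_embeddings_from_cooccurrence(c, dim):
    """Phase 1 (assumed method): truncated SVD of the co-occurrence matrix."""
    u, s, _ = np.linalg.svd(c)
    return u[:, :dim] * np.sqrt(s[:dim])


def user_vector(post_emb, history, weights=None):
    """Phase 2: linear combination of the embeddings in the user's history."""
    if weights is None:
        weights = np.full(len(history), 1.0 / len(history))
    return np.asarray(weights) @ post_emb[history]


def nearest_candidates(post_emb, query, k, exclude=()):
    """Exact cosine nearest-neighbor search, skipping already-seen posts."""
    norms = np.linalg.norm(post_emb, axis=1) * np.linalg.norm(query)
    scores = (post_emb @ query) / np.maximum(norms, 1e-12)
    for i in exclude:
        scores[i] = -np.inf
    return np.argsort(-scores)[:k].tolist()


# Toy data: posts 0-2 are seen together, posts 3-4 are seen together.
histories = [[0, 1], [0, 1, 2], [3, 4], [3, 4]]
c = cooccurrence_matrix(histories, n_posts=5)
emb = post_embeddings_from_cooccurrence(c, dim=2)
query = user_vector(emb, [0])
candidates = nearest_candidates(emb, query, k=2, exclude=[0])
```

the ranking model then only has to score the handful of returned candidates instead of the whole post pool, which addresses the cost problem, and per-user query vectors give per-user candidates.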

problems with the increasing size of the post pool

more and more posts over time (problems with every social network)

aligning embeddings trained in batches

model 0,1,2
model 0: trained on day 1-7
model 1: trained on day 2-8
model 2: trained on day 3-9

it would be nice to have a post pool of days 1-9

they want to align the multiple embedding models

this is called an "Orthogonal Procrustes Problem"

mapping one matrix to another matrix

get a pair of corresponding points
map matrix a to matrix b

posts have an embedding over time that overlap:

model 1-7 overlaps with model 2-8 on days 2-7
assume the angles in embeddings in matrix A and matrix B are similar
find the transformation matrix from some corresponding points between matrix a and b
apply the transformation to the entire matrix b to map it into the space of matrix a, giving matrix a^
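a minimal sketch of that alignment using the standard SVD solution to the orthogonal Procrustes problem: for shared rows, the orthogonal R that minimizes ||B R - A|| (Frobenius norm) is R = U V^T, where U S V^T is the SVD of B^T A. the toy data, the choice of which rows count as "shared", and the function name are assumptions for illustration.

```python
import numpy as np


def procrustes_rotation(b_shared, a_shared):
    """Orthogonal R minimizing ||b_shared @ R - a_shared||_F."""
    u, _, vt = np.linalg.svd(b_shared.T @ a_shared)
    return u @ vt


rng = np.random.default_rng(0)

# Ground truth for the demo: 6 posts embedded in model a's space (3 dims).
a_full = rng.normal(size=(6, 3))

# Model b sees the same geometry, but rotated by an unknown orthogonal matrix q.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
b = a_full @ q.T  # so that b @ q == a_full

# Assumption: posts 0-3 exist in both models (the overlapping days),
# while posts 4-5 exist only in model b.
shared = [0, 1, 2, 3]
r = procrustes_rotation(b[shared], a_full[shared])

# Apply the transformation to all of b to express it in a's space.
a_hat = b @ r
```

in this noise-free toy case `a_hat` matches `a_full`, including the rows that were not used to fit the transformation. with real embeddings trained on different windows the match is only approximate, which is why the notes assume the angles in a and b are similar.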