requirements
ratings are considered explicit feedback
user likes and clicks are considered implicit feedback
from the user view-history log, check whether the user clicked a post after viewing it. a click counts as positive feedback; a view without a click counts as negative feedback.
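a minimal sketch of deriving implicit-feedback labels from a view log; the record format (user, post, clicked) is a hypothetical assumption, not the actual log schema:

```python
# hypothetical view-history records: (user_id, post_id, clicked)
view_log = [
    ("u1", "p1", True),   # viewed and clicked -> positive
    ("u1", "p2", False),  # viewed, no click   -> negative
    ("u2", "p1", False),
]

# a click after a view -> label 1; a view with no click -> label 0
labels = [(user, post, 1 if clicked else 0) for user, post, clicked in view_log]
```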
...
these raw inputs are also called feature matrices (organized as rows and columns).
you could in theory train the model directly on the feature matrices, but with too many features training becomes very time consuming.
therefore you create embeddings. embeddings group things semantically so that they improve training performance.
overly sparse features give useless embeddings, because there is too little information to capture (e.g. only 0.0001% of the data carries any signal)
user embeddings vs post embeddings (if you train both separately, each is too sparse and does not capture enough meaning in the embedding)
the solution used was to embed users as a function of their post interactions (user = function of that user's post interactions)
this embedding model worked much better
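a minimal sketch of the "user = function of that user's post interactions" idea, assuming (hypothetically) that the aggregation is a simple mean of post embeddings; the actual model may use a learned aggregation:

```python
import numpy as np

# hypothetical post embeddings (post_id -> vector) from a trained post model
post_emb = {
    "p1": np.array([1.0, 0.0]),
    "p2": np.array([0.0, 1.0]),
}

def user_embedding(interacted_posts, post_emb):
    """represent a user as the mean of the embeddings of posts they interacted with"""
    vecs = [post_emb[p] for p in interacted_posts]
    return np.mean(vecs, axis=0)

u = user_embedding(["p1", "p2"], post_emb)  # -> array([0.5, 0.5])
```

because the user vector lives in the same space as the post vectors, it stays dense even when per-user interaction data is sparse.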
once a model is trained and used for user recommendations, retraining on everything is wasteful;
therefore you only train the new model on raw user data that the old model did not expect (interactions it mispredicted)
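a sketch of filtering for "unexpected" training data, under the assumption that "not expected" means the current model's prediction disagreed with the observed label; the sample format and `predict` function are hypothetical:

```python
# keep only interactions the current model mispredicted ("unexpected" data)
def unexpected_samples(samples, predict):
    """samples: list of (features, label); predict: current model's prediction fn"""
    return [(x, y) for x, y in samples if predict(x) != y]

# toy example with a trivial model that always predicts 1
samples = [("a", 1), ("b", 0), ("c", 1)]
fresh = unexpected_samples(samples, lambda x: 1)  # only ("b", 0) surprised the model
```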
recommendations are too generalized (everybody gets the same recommendation)
more and more posts accumulate over time (a problem for every social network)
aligning embeddings trained in batches
it would be nice to have a post pool of days 1-9
they want to align the multiple embedding models
this is called an "Orthogonal Procrustes Problem"
posts present in both batches have embeddings in each space; these overlapping posts anchor the alignment:
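a minimal sketch of solving the orthogonal Procrustes problem with the standard SVD solution: given embeddings A and B of the same overlapping posts from two batch-trained models, find the orthogonal matrix R minimizing ||A @ R - B||; the toy matrices here are made up for illustration:

```python
import numpy as np

def procrustes_align(A, B):
    """orthogonal Procrustes: find orthogonal R minimizing ||A @ R - B||_F.
    A, B: (n_posts, dim) embeddings of the SAME posts from two batches."""
    U, _, Vt = np.linalg.svd(A.T @ B)  # SVD of the cross-covariance
    return U @ Vt                      # closest orthogonal map from A-space to B-space

# toy example: B is A rotated by 90 degrees; the solver recovers the rotation
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
R_true = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = A @ R_true
R = procrustes_align(A, B)  # R recovers R_true up to numerical precision
```

once R is found from the overlapping posts, it can be applied to every embedding of the newer batch so old and new vectors live in one shared space.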