TransRecG: Transformer-Based Recommendation System using Graph Embeddings
How can we use temporal information in a recommendation system?
Motivation
As a project for CSE 6240: Web Search & Text Mining at Georgia Tech, we aimed to improve performance on the sequential recommendation task by using temporal information and higher-order connectivity features such as user-user and item-item relations.
We compare our model against two baselines: Behavior Sequence Transformer (BST) and Neural Graph Collaborative Filtering (NGCF).
- Limitation of BST: it does not capture higher-order user-user or item-item relations.
- Limitation of NGCF: it does not consider the user's temporal behavior, nor does it learn the underlying user-user or item-item relationships.
To overcome these limitations, we built a recommendation system that combines BST and NGCF on a User-Movie Knowledge Graph.
Method
How does our new model work?
- BST captures the sequential information to understand the temporal behavior of the user.
- The NGCF model on the Movie-User KG captures user-user, movie-movie, and user-movie relations.
- Augmenting the Transformer with the RGCN embeddings from the KG lets the model leverage both temporal and spatial (graph-structure) information to make better, more personalized recommendations.
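The combination described above can be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the project's implementation: the helper names (`build_sequence`, `propagate`, `augment`), the toy ratings, the 2-dimensional embeddings, and the single mean-aggregation step standing in for an NGCF/RGCN layer are all assumptions made for the example.

```python
from collections import defaultdict

# Toy interaction log: (user, movie, rating, timestamp). Hypothetical data.
ratings = [
    ("u1", "m1", 5.0, 10), ("u1", "m2", 3.0, 30), ("u1", "m3", 4.0, 20),
    ("u2", "m2", 4.0, 15), ("u2", "m3", 2.0, 25),
]

def build_sequence(log, user):
    """Order a user's movies by timestamp (the temporal signal a BST-style model consumes)."""
    events = sorted((t, m) for u, m, _, t in log if u == user)
    return [m for _, m in events]

def propagate(log, emb):
    """One mean-aggregation step over the user-movie graph.

    A simplified stand-in for a graph convolution layer: each node's new
    embedding is the average of its own and its neighbours' embeddings.
    """
    neighbours = defaultdict(set)
    for u, m, _, _ in log:
        neighbours[u].add(m)
        neighbours[m].add(u)
    out = {}
    for node, vec in emb.items():
        group = [vec] + [emb[n] for n in sorted(neighbours[node])]
        out[node] = [sum(col) / len(group) for col in zip(*group)]
    return out

def augment(sequence, item_emb, graph_emb):
    """Concatenate each item's sequence embedding with its graph embedding.

    `+` on two Python lists concatenates them, so each token doubles in width.
    """
    return [item_emb[m] + graph_emb[m] for m in sequence]

# Hypothetical initial embeddings (2-d for readability).
init = {
    "u1": [1.0, 0.0], "u2": [0.0, 1.0],
    "m1": [1.0, 1.0], "m2": [0.0, 0.0], "m3": [2.0, 2.0],
}
item_emb = {m: v for m, v in init.items() if m.startswith("m")}

one_hop = propagate(ratings, init)
graph_emb = propagate(ratings, one_hop)  # two hops: user -> movie -> user
seq = build_sequence(ratings, "u1")
tokens = augment(seq, item_emb, graph_emb)
```

After two propagation steps, the embedding of `u1` already mixes in information from `u2` through the movies they share, which is the kind of higher-order relation a sequence-only model does not see. In a full model, `tokens` would presumably be combined with positional information and passed to a Transformer encoder; that part is omitted here.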
Besides evaluating the models on the rating prediction task, we also explored their ability to handle the user cold-start problem and analyzed the tradeoff between the number of training samples and the loss.
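One common way to set up these two analyses is sketched below. The exact protocol is not described here, so the cold-start threshold `max_history`, the helper names, and the toy logs are assumptions chosen for illustration only.

```python
from collections import Counter

def history_counts(train_log):
    """Number of training interactions per user."""
    return Counter(u for u, _, _ in train_log)

def split_cold_warm(test_log, counts, max_history=2):
    """Partition test interactions by how much training history the user has.

    Users with at most `max_history` training interactions are treated as
    cold-start (an assumed definition). Counter returns 0 for unseen users.
    """
    cold = [r for r in test_log if counts[r[0]] <= max_history]
    warm = [r for r in test_log if counts[r[0]] > max_history]
    return cold, warm

def mse(pairs):
    """Mean squared error over (predicted, actual) rating pairs."""
    return sum((p - a) ** 2 for p, a in pairs) / len(pairs)

def training_subsets(train_log, fractions=(0.25, 0.5, 1.0)):
    """Growing prefixes of the training log, for a samples-vs-loss curve."""
    return [train_log[: max(1, int(len(train_log) * f))] for f in fractions]

# Hypothetical (user, movie, rating) logs.
train = [("u1", "m1", 5.0), ("u1", "m2", 3.0), ("u1", "m3", 4.0), ("u2", "m2", 4.0)]
test = [("u1", "m4", 4.0), ("u2", "m1", 5.0), ("u3", "m1", 3.0)]

cold, warm = split_cold_warm(test, history_counts(train))
subset_sizes = [len(s) for s in training_subsets(train)]
```

Reporting the loss separately on `cold` and `warm`, and retraining on each entry of `training_subsets(train)`, would give the two comparisons mentioned above.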
Findings
Findings from the experiments:
1. Augmenting BST models with graph embeddings improves model performance. Graph embeddings from RGCN + KG perform better than those from GCN + BP models, as their initial embeddings (GloVe) are more meaningful.
2. For the cold-start problem, our model with the Knowledge Graph performed best.
3. As the number of training samples increased, the performance of BST models augmented with graph embeddings remained almost unchanged.
More experimental results can be found in the report attached below.
Future Work
We want to experiment with different sequence lengths for the BST model. Also, we currently implement the Knowledge Graph (KG) as a graph that is heterogeneous only in its edges. Building the KG as a fully heterogeneous graph, with different types of both nodes and edges, is relevant future work because user-movie-attribute graphs are naturally composed of different types of entities.
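To make the distinction concrete, a fully heterogeneous graph could be represented along the following lines. This is only a sketch: the node types, the relation names (`rated`, `has_genre`), and the helper `edges_by_signature` are hypothetical and do not describe the project's actual schema.

```python
from collections import defaultdict

# Every node carries a type; edges are (source, relation, target) triples.
node_type = {"u1": "user", "m1": "movie", "m2": "movie", "g_drama": "genre"}
edges = [
    ("u1", "rated", "m1"),
    ("m1", "has_genre", "g_drama"),
    ("m2", "has_genre", "g_drama"),
]

def edges_by_signature(node_type, edges):
    """Group edges by (source type, relation, target type)."""
    groups = defaultdict(list)
    for src, rel, dst in edges:
        groups[(node_type[src], rel, node_type[dst])].append((src, dst))
    return dict(groups)

signatures = edges_by_signature(node_type, edges)
```

With edge-only heterogeneity, a model distinguishes edges by relation alone. Keeping the node types as well lets each signature, such as `("movie", "has_genre", "genre")`, be handled separately.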
More Information
For more detailed information, please see the following:
Report