Personalized Recommendation Systems using Two Tower Neural Nets

Vinay Bhupalam Muralidhar
Jan 30, 2022

Recommendation engines suggest products and services to users based on their preferences and past behavior. Most of us encounter recommendations in everyday applications: Netflix recommends movies, YouTube recommends videos, and Amazon suggests products we are likely to buy. These recommendations personalize the user experience, and on eCommerce platforms they have become an effective means of selling more products.

There is a plethora of algorithms for building a recommendation engine, ranging from classical ML techniques to deep learning, reinforcement learning, and graph neural networks. This article covers the following topics.

  • A basic two-tower neural network architecture.
  • How the eCommerce platform eBay uses a two-tower neural network.

Two Tower Neural Network

Users and items are represented as N-dimensional embedding vectors. The model learns these vectors so that the similarity score between a user representation and an item representation is higher for items the user has interacted with. The name comes from the architecture's two towers: one learns the encodings of users and the other learns the encodings of items. The two-tower neural network is a collaborative filtering approach. Collaborative filtering algorithms make recommendations by considering both user and item similarities. For example, if User A has liked Product A, and User B is found to exhibit patterns similar to User A's, then Product A is recommended to User B. Deep neural networks learn the vector representations of both users and products from past interactions.
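To make the similarity score concrete, here is a minimal sketch that ranks items for a user by the dot product of their embeddings. The user names, item names, and 3-dimensional vectors are made up for illustration; in a real system the vectors would be produced by the two towers.

```python
# Hypothetical, hand-written embeddings (a trained model would produce these).
user_embeddings = {
    "user_a": [0.9, 0.1, 0.0],
    "user_b": [0.8, 0.2, 0.1],
}
item_embeddings = {
    "action_movie": [1.0, 0.0, 0.0],
    "romance_movie": [0.0, 1.0, 0.0],
    "documentary": [0.0, 0.0, 1.0],
}


def dot(u, v):
    """Similarity score: dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_items(user_id):
    """Return item names sorted from highest to lowest score for this user."""
    user_vec = user_embeddings[user_id]
    scores = {name: dot(user_vec, vec) for name, vec in item_embeddings.items()}
    return sorted(scores, key=scores.get, reverse=True)


print(rank_items("user_a"))
```

Because user_a and user_b have nearby vectors, they end up with the same item ranking, which is the collaborative filtering intuition described above.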

User Item Interactions of a movie recommendation system

User and item metadata can be incorporated into the two-tower neural network. Take a movie recommendation engine as an example. The user metadata could include the following.

  • The current context of the user (date, time, etc.).
  • The history of items watched by the user and their timestamps (day, month, time, etc.).
  • The languages preferred by the user.

For the movies, the metadata could be the following.

  • The title and description of the movie.
  • Other metadata such as the language and publisher.

Two Tower Neural Network (Reference: LinkedIn post titled “Personalized recommendations - IV (two tower models for retrieval)”)

First, embeddings of the metadata features are computed (these embeddings can be made learnable). The user and item embeddings are then computed by passing the metadata information through the two towers. The network is optimized so that the dot product of the user embedding and the item embedding is higher for items the user purchased and lower for items the user did not purchase.
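The training objective can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not the setup from the referenced post: each tower just sums learnable feature embeddings (no hidden layers), the loss is binary cross-entropy on the sigmoid of the dot product, and the feature names, labels, embedding size, and learning rate are all assumptions made for the example.

```python
import math
import random

random.seed(0)
DIM = 4  # embedding size, chosen arbitrarily for this sketch


def new_embedding():
    return [random.uniform(-0.1, 0.1) for _ in range(DIM)]


# Hypothetical metadata vocabularies with learnable embeddings.
user_feature_emb = {f: new_embedding() for f in
                    ["lang=en", "lang=fr", "time=evening", "time=morning"]}
item_feature_emb = {f: new_embedding() for f in
                    ["movie_lang=en", "movie_lang=fr", "publisher=x", "publisher=y"]}


def tower(features, table):
    """Simplified tower: sum of the embeddings of the given features."""
    vec = [0.0] * DIM
    for f in features:
        for d in range(DIM):
            vec[d] += table[f][d]
    return vec


def score(user_features, item_features):
    """Dot product of the user-tower and item-tower outputs."""
    u = tower(user_features, user_feature_emb)
    v = tower(item_features, item_feature_emb)
    return sum(a * b for a, b in zip(u, v))


def sgd_step(user_features, item_features, label, lr=0.1):
    """One gradient step on binary cross-entropy of sigmoid(score)."""
    u = tower(user_features, user_feature_emb)
    v = tower(item_features, item_feature_emb)
    s = sum(a * b for a, b in zip(u, v))
    grad = 1.0 / (1.0 + math.exp(-s)) - label  # d(loss)/d(score)
    for f in user_features:
        for d in range(DIM):
            user_feature_emb[f][d] -= lr * grad * v[d]
    for f in item_features:
        for d in range(DIM):
            item_feature_emb[f][d] -= lr * grad * u[d]


# Toy interactions: (user features, item features, 1 = purchased, 0 = not).
interactions = [
    (["lang=en", "time=evening"], ["movie_lang=en", "publisher=x"], 1),
    (["lang=en", "time=evening"], ["movie_lang=fr", "publisher=y"], 0),
    (["lang=fr", "time=morning"], ["movie_lang=fr", "publisher=y"], 1),
    (["lang=fr", "time=morning"], ["movie_lang=en", "publisher=x"], 0),
]

for _ in range(200):
    for user_f, item_f, y in interactions:
        sgd_step(user_f, item_f, y)
```

After training, calling score on a purchased pair should give a higher value than on a non-purchased pair for the same user. A production model would typically add MLP layers inside each tower and use an automatic differentiation framework instead of hand-written gradients.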

A recommendation engine faces various challenges in practice. One is the cold start problem: how to provide recommendations when new users and new products are added. Another is scale: how to provide recommendations from a catalog of billions of products. The machine learning community has come up with various techniques to tackle these problems.

Recommendation Engine used at eBay

Two Tower Architecture used at eBay (Reference: Research paper titled “Personalized embedding based e-commerce Recommendations at eBay”)

Item Embedding:

In the eBay marketplace, an item corresponds to a listing of something for sale from a seller. The metadata used for items is as follows.

  • Title: Name of the item.
  • Aspect: Description of the item.
  • Category: The category of the item.

Title and aspect information is written in natural language. The raw text is tokenized, and each token is mapped to an embedding that is initialized with random values and then learned during training. The embedding vectors for categories are learned in a similar way. All the item feature representations are concatenated and passed through an MLP layer to generate the item representation.
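The item tower described above can be sketched as a forward pass in plain Python. This is an illustration under stated assumptions rather than eBay's implementation: whitespace tokenization, mean pooling of token embeddings, a single dense layer with tanh standing in for the MLP, and the dimensions are all choices made for this example, and the weights are random rather than trained.

```python
import math
import random

random.seed(0)
EMB_DIM = 4  # size of each feature embedding (arbitrary)
OUT_DIM = 3  # size of the final item representation (arbitrary)

# One embedding table per feature; entries start random and would be
# updated by training in a real model.
embedding_tables = {"title": {}, "aspect": {}, "category": {}}


def lookup(table_name, key):
    """Return the embedding for key, creating a random one on first use."""
    table = embedding_tables[table_name]
    if key not in table:
        table[key] = [random.uniform(-0.1, 0.1) for _ in range(EMB_DIM)]
    return table[key]


def encode_text(table_name, text):
    """Tokenize on whitespace and average the token embeddings (assumed pooling)."""
    tokens = text.lower().split()
    if not tokens:
        return [0.0] * EMB_DIM
    vectors = [lookup(table_name, t) for t in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Single dense layer standing in for the MLP: OUT_DIM x (3 * EMB_DIM) weights.
W = [[random.uniform(-0.5, 0.5) for _ in range(3 * EMB_DIM)] for _ in range(OUT_DIM)]
b = [0.0] * OUT_DIM


def item_tower(title, aspect, category):
    """Concatenate the three feature representations and apply the dense layer."""
    features = (encode_text("title", title)
                + encode_text("aspect", aspect)
                + lookup("category", category))
    return [math.tanh(sum(w * x for w, x in zip(row, features)) + bias)
            for row, bias in zip(W, b)]


representation = item_tower("Apple iPhone 12 64GB", "color black", "Cell Phones")
print(representation)
```

Creating embeddings on first lookup keeps the sketch short; a trained model would instead use a fixed vocabulary built from the training data.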

User Embedding:

A user’s activity on an e-commerce marketplace is not limited to viewing items. A user may also perform actions such as making a search query, adding an item to their shopping cart, or adding an item to their watch list. These actions provide valuable signals for generating personalized recommendations. A GRU (gated recurrent unit), a type of recurrent neural network, is used to encode the ordering information of historical events.
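To show why a GRU captures ordering, here is a minimal GRU cell in plain Python that folds a sequence of events into one user vector. It is a sketch with assumptions: events are reduced to four hypothetical event types, weights are random instead of trained, biases are omitted, and the update rule follows one common GRU convention, h_new = (1 - z) * h + z * candidate.

```python
import math
import random

random.seed(0)
EMB_DIM, HID_DIM = 4, 3  # arbitrary sizes for this sketch


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical event types, each with its own embedding.
event_emb = {e: [random.uniform(-1.0, 1.0) for _ in range(EMB_DIM)]
             for e in ["view", "search", "add_to_cart", "watch"]}

# Per gate: (input weights, recurrent weights). Biases omitted for brevity.
params = {gate: (rand_matrix(HID_DIM, EMB_DIM), rand_matrix(HID_DIM, HID_DIM))
          for gate in ("update", "reset", "candidate")}


def gru_step(x, h):
    """One GRU step: combine event embedding x with the previous state h."""
    Wz, Uz = params["update"]
    Wr, Ur = params["reset"]
    Wc, Uc = params["candidate"]
    z = [sigmoid(a + b) for a, b in zip(matvec(Wz, x), matvec(Uz, h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(Wr, x), matvec(Ur, h))]
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(Wc, x), matvec(Uc, reset_h))]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def encode_user(events):
    """Run the GRU over the event sequence; the final state is the user vector."""
    h = [0.0] * HID_DIM
    for event in events:
        h = gru_step(event_emb[event], h)
    return h


print(encode_user(["search", "view", "add_to_cart"]))
print(encode_user(["add_to_cart", "view", "search"]))
```

The two calls use the same events in a different order and produce different vectors, which is the ordering information that a simple sum or average of event embeddings would lose.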

For further information on model training, experiments, and deployment setup, please refer to the research paper from eBay.

References

Personalized Embedding-based e-Commerce Recommendations at eBay (arxiv.org)

https://www.linkedin.com/pulse/personalized-recommendations-iv-two-tower-models-gaurav-chakravorty/
