How Recommendation Systems Tackle the Cold Start Problem

Gabriella Vas
6 min read | October 18, 2021

Data fuels personalization engines, otherwise known as recommendation systems. In order to match the right users with the right items, the system needs information about both. What if one or the other type of data is not sufficient, or missing altogether?

In this case, the personalization engine encounters a cold start. It takes a bit of time or some smart moves to “warm it up” before it produces relevant recommendations at the expected rate.

Read on to find out what causes the cold start problem; why it poses an obstacle for conventional recommendation algorithms; and how it can be overcome using deep learning and other advanced personalization features. 

The Cold Start Problem Explained

As an e-retailer, you might wonder: “About half of our web store’s visitors are unknown. First-time visitors, users who block cookies, users who switch between devices without logging in – we don’t know the first thing about them. How could we serve them useful recommendations?”

If you have a large product portfolio, your concern might be, “Big brands, popular products are easy to sell, but what about the rest, the greater part of our inventory? How can we leverage personalization to get that stock rolling?” And also: “Unless we promote them, new items in the store are often slow to ‘take off’. Why is that?”

These concerns all point to the cold start problem.

Just like a car’s engine that has difficulty revving up in extremely cold weather, a recommendation system sputters when conditions are harsh. In the context of machine learning, this simply means not enough data.  

User Cold Start

When the system encounters new visitors to a website, with no browsing history or known preferences, creating a personalized experience for them becomes a challenge because the data normally used for generating recommendations is missing. This is the case of user or visitor cold start. 

It’s not just first-time users of a website that “confuse” recommendation systems. User cold start can occur even with known, returning visitors, if their behavior and preferences change from one session to the next. Classified sites and video sharing platforms typically face this problem. An example: For a while, a user might be searching for and comparing inflatable kayaks, but once he has bought the item he wanted, he’ll move on to something completely unrelated, like second-hand ukuleles. Because of his session-based behavior, his browsing history won’t provide useful clues for guessing his next choice.   

In a broader sense, a certain degree of user cold start will always persist, as long as online consumers keep exploring new topics and trends; as long as their lifestyle, their circumstances and their needs keep evolving.

Item Cold Start

When a new product is added to a webstore, or when a freshly minted piece of content is uploaded to a media platform, at first, nobody knows about it. With zero interactions or ratings, it is practically invisible to the recommendation system, no matter how relevant it would be to some – or many – users. This is called product or item cold start.

Among those hardest hit by this phenomenon are classified sites and news platforms. Here, fresh items are typically the hottest but their value deteriorates quickly: yesterday’s breaking news is old hat today, and the vintage bike put up for sale last week has already been sold. 

But that’s not all. Item cold start also plagues marketplaces, where the same product offered by various sellers appears under different product IDs. Because the personalization engine perceives these to be distinct items, one product’s ratings and user interactions won’t inform the way its clones are recommended.

On e-retail sites, item cold start affects so-called long tail products – the masses of “low-hype” goods of which only a few are sold each month, but which, through their sheer aggregated volume, still generate significant traffic. Because they’re not in demand, it takes a long time for these products to accumulate enough user interactions to appear on the recommendation system’s radar. This is why they are cold start items.  

Why Collaborative Filtering Fails at a Cold Start

Recommendation systems rely heavily on a category of algorithms called collaborative filtering. It’s a great tool for making cross-sell or upsell recommendations in the midst of a purchasing process. But at a cold start, collaborative filtering is of no use. Here’s why. 

According to the basic assumption of the collaborative filtering concept, like-minded people will like the same product. Therefore, collaborative filtering algorithms recommend each item (a product or a piece of content) based on user actions like views, ratings, or purchases. The more user actions an item has, the easier it is to tell who else would be interested in it, and what other items are similar to it. If an item hasn’t collected enough meaningful user actions to pave the way for accurate recommendations, the system won’t know when to display it.
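To make this concrete, here is a minimal sketch of item-to-item collaborative filtering. The interaction log, the catalog, and the function names are made up for illustration, and cosine similarity over binary interactions is just one common choice, not necessarily what any given engine uses. Note what happens to the item nobody has interacted with yet.

```python
from math import sqrt

# Hypothetical interaction log: user -> set of items they viewed or bought.
interactions = {
    "u1": {"kayak", "paddle", "life_vest"},
    "u2": {"kayak", "paddle"},
    "u3": {"ukulele", "tuner"},
}
# "new_dry_bag" was just added to the store and has no interactions yet.
catalog = ["kayak", "paddle", "life_vest", "ukulele", "tuner", "new_dry_bag"]


def item_similarity(a, b, interactions):
    """Cosine similarity of two items over binary user-interaction vectors."""
    users_a = {u for u, items in interactions.items() if a in items}
    users_b = {u for u, items in interactions.items() if b in items}
    if not users_a or not users_b:
        return 0.0  # an item nobody has touched carries no signal
    return len(users_a & users_b) / sqrt(len(users_a) * len(users_b))


def similar_items(item, catalog, interactions):
    """Rank the other catalog items by similarity, dropping zero scores."""
    scored = [
        (other, item_similarity(item, other, interactions))
        for other in catalog
        if other != item
    ]
    return sorted([s for s in scored if s[1] > 0], key=lambda s: -s[1])


print(similar_items("kayak", catalog, interactions))        # paddle ranks first
print(similar_items("new_dry_bag", catalog, interactions))  # empty list
```

The new item gets an empty list, and it never shows up in any other item's list either. That is item cold start in miniature.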

Fortunately, there’s more to personalization than collaborative filtering. To tackle the cold start problem, recommendation systems like Yusp have a few aces up their sleeves.  

Ways to Overcome the Cold Start Problem

When faced with visitor cold start, the personalization engine can fall back on geolocation data and contextual information (date, time, weather; the user’s device, operating system, and browser; what domain they arrive from, etc.) to generate recommendations, as these are available even for unknown visitors. Recommending trending products is another efficient way to “break the ice” with a cold start user.
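As a rough illustration of this fallback, the sketch below picks trending items among recent events that share the visitor's context, and falls back to overall trending items when nothing matches. The event tuples, the field choice (country and device), and the function name are assumptions made for the example.

```python
from collections import Counter

# Hypothetical recent events: (country, device, item_id).
recent_events = [
    ("HU", "mobile", "phone_case"),
    ("HU", "mobile", "phone_case"),
    ("HU", "mobile", "earbuds"),
    ("HU", "desktop", "monitor"),
    ("DE", "mobile", "earbuds"),
    ("DE", "desktop", "monitor"),
    ("DE", "desktop", "monitor"),
]


def trending_for_context(events, country=None, device=None, k=2):
    """Top-k items among events matching the visitor's context.

    Falls back to overall trending items when no event matches.
    """
    matching = [
        item
        for c, d, item in events
        if (country is None or c == country) and (device is None or d == device)
    ]
    if not matching:
        matching = [item for _, _, item in events]
    return [item for item, _ in Counter(matching).most_common(k)]


# An unknown visitor on a mobile device in Hungary.
print(trending_for_context(recent_events, country="HU", device="mobile"))
```

No browsing history is needed here: everything the function uses is available from the very first request.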

Further into the user journey, on product detail pages, item-to-item recommendations like “Frequently bought together” help to overcome the cold start problem. This is because the item-to-item logic relies on data from the interactions of previous, identified users, not the specific visitor the recommendation is being served to. 

On top of this, Yusp can differentiate between casual browsers and shoppers who know what they are looking for. When creating item-to-item recommendations, our system will focus on the actions of the shoppers and ignore the browsers. For example, if someone clicks on everything from phone cases to real estate within a short period of time, Yusp will assume they are only there for browsing and won’t use their click history for recommendations. 
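The article doesn't spell out how this distinction is made, so the following is only an illustrative heuristic, not Yusp's actual logic: a session that touches many unrelated categories within a short time window is treated as casual browsing and left out of the item-to-item data. The thresholds, the session format, and the function names are all assumptions.

```python
def is_casual_browser(session, max_categories=3, window_seconds=600):
    """Guess whether a session looks like aimless browsing.

    session: list of (timestamp_in_seconds, category) clicks.
    Returns True when more than `max_categories` distinct categories
    were clicked within `window_seconds` of the first click.
    """
    if not session:
        return False
    ordered = sorted(session)
    start = ordered[0][0]
    early_categories = {cat for ts, cat in ordered if ts - start <= window_seconds}
    return len(early_categories) > max_categories


def sessions_for_item_to_item(sessions):
    """Keep only the sessions that do not look like casual browsing."""
    return {sid: s for sid, s in sessions.items() if not is_casual_browser(s)}


# Hypothetical sessions keyed by session ID.
sessions = {
    "focused": [(0, "kayaks"), (40, "kayaks"), (95, "paddles")],
    "wandering": [(0, "phone_cases"), (20, "real_estate"),
                  (45, "ukuleles"), (70, "garden_tools")],
}
print(list(sessions_for_item_to_item(sessions)))  # only "focused" remains
```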

In any case, user cold start is only a temporary issue. Because personalization engines like Yusp work in real time, they can evaluate a new visitor’s behavioral information after just the first few clicks, and use it to generate recommendations with increasing accuracy.

To address item cold start, we resort to content-based filtering. This means that recommendations of new products are based on their attributes (product details) for a certain period of time. In this initial phase, user actions are taken into account with less weight.

In order to strike the right balance in the exploration-exploitation trade-off, we aim for a combination of cold start item recommendations using content-based filtering, and standard product recommendations generated through collaborative filtering.
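One simple way to picture this combination is a weighted blend, sketched below under stated assumptions: content similarity is a Jaccard overlap of attribute sets, and the weight of the collaborative score grows linearly with the number of interactions an item has collected. The `warm_after` threshold and both function names are hypothetical, and a production system would likely use something more refined.

```python
def jaccard(attrs_a, attrs_b):
    """Content similarity: overlap of two items' attribute sets."""
    a, b = set(attrs_a), set(attrs_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def blended_score(content_sim, collab_sim, n_interactions, warm_after=50):
    """Blend content-based and collaborative scores for one candidate item.

    A brand-new item (0 interactions) is scored on content alone; once it
    has `warm_after` interactions, the collaborative score takes over.
    """
    w = min(1.0, n_interactions / warm_after)
    return w * collab_sim + (1.0 - w) * content_sim


# A freshly listed kayak compared with an established one (made-up attributes).
content_sim = jaccard({"kayak", "inflatable", "2-person"},
                      {"kayak", "inflatable", "1-person"})
print(blended_score(content_sim, collab_sim=0.0, n_interactions=0))
print(blended_score(content_sim, collab_sim=0.6, n_interactions=25))
```

With zero interactions the new item is ranked purely on its attributes, so it can be recommended from day one; as interactions accumulate, user behavior gradually takes over.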

Using Deep Learning to Resolve Item Cold Start

A new approach to content-based filtering is using deep learning technology to get an even more precise picture of the item being recommended. Because deep learning algorithms process data in multiple layers and various formats, they can detect the slightest differences or similarities between products. This is especially useful for marketplaces, where, as mentioned earlier, identical products from different sellers can confuse conventional recommendation systems that run on collaborative filtering. 

GRU4RecCS, Yusp’s deep learning algorithm optimized for cold start items, has been successfully implemented by a major marketplace client. This algorithm can recommend products similar to a cold start item based only on its metadata. At the appropriate scale, GRU4Rec family algorithms can generate an ROI of 10-15x the hardware cost. GRU4RecCS is now available as an additional deep learning module to the default Yusp recommendation system. Alternatively, you can purchase this deep learning algorithm package and integrate it with your existing personalization platform. To find out more about addressing the cold start problem with Yusp’s deep learning technology, don’t hesitate to get in touch with us.
