3 Machine Learning Models Spotify uses to recommend music you’ll like

Reading Time: 3 minutes

In the early 2000s, Songza implemented a manual music recommendation system for its listeners, where a team of music experts and curators would create playlists. But these recommendations were not objective, as they were dependent on the personal taste of the curators.

It was an average experience for listeners, with a fair share of hits and misses, because it was impossible to make a playlist which catered to the varied tastes of a diverse set of people. The technology and the data did not exist back then to build a playlist that would be personalised to the taste of each individual listener.

Along came Spotify a few years later, offering a highly personalised weekly playlist called Discover Weekly that quickly became one of their flagship offerings.

Every Monday, millions of listeners receive a fresh playlist of new song recommendations, customised to their personal tastes based on their listening history and the songs they’ve engaged with. Spotify uses a combination of different data aggregation and sorting methods to create their unique and powerful recommendation model that’s powered by machine learning.

“One of our flagship features is called Discover Weekly. Every Monday, we give you a list of 50 tracks that you haven’t heard before that we think you’re going to like. The ML engine that’s the main basis of it, and it’s advanced some since, had actually been around at Spotify a bit before Discover Weekly was there, just powering our Discover page” – David Murgatroyd, Machine Learning Leader at Spotify.

Spotify uses three forms of recommendation models to power Discover Weekly.

1. Collaborative Filtering

Collaborative Filtering is a popular technique used by recommender systems to make automated predictions about the preferences of a user, based on the preferences of other, similar users.

On Spotify, the collaborative filtering algorithm compares user-created playlists that contain songs a listener has already played. It then combs those playlists for the other songs that appear alongside them and recommends those songs.

This framework is executed with matrix math in Python libraries. The algorithm first creates a matrix of all the active users and songs. The Python library then runs a series of complex factorisation formulae on the matrix. The end result is two kinds of vectors: X is a user vector representing the taste of an individual user, and Y is a song vector representing the profile of a single song. To find users with similar taste, collaborative filtering compares a given user vector with every other user vector and returns the most similar ones. The same procedure is applied to the song vectors.
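
The factorisation step can be sketched in a few lines of NumPy. This is only an illustration, not Spotify's actual code: the article does not say which factorisation method is used, so alternating least squares (ALS) is assumed here, and the play-count matrix, the function names `factorise` and `most_similar`, and all parameter values are made up for the example.

```python
import numpy as np

def factorise(plays, n_factors=2, n_iters=20, reg=0.1, seed=0):
    """Factorise a users x songs matrix into user vectors X and song vectors Y.

    Assumption: alternating least squares with L2 regularisation.
    """
    rng = np.random.default_rng(seed)
    n_users, n_songs = plays.shape
    X = rng.normal(scale=0.1, size=(n_users, n_factors))  # one row per user
    Y = rng.normal(scale=0.1, size=(n_songs, n_factors))  # one row per song
    ridge = reg * np.eye(n_factors)
    for _ in range(n_iters):
        # Hold the song vectors fixed and solve for the user vectors...
        X = np.linalg.solve(Y.T @ Y + ridge, Y.T @ plays.T).T
        # ...then hold the user vectors fixed and solve for the song vectors.
        Y = np.linalg.solve(X.T @ X + ridge, X.T @ plays).T
    return X, Y

def most_similar(vectors, index):
    """Return the row whose cosine similarity to row `index` is highest."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)
    scores = unit @ unit[index]
    scores[index] = -np.inf  # never return the query row itself
    return int(np.argmax(scores))

# Hypothetical play counts: rows are users, columns are songs.
plays = np.array([
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 4.0],
    [0.0, 0.0, 4.0, 5.0],
])

X, Y = factorise(plays)
similar_user = most_similar(X, 0)  # user most similar to user 0
similar_song = most_similar(Y, 2)  # song most similar to song 2
```

In this toy matrix, users 0 and 1 listen to the same two songs, so their user vectors end up pointing in almost the same direction, and the same holds for songs 2 and 3.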

Spotify does not rely on collaborative filtering alone. The second recommendation model it uses is Natural Language Processing (NLP).

2. Natural Language Processing

NLP is the ability of an algorithm to understand speech and text in real time. Spotify’s NLP model constantly trawls the web for articles, blog posts, and any other text about music, and uses what it finds to build a profile for each song.

With all this scraped data, the NLP algorithm can classify songs based on the kind of language used to describe them and can match them with other songs that are discussed in the same vein. Artists and songs are assigned classifying keywords based on the data, and each term has a certain weight assigned to it. Similar to collaborative filtering, a vector representation of the song is created, and that’s used to suggest similar songs.
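
To make the idea of weighted terms concrete, here is a minimal sketch. It assumes each song's profile is a dictionary mapping a keyword to its weight and that similarity is measured with cosine similarity. The song names, keywords, weights, and the helpers `cosine` and `closest` are all hypothetical, not taken from Spotify.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def closest(name, profiles):
    """Return the other song whose term profile is most similar to `name`."""
    others = (other for other in profiles if other != name)
    return max(others, key=lambda other: cosine(profiles[name], profiles[other]))

# Hypothetical keyword weights, as if derived from text written about each song.
profiles = {
    "song_a": {"dreamy": 0.8, "synth": 0.6, "lo-fi": 0.3},
    "song_b": {"dreamy": 0.7, "synth": 0.5, "ambient": 0.4},
    "song_c": {"heavy": 0.9, "riff": 0.7, "synth": 0.1},
}

nearest = closest("song_a", profiles)
```

Here `song_a` and `song_b` share their two heaviest terms, so `nearest` comes out as `song_b`, while `song_c` overlaps only on a lightly weighted keyword.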

3. Convolutional Neural Networks

Convolutional Neural Networks are used to hone the recommendation system and to increase accuracy because less-popular songs might be neglected by the other models. The CNN model ensures that obscure and new songs are considered.

CNNs are best known for facial recognition, and Spotify has adapted the same kind of model to audio files. Each song is represented as a raw audio waveform. These waveforms are processed by the CNN, which assigns each song key parameters such as beats per minute, loudness, major/minor key and so on. Spotify then tries to match songs whose parameters are similar to those of the songs its listeners like listening to.
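
The sketch below covers only the final matching step, not the CNN itself. It assumes the audio parameters have already been extracted, and it uses a plain nearest-neighbour search over rescaled features as a stand-in for whatever matching Spotify really performs. The track names, feature values, and the function `nearest_track` are invented for illustration.

```python
import numpy as np

# Hypothetical catalogue. Columns: beats per minute, loudness in dB,
# and mode (1 = major key, 0 = minor key).
titles = ["track_a", "track_b", "track_c", "track_d"]
features = np.array([
    [120.0, -6.0, 1.0],
    [122.0, -7.0, 1.0],
    [70.0, -14.0, 0.0],
    [168.0, -4.0, 0.0],
])

def nearest_track(features, index):
    """Return the index of the track closest to `index` in feature space."""
    # Rescale every column to zero mean and unit variance so that BPM,
    # which has the largest raw range, does not dominate the distance.
    scaled = (features - features.mean(axis=0)) / features.std(axis=0)
    distances = np.linalg.norm(scaled - scaled[index], axis=1)
    distances[index] = np.inf  # ignore the query track itself
    return int(np.argmin(distances))

match = titles[nearest_track(features, 0)]
```

With these made-up numbers, `track_a` is matched with `track_b`, which has nearly the same tempo and loudness and is in the same mode.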

With these key machine learning models, Spotify is able to tailor a unique playlist of music that surprises its listeners every week with songs they would have never found otherwise.

A key problem in many machine learning models is the lack of access to clean, structured data that can be processed. Spotify has been able to circumvent that problem due to their access to massive amounts of data that they collect from their users. They’ve been able to shine as a great example of effective use of Machine Learning models to give their users an unrivalled personalised experience.

Public Vs Private Vs Hybrid: Which Cloud Models do Companies Prefer?

Reading Time: 3 minutes

Organisations of all sizes have come to rely on the cloud to manage their data infrastructure securely at lower costs. The basic structure of cloud computing is based on the public cloud and private cloud.

Organisations use cloud services to manage, store or deliver their data. Choosing a service should be preceded by a proper strategy and careful planning around whether a public or a private cloud setup is the better fit.

Public Cloud

Public cloud is the most popular method to deploy cloud services. A public cloud service is owned and operated by a third-party service provider, which takes care of the maintenance of the cloud services and infrastructure. Public cloud services are delivered over the internet and are ideal for small to mid-sized companies. The most popular examples of public cloud are Microsoft Azure, Amazon EC2, and IBM’s Blue Cloud.

All the infrastructure, including the hardware and the software, is owned by the service provider and is shared by multiple organizations, which are called cloud tenants. Public cloud services follow the pay-as-you-go model, which makes public clouds economical for organisations with varying needs. That factor, in addition to being able to handle smaller amounts of data, makes it ideal for small and mid-sized companies. Since the tenants do not own the infrastructure first-hand, the burden of maintaining and managing the data centres is offloaded to the service provider.

Public clouds are used when data compliance and control over data are not major concerns for the customer. The major drawback many organizations perceive is the lack of security and of control over the hardware. As the servers are shared and the provider is responsible for maintenance, regulatory compliance also becomes a concern.
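
A little arithmetic shows why pay-as-you-go suits varying needs. The sketch below compares paying only for the hours used with paying for capacity sized to the busiest month. The usage figures and the hourly rate are entirely hypothetical, and charging the same rate in both cases is a simplifying assumption.

```python
def pay_as_you_go_cost(hours_per_month, rate_per_hour):
    """Pay only for the server-hours actually used each month."""
    return sum(hours * rate_per_hour for hours in hours_per_month)

def fixed_capacity_cost(hours_per_month, rate_per_hour):
    """Pay every month for capacity sized to the busiest month."""
    return max(hours_per_month) * rate_per_hour * len(hours_per_month)

# Hypothetical usage in server-hours, with one seasonal spike.
usage = [100, 120, 400, 110]
rate = 2  # hypothetical price per server-hour, in arbitrary currency units

on_demand = pay_as_you_go_cost(usage, rate)   # 730 hours * 2 = 1460
provisioned = fixed_capacity_cost(usage, rate)  # 400 hours * 2 * 4 months = 3200
```

The gap between the two totals comes from the three quiet months, when capacity sized for the peak would sit mostly idle.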

Private Cloud

Private clouds are owned and operated by a single organization or entity. In a private cloud environment, the hardware, software, and related infrastructure are located either at the organization's own data centre or in a controlled environment run by a service provider. Private clouds differ from public clouds in their flexibility and control over the data. By definition, a private cloud is dedicated to one organization and is not shared with other tenants.

Government institutions, financial institutions like banks, mid- to large-sized companies, and any other organization dealing with sensitive information tend to prefer private clouds. The private cloud has a dedicated service provider, so it offers complete control over the data, enhanced flexibility, scalability, automation, and security, but all of this comes at a price.

Although private cloud offers scalability and security, it is expensive to set up and companies will have to constantly maintain the servers and do their own troubleshooting.

In real-world practice, cloud computing services are also offered in another format known as the hybrid cloud, which tries to deliver the best of both worlds by incorporating the benefits of both public and private clouds.

Hybrid Cloud

Hybrid cloud is a cloud computing environment that incorporates both private and public cloud services and keeps them coherently synchronised.

In a typical hybrid cloud, data can be moved between on-premises infrastructure and a third-party service provider. This provides enhanced control, flexibility, and cost savings. Hybrid cloud also helps organizations handle short-term spikes in demand with minimal capital outlay.
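
Handling a short-term spike this way is often called bursting. A minimal sketch follows, assuming demand is served from private capacity first and only the overflow goes to the public cloud. The function name `split_load`, the capacity, and the demand figures are hypothetical.

```python
def split_load(demand, private_capacity):
    """Serve demand privately first and burst any overflow to the public cloud."""
    private = min(demand, private_capacity)
    public = demand - private
    return private, public

# Hypothetical hourly demand, in arbitrary units of work.
hourly_demand = [40, 80, 150, 60]
plan = [split_load(demand, private_capacity=100) for demand in hourly_demand]
# plan == [(40, 0), (80, 0), (100, 50), (60, 0)]
```

Only the third hour exceeds the private capacity of 100, so that is the only hour in which public cloud resources are rented.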

According to Forrester Research’s principal analyst Dave Bartoletti, “It (hybrid cloud) lets you pick the right cloud for the right workload, it doesn’t artificially limit you.”

Although hybrid cloud offers a gamut of advantages, the major concern still revolves around the security of the data, a concern that also colours the perception of public clouds. As the hybrid cloud is a blend of both, transmitting sensitive information over a network that is open to third-party interference is an uncalculated risk for most organizations.
