Spotify became known in the audio-streaming industry for its highly personalized user experience, built on artificial intelligence and a workforce that stood at 9,800 employees at the end of 2022.
However, after three rounds of layoffs within a year - cutting 590 positions in January, 200 in June, and an additional 1,500 recently - Spotify's focus on using AI to improve margins for its podcasting and audiobook divisions appears to be a significant shift in strategy. Despite this, Wall Street remains optimistic about its potential success.
"Spotify is utilizing AI technology on its platform, introducing an AI DJ for a traditional radio feel in 50 new markets and launching AI Voice Translation for podcasts," stated Justin Patterson, an equity research analyst at KeyBanc Capital Markets. "With the addition of audiobooks for Premium Subscribers, we see multiple chances for Spotify to increase user involvement and eventually improve monetization."
Shares of parent company Spotify Technology SA have risen by over 30% in the past six months and by more than 135% year to date.
The company is now laying off employees, following a decrease in demand during the pandemic. It also needs to recover from spending over $1 billion on podcasting, including celebrity deals that fell through and acquisitions of podcast studios that were ultimately closed. In a letter to staff posted on the company's website, CEO Daniel Ek acknowledged the economic slowdown and increased cost of capital, stating that Spotify is not immune to these challenges.
Jumping on the AI gravy train
Spotify announced in November a collaboration with Google Cloud to revamp the way it suggests audiobooks and podcasts by utilizing Google Cloud's language model, Vertex AI Search.
Large language models, such as those behind ChatGPT, are trained on large amounts of text to produce human-like language and relay information to users based on what they learned in training.
In February, Spotify launched an "AI DJ" and started utilizing OpenAI's "Whisper" voice translation tool to convert specific episodes of English podcasts into Spanish, French, and German.
A Spotify spokesperson informed CNN via email that the company intends to expand its technology in the future based on feedback from creators and audiences. During the company's third-quarter earnings call, CEO Ek stressed the importance of "efficiency," mentioning it more than 20 times. He said that AI initiatives are geared towards increasing engagement, which in turn reduces churn, produces more value for consumers, and enables the company to successfully raise prices.
So how does personalization work?
Douglas Anmuth, the managing director and internet analyst at JP Morgan, highlighted the potential of podcast investments to drive long-term engagement, in addition to artist advertisement investments, in a recent research note.
For about ten years, Spotify has been customizing its user experience. This was made possible when it acquired The Echo Nest Corp, a music analytics firm, in 2014. By integrating machine learning and natural language processing, Spotify was able to add a personal touch to its service.
Spotify's technology creates a database of songs and artists by identifying musical pitches and tempos, and linking the works of artists within a common cultural framework.
Release date and metrics such as volume, duration, and danceability are factors in determining a user's music preferences. In turn, this information is used to create playlists like "Daily Mix" and "Discover Weekly." Additionally, "Time Capsules" and "On Repeat" playlists are curated based on a user's most-listened-to songs, with the intention of either maintaining their current listening habits or reintroducing them to songs they may have forgotten about.
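To make the idea of matching songs on audio metrics more concrete, here is a minimal Python sketch. It assumes each track is described by a few features scaled between 0 and 1, averages the tracks a listener has played into a "taste profile," and ranks unheard tracks by cosine similarity to that profile. The track names, feature values, and the choice of cosine similarity are illustrative assumptions, not Spotify's actual data or method.

```python
import math

# Hypothetical tracks with made-up feature values scaled to the 0..1 range.
# Each tuple is (tempo, volume, danceability).
TRACKS = {
    "track_a": (0.80, 0.70, 0.90),
    "track_b": (0.75, 0.65, 0.85),
    "track_c": (0.20, 0.30, 0.10),
    "track_d": (0.10, 0.90, 0.20),
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def taste_profile(listened):
    """Average the feature vectors of the tracks a user has played."""
    vectors = [TRACKS[name] for name in listened]
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def recommend(listened, k=2):
    """Rank tracks the user has not played by similarity to their profile."""
    profile = taste_profile(listened)
    candidates = [name for name in TRACKS if name not in listened]
    candidates.sort(
        key=lambda name: cosine_similarity(profile, TRACKS[name]),
        reverse=True,
    )
    return candidates[:k]


if __name__ == "__main__":
    # A listener who has only played track_a gets the most similar track first.
    print(recommend(["track_a"], k=2))
```

A production recommender would draw on far more signals, such as listening history across millions of users, but the same basic step of comparing feature vectors underlies many content-based approaches.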
In an email to CNN, Anil Jain, the Global Managing Director of Strategic Consumer Industries at Google Cloud, stated that Vertex AI Search enables media and entertainment companies to develop content discovery capabilities for video, audio, images, and text. Jain declined to provide details about the agreement with Spotify.
Vertex AI Search takes into account various factors when suggesting content to users, including real-time user behavior, content similarity, and content related to users' search queries.
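One common way to combine several signals like these is a weighted score. The Python sketch below is purely illustrative: the `Signals` fields, the weights, and the candidate titles are hypothetical, and nothing here reflects how Vertex AI Search actually ranks content, which Google has not detailed in this article.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical per-item signals, each scaled to the 0..1 range."""
    behavior: float     # affinity inferred from recent, real-time behavior
    similarity: float   # similarity to content the user already plays
    query_match: float  # relevance to the user's search query


# Assumed weights chosen only for illustration.
WEIGHTS = {"behavior": 0.5, "similarity": 0.3, "query_match": 0.2}


def blended_score(signals):
    """Combine the three signals into one score using fixed weights."""
    return (
        WEIGHTS["behavior"] * signals.behavior
        + WEIGHTS["similarity"] * signals.similarity
        + WEIGHTS["query_match"] * signals.query_match
    )


def rank(candidates):
    """Return candidate names ordered from highest to lowest blended score."""
    return sorted(
        candidates,
        key=lambda name: blended_score(candidates[name]),
        reverse=True,
    )


if __name__ == "__main__":
    candidates = {
        "podcast_x": Signals(behavior=0.9, similarity=0.2, query_match=0.1),
        "audiobook_y": Signals(behavior=0.1, similarity=0.9, query_match=0.9),
    }
    print(rank(candidates))
```

With these assumed weights, strong recent behavior outweighs the other two signals; changing the weights changes which item comes first, which is the kind of trade-off a real system would have to tune.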
Challenges and opportunities
Reece Hayden, senior analyst at ABI Research, expressed confidence that large language models (LLMs) could work to increase engagement across Spotify's platform.
In an email to CNN, he stated that large language models can improve personalization and recommendations by analyzing entire texts and videos. Unlike basic predictive models that depend on keywords and metadata, he noted, LLMs can interpret podcasts to determine user interests and gain deeper insights into user preferences by analyzing all available user data.
However, this comes with a price.
According to him, utilizing LLMs to analyze all podcasts and audiobooks requires a significant amount of resources and may not provide as much value as simpler predictive models. LLMs also raise data privacy challenges, and their cost and resource demands will be substantial.
He had confidence in Whisper's ability to translate podcasts, acknowledging the possibility of errors as the generative AI continues to learn. "With access to a multitude of data points, language translation models like Whisper will rapidly enhance their accuracy," he stated. "However, the limitation of Whisper lies in its primary focus on translating from non-English languages to English ... This means it may not be as effective for podcasts recorded in English."