Background and Problem
Our client, a large music rights management organization, needed a forecasting solution to automate the prediction and calculation of future song royalties based on historical performance and insights from external data. They wanted to invest in up-and-coming artists and, to that end, offered advances based on expected future earnings. However, their existing forecasts were crude, and they wanted a more sophisticated way to automate advance calculations. They also wanted to offer a value-added service in which artists and producers could log in to their existing customer dashboard to see forecasts of their expected revenues and request advances accordingly.
Adastra leveraged the client’s internal data, hit charts and song popularity information to develop a forecasting solution to predict future song earnings. Our team ingested close to 30 GB of data into the client’s forecasting database, including performance details of over 130 million records ranging from 2001 to 2016, the client’s accounts, and topsheet and catalogue summary data.
The model was initially developed in R on Azure infrastructure and integrated with the client’s SQL data warehouse. We later ported it from R to Python and integrated it with a Power BI dashboard for visualization.
The Adastra team focused on building a simple linear forecast based on historical earnings, using a stochastic ARIMA (AutoRegressive Integrated Moving Average) model to forecast earnings for the subsequent year. However, since the client had nearly 100,000 artists and millions of records, we had to build a sophisticated parallelization scheme so that the forecasting would run efficiently.
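As a rough illustration of running one forecast per work in parallel, the sketch below uses only the Python standard library. It is not the production model: `forecast_arima_110` is a hypothetical, hand-rolled stand-in for an ARIMA(1,1,0)-style fit (one difference, one autoregressive term, no drift), and `forecast_all` and the `earnings_by_work` mapping are names assumed for this example. A thread pool keeps the sketch portable. For CPU-bound fitting at the scale described above, a process pool or a cluster would be the more likely choice.

```python
from concurrent.futures import ThreadPoolExecutor


def forecast_arima_110(series, steps=4):
    """Forecast `steps` points with a hand-rolled ARIMA(1,1,0)-style model.

    The series is differenced once (the "I" part), an AR(1) coefficient is
    fitted to the differences by least squares, and the forecast differences
    are summed back onto the last observed level.
    """
    if len(series) < 3:
        # Too short to fit anything: repeat the last known value.
        return [series[-1]] * steps
    diffs = [b - a for a, b in zip(series, series[1:])]
    x, y = diffs[:-1], diffs[1:]
    denom = sum(v * v for v in x)
    phi = sum(a * b for a, b in zip(x, y)) / denom if denom else 0.0
    # Clamp phi so the forecast differences decay instead of growing.
    phi = max(-0.99, min(0.99, phi))
    level, d = series[-1], diffs[-1]
    out = []
    for _ in range(steps):
        d = phi * d
        level += d
        out.append(level)
    return out


def forecast_all(earnings_by_work, steps=4, max_workers=4):
    """Run one independent forecast per work number across a worker pool."""
    work_ids = list(earnings_by_work)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(
            lambda wid: forecast_arima_110(earnings_by_work[wid], steps),
            work_ids,
        )
        return dict(zip(work_ids, results))
```

Because each work's series is forecast independently, the work numbers can be split across workers without any shared state, which is what makes this kind of job straightforward to parallelize.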
We were able to aggregate earnings by work number and create a time series by quarter but found that data points were missing in many quarters. Our experts leveraged an interpolation algorithm to fill in missing historical data points. The in-scope data also had a lot of noise, and we used a Gaussian filter to reduce noise in the time series for a more accurate forecast.
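The gap-filling and smoothing steps can be sketched as follows. The source does not say which interpolation algorithm or filter settings were used, so linear interpolation, the `sigma` value, and the function names `interpolate_missing` and `gaussian_smooth` are assumptions made for illustration. Missing quarters are represented as `None`.

```python
import math


def interpolate_missing(values):
    """Fill None gaps linearly; edge gaps copy the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed points")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled


def gaussian_smooth(values, sigma=1.0):
    """Smooth with a truncated Gaussian kernel, renormalised at the edges."""
    radius = max(1, int(3 * sigma))
    smoothed = []
    for i in range(len(values)):
        total = weight_sum = 0.0
        lo, hi = max(0, i - radius), min(len(values), i + radius + 1)
        for j in range(lo, hi):
            w = math.exp(-((j - i) ** 2) / (2 * sigma ** 2))
            total += w * values[j]
            weight_sum += w
        smoothed.append(total / weight_sum)
    return smoothed


quarterly = [None, 10.0, None, None, 40.0, None]
clean = gaussian_smooth(interpolate_missing(quarterly), sigma=1.0)
```

A larger `sigma` removes more noise but also flattens genuine peaks, which ties in with the trade-off between trend accuracy and per-quarter accuracy.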
There were some very large outliers in the datasets, most evident for concerts, and we had to make allowances for peaks (hit songs, etc.). Adastra’s team bucketed and clustered similar trends and used the history of specific artists to understand standard patterns (such as exponential growth for the first three quarters, or a plateau after a peak). In cases where the forecast had to account for these standard patterns, we implemented rule-based logic.
Our experts also had to implement mathematical constraints to ensure positivity and to prevent exponential trends after the datasets went through logarithmic transformations and back-transformations. We also implemented rule-based logic to account for runaway measures and parameter fluctuations, and built in proper contingencies for risk in automated advance calculations.
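To show why such constraints matter, here is a minimal sketch of a log transform with a guarded back-transform. The exact rules used in the project are not described, so the offset `eps`, the per-step growth cap `max_growth`, and the function names are hypothetical choices for this example.

```python
import math


def to_log(earnings, eps=1.0):
    """Apply log(x + eps) so that zero-earning quarters stay finite."""
    return [math.log(max(x, 0.0) + eps) for x in earnings]


def from_log(log_forecast, last_actual, eps=1.0, max_growth=1.5):
    """Back-transform a log-scale forecast with two guardrails.

    1. Positivity: values are floored at zero after removing eps.
    2. Bounded growth: each step may exceed the previous value by at most
       a factor of max_growth, which stops exp() from turning a modest
       log-scale trend into a runaway curve.
    """
    out, prev = [], max(last_actual, 0.0)
    for z in log_forecast:
        value = max(math.exp(z) - eps, 0.0)
        cap = max(prev, eps) * max_growth
        value = min(value, cap)
        out.append(value)
        prev = value
    return out
```

A linear trend on the log scale becomes exponential once it is passed through `exp()`, so without the cap a forecast that looks reasonable in log space can produce inflated earnings, and inflated advances, after back-transformation.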
Since the initial model leveraged quarterly earnings only, it had some limitations. While it could predict general earning trends (due to smoothing), it had limited accuracy in predicting earning values for specific quarters.
In addition to our chosen linear forecasting approach, we explored other options, including a deep learning approach, and tested various statistical models, including Jordan neural networks, smoothing, and stochastic-based methods. However, there was not enough historical data to support advanced predictions, and our simpler mathematical approach was sufficient to meet the client’s needs.
Potential Next Steps
Additional Data Sources: Some of the initial data sources we used were delayed or out of date by several months, especially external data and data from other countries. In the second phase of the project, we started to look at more up-to-date data sources, including Spotify, YouTube, and radio plays. We built a forecast based on daily stream counts and plays from these new sources and used it to augment quarterly earnings for a more accurate measure.
The client had most of this data internally, but we also did some prototyping with the Spotify API, from which we extracted genres, sales volume measures of similar songs, and other data points. However, this part of the solution was exploratory and is yet to be implemented. We also considered adding other internal data sources, such as the BDS Nielsen data, to extract and validate genres.
Majority Voting Approach: Adastra also tried a majority voting approach for similar songs from the same genre that followed an existing performance pattern in terms of revenue. This is another solution that could be implemented to augment forecasts, especially for artists or songs with a short time series.
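One way to read the majority voting idea is as a nearest-neighbour vote: compare a song's short history with longer histories from the same genre and adopt the pattern that most of the closest matches followed. The sketch below is an interpretation under that assumption, not the implemented method. The function name `majority_pattern`, the pattern labels, the distance measure, and `k` are all hypothetical.

```python
from collections import Counter


def majority_pattern(short_series, reference_songs, k=5):
    """Return the most common pattern label among the k nearest references.

    reference_songs: list of (series, label) pairs from the same genre.
    Distance is the mean absolute difference between peak-normalised
    series over the quarters that the short series covers.
    """
    def normalise(series):
        peak = max(series) or 1.0  # avoid dividing by zero
        return [v / peak for v in series]

    n = len(short_series)
    target = normalise(short_series)
    scored = []
    for series, label in reference_songs:
        if len(series) < n:
            continue  # cannot compare against a shorter history
        ref = normalise(series[:n])
        dist = sum(abs(a - b) for a, b in zip(target, ref)) / n
        scored.append((dist, label))
    scored.sort(key=lambda item: item[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0] if votes else None


references = [
    ([1, 2, 4, 8], "growth"),
    ([5, 10, 20, 30], "growth"),
    ([40, 20, 10, 5], "decay"),
]
label = majority_pattern([10, 20, 40], references, k=3)
```

The winning label could then select which rule-based pattern or reference curve is used to extend the short series.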
Audio Fingerprinting: Adastra built a pilot audio fingerprinting solution, in which we applied Gaussian-type filtering and spectrogram analysis to find the most significant peaks and valleys in an audio track and use them to uniquely identify a song. The objective was to capture audio from YouTube and other sources, detect background music, and determine which song it was in order to estimate accruing royalties. Since the solution relies on a spatial analysis of peaks and valleys, it was challenging to identify songs accurately when the audio was cut, slowed down, or sped up. To circumvent this, we implemented normalization measures that capture the general position of a peak rather than the raw value of where it occurs. Because the client had millions of songs, the amount of indexing required would have been monumental and would take years to develop. Even so, this could be a potential value-added service for the client in the future.
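The normalization used in the pilot is not described in detail. As one possible illustration of matching on relative rather than raw peak positions, the sketch below describes a track by the ratios between consecutive peak gaps, which do not change when the audio is uniformly sped up, slowed down, or trimmed at the start. The function names and the rounding precision are assumptions, and the peak times are taken as already extracted from the spectrogram.

```python
def interval_ratios(peak_times, digits=2):
    """Describe peaks by ratios of consecutive gaps instead of raw times.

    Scaling every time by a constant (uniform speed change) or adding an
    offset (a trimmed intro) leaves these ratios unchanged.
    """
    gaps = [b - a for a, b in zip(peak_times, peak_times[1:])]
    return [
        round(g2 / g1, digits)
        for g1, g2 in zip(gaps, gaps[1:])
        if g1 > 0
    ]


def longest_shared_run(sig_a, sig_b):
    """Length of the longest contiguous run of ratios common to both."""
    best = 0
    prev = [0] * (len(sig_b) + 1)
    for a in sig_a:
        cur = [0] * (len(sig_b) + 1)
        for j, b in enumerate(sig_b, start=1):
            if a == b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


original = [0.5, 1.5, 2.0, 4.0, 5.0]          # peak times in seconds
sped_up = [t / 1.25 + 0.3 for t in original]  # 1.25x faster, shifted
same_song = interval_ratios(sped_up) == interval_ratios(original)
```

A clip cut from the middle of a song would still share a contiguous run of ratios with the full track, which `longest_shared_run` measures. Real recordings would also need tolerance for missing or spurious peaks, which this sketch does not handle.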
Detecting Up-and-Coming Artists: The client also wanted to explore using some of the YouTube telemetry we captured to detect up-and-coming artists. We focused on songs that had registered a surge in volume in the past day, essentially capturing a subset of music that was rapidly gaining popularity. Since the music rights market is very competitive, the client could potentially use this telemetry to identify the next ‘big’ artist.
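A surge filter of this kind can be sketched as a comparison between the latest day's plays and a trailing baseline. The thresholds, the seven-day window, and the `find_surging_songs` name below are illustrative assumptions, not values from the project.

```python
def find_surging_songs(daily_plays, min_ratio=3.0, min_plays=1000, window=7):
    """Return (song_id, ratio) pairs whose latest day far exceeds baseline.

    daily_plays maps a song id to daily play counts, oldest first. The
    baseline is the mean of up to `window` days before the latest day.
    """
    surging = []
    for song_id, counts in daily_plays.items():
        if len(counts) < 2:
            continue  # need at least one baseline day
        latest = counts[-1]
        if latest < min_plays:
            continue  # ignore tiny absolute volumes
        history = counts[-(window + 1):-1]
        baseline = sum(history) / len(history)
        ratio = latest / baseline if baseline > 0 else float("inf")
        if ratio >= min_ratio:
            surging.append((song_id, ratio))
    # Strongest surge first.
    return sorted(surging, key=lambda item: item[1], reverse=True)


plays = {
    "song_a": [100, 120, 110, 5000],     # sharp one-day jump
    "song_b": [4000, 4100, 3900, 4200],  # popular but steady
    "song_c": [1, 2, 30],                # big ratio, negligible volume
}
candidates = find_surging_songs(plays)
```

The minimum play count keeps songs with negligible volume from ranking highly purely because their baseline is close to zero.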
Our forecasting solution was embedded into the client’s customer dashboards as a value-added analytics service for their artists and producers. The client’s customers can now log in to their dashboards and see how their portfolio is expected to perform over the next four quarters. If needed, the client can also deploy a full-fledged automated “push-button” service, where an artist can request a pre-set percentage of their forecasted revenues as an advance directly through the customer dashboard.
The solution was a significant improvement over the client’s rudimentary approach to advance calculations, and they were able to reduce errors in those calculations by about $250,000 per quarter, saving millions over the years. The business guardrails and rule-based logic we put in place lowered their risk of giving out advances based on inflated earnings estimates.
Looking to enhance revenue forecasting or leverage AI for advanced audio fingerprinting use cases? Schedule a free consultation with our experts today!