In late 2020, the Salesforce application performance management team faced a challenge: it needed to improve its anomaly detection algorithms.
The performance management team was constantly monitoring the health of Salesforce data centers, which report a range of metrics in real time, such as the CPU usage of a particular service. Metrics like these, recorded at successive points in time, are known as time-series data.
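To make the idea of anomaly detection on a metric stream concrete, here is a minimal sketch in plain Python. It is not Salesforce's method or Merlion's API: the function name `detect_anomalies`, the window size, the threshold, and the CPU readings are all assumptions made up for illustration. It flags any reading that sits far from the average of the readings just before it.

```python
from statistics import mean, stdev

def detect_anomalies(values, window=5, threshold=3.0):
    """Return the indices of readings that deviate from the mean of the
    preceding `window` readings by more than `threshold` standard deviations.
    (Hypothetical helper, for illustration only.)"""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history gives no scale to compare against; skip it.
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Made-up CPU usage percentages for one service, one reading per interval.
cpu_usage = [41.0, 43.0, 42.0, 44.0, 42.0, 43.0, 41.0, 95.0, 42.0, 43.0]
print(detect_anomalies(cpu_usage))  # prints [7], the index of the 95.0 spike
```

A fixed rule like this is easy to reason about, but it tends to be brittle on real metrics with trends or daily cycles, which is part of why teams look for better algorithms.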
Bhatnagar and his team developed Merlion, an open-source library that applies machine learning to time-series analysis. It was originally designed to address the challenges the Salesforce application performance team faced. According to Bhatnagar, however, Merlion is a general end-to-end Python library for many time-series tasks.
How Merlion works and enables time-series machine learning
The Merlion project began as a collaboration between Salesforce research teams in Palo Alto and Singapore.
The Merlion is a mythical creature, half lion and half fish, that is also Singapore's national symbol, according to Bhatnagar.
Like the mythical creature it is named after, the Merlion project is more than just a single thing. It includes capabilities for loading and processing data, as well as for building and training a wide array of models that are unified under a common API, according to Bhatnagar. The project also includes post-processing steps for model outputs, along with a framework for evaluating model performance.
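The stages described above (loading and processing, models behind a common interface, post-processing, and evaluation) can be sketched in a few lines of plain Python. This is an illustration of the design idea only, not Merlion's actual API: every class and function name below is hypothetical, and the two "models" are deliberately trivial.

```python
from abc import ABC, abstractmethod
from statistics import mean

def load_series(rows):
    """Loading/processing: order (timestamp, value) rows and drop missing values."""
    return [(t, v) for t, v in sorted(rows) if v is not None]

class Forecaster(ABC):
    """Common interface that every model in this sketch implements."""
    @abstractmethod
    def train(self, values): ...
    @abstractmethod
    def forecast(self, steps): ...

class LastValueForecaster(Forecaster):
    """Predicts that the most recent observation repeats."""
    def train(self, values):
        self.last = values[-1]
    def forecast(self, steps):
        return [self.last] * steps

class MeanForecaster(Forecaster):
    """Predicts the average of the training data."""
    def train(self, values):
        self.avg = mean(values)
    def forecast(self, steps):
        return [self.avg] * steps

def clip_non_negative(predictions):
    """Post-processing: a metric such as CPU usage cannot go below zero."""
    return [max(0.0, p) for p in predictions]

def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def evaluate(model, train_values, test_values):
    """Evaluation: train, forecast the held-out span, post-process, and score."""
    model.train(train_values)
    predictions = clip_non_negative(model.forecast(len(test_values)))
    return mean_absolute_error(test_values, predictions)

# Made-up rows, out of order and with one missing reading.
rows = [(3, 12.0), (1, 10.0), (2, None), (4, 14.0), (5, 13.0), (6, 15.0)]
values = [v for _, v in load_series(rows)]
train_values, test_values = values[:3], values[3:]

scores = {type(m).__name__: evaluate(m, train_values, test_values)
          for m in (LastValueForecaster(), MeanForecaster())}
print(scores)  # {'LastValueForecaster': 1.0, 'MeanForecaster': 2.0}
```

Because both models expose the same `train` and `forecast` methods, the evaluation code does not need to know which one it is scoring, which is the practical benefit of a common API.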
After the Merlion project was completed, Bhatnagar's team quickly realized that Salesforce had a wide range of internal needs for time-series machine learning. The original purpose of the project was anomaly detection for application performance management.
Beyond that, the team also uncovered many uses for time-series forecasting across a wide spectrum of tasks.
In IT operations, for example, time-series machine learning can be used to forecast how much of a computational resource, such as memory or CPU, a service will consume. Such forecasts can help Salesforce improve capacity planning.
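As a rough illustration of forecasting for capacity planning, the sketch below fits a straight-line trend to past memory usage with ordinary least squares and extrapolates it. This is a stand-in technique chosen for brevity, not what Merlion or Salesforce uses; the function names, the usage figures, and the 25 GB capacity limit are all assumptions.

```python
def fit_linear_trend(values):
    """Least-squares fit of value = intercept + slope * index."""
    n = len(values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
             / sum((x - x_mean) ** 2 for x in xs))
    return slope, y_mean - slope * x_mean

def forecast(values, steps):
    """Extrapolate the fitted trend for the next `steps` intervals."""
    slope, intercept = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(steps)]

def steps_until_exceeds(values, capacity, max_steps=1000):
    """Number of future intervals until the trend first exceeds `capacity`,
    or None if that does not happen within `max_steps`."""
    for k, predicted in enumerate(forecast(values, max_steps), start=1):
        if predicted > capacity:
            return k
    return None

# Made-up memory usage in GB for one service, one reading per day.
memory_gb = [10.0, 12.0, 14.0, 16.0, 18.0]
print(forecast(memory_gb, 3))               # [20.0, 22.0, 24.0]
print(steps_until_exceeds(memory_gb, 25.0)) # 4
```

Real usage data rarely follows a straight line, so a production forecaster would also need to account for seasonality and noise; the point here is only how a forecast turns into a capacity decision.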
From idea to production use for Merlion
Having an idea for a machine learning library is one thing; having technology that actually works in a production environment is another.
Bhatnagar said that integration into production environments is a common challenge with any machine learning library. That includes making sure the tool can ingest the data it needs, can access the required compute resources, and can write its results back to the appropriate places.
To manage this challenge, Bhatnagar said, the Merlion project has added several default options that give users a good starting point. The project continues to simplify the whole workflow so that operations can be more automated.
A new open-source standard for time-series analysis
Merlion is not the first open-source project to tackle time-series analysis.
The Facebook-led Prophet project, which provides forecasting capabilities for time-series data, is among the most popular. According to Bhatnagar, Prophet covers only a subset of Merlion's capabilities, which span pre-processing, modeling, evaluation, and post-processing. This is why Salesforce decided to develop its own project and then open source it.
Merlion is an open-source project that can be used both internally at Salesforce and by anyone looking for a machine learning framework for time-series data.
According to Bhatnagar, there was no standardized solution that met all of the requirements people have for time-series analysis in one place. The team figured that such a solution would be enormously beneficial, not just to Salesforce but also to others facing time-series problems.