Towards A Domain-Customized Automated Machine Learning Framework For Networks and Systems
Clouds gather a vast volume of telemetry from their networked systems. This telemetry contains valuable information that can help solve many of the problems that continue to plague those systems. However, it is hard to extract useful information from such raw data. Machine Learning (ML) models are useful tools that enable operators either to leverage this data to solve such problems or to develop intuition about whether and how they can be solved. Building practical ML models is time-consuming and requires experts in both ML and networked systems to tailor the model to the system or network (a.k.a. "domain-customize" it). The number of applications we deploy exacerbates the problem. The speed with which our systems evolve, and with which monitoring systems are deployed or deprecated, means these models often need to be adapted to keep up. Today, the lack of individuals with both sets of expertise is becoming one of the bottlenecks for adopting ML in cloud operations. This paper argues that it is possible to build a domain-customized automated ML framework for networked systems that can help save valuable operator time and effort.