Machine Learning Containers are Bloated and Vulnerable

12/16/2022
by   Huaifeng Zhang, et al.
0

Today's software is bloated leading to significant resource wastage. This bloat is prevalent across the entire software stack, from the operating system, all the way to software backends, frontends, and web-pages. In this paper, we study how prevalent bloat is in machine learning containers. We develop MMLB, a framework to analyze bloat in machine learning containers, measuring the amount of bloat that exists on the container and package levels. Our tool quantifies the sources of bloat and removes them. We integrate our tool with vulnerability analysis tools to measure how bloat affects container vulnerabilities. We experimentally study 15 machine learning containers from the official Tensorflow, Pytorch, and NVIDIA container registries under different tasks, (i.e., training, tuning, and serving). Our findings show that machine learning containers contain bloat encompassing up to 80% of the container size. We find that debloating machine learning containers speeds provisioning times by up to 3.7× and removes up to 98% of all vulnerabilities detected by vulnerability analysis tools such as Grype. Finally, we relate our results to the larger discussion about technical debt in machine learning systems.

READ FULL TEXT

page 3

page 11

research
08/13/2021

VulnEx: Exploring Open-Source Software Vulnerabilities in Large Development Organizations to Understand Risk Exposure

The prevalent usage of open-source software (OSS) has led to an increase...
research
08/27/2021

A Comparative Study of Vulnerability Reporting by Software Composition Analysis Tools

Background: Modern software uses many third-party libraries and framewor...
research
03/12/2022

Characterizing and Understanding Software Security Vulnerabilities in Machine Learning Libraries

The application of machine learning (ML) libraries has been tremendously...
research
09/18/2020

On the Threat of npm Vulnerable Dependencies in Node.js Applications

Software vulnerabilities have a large negative impact on the software sy...
research
01/11/2021

Understanding the Quality of Container Security Vulnerability Detection Tools

Virtualization enables information and communications technology industr...
research
08/11/2020

Code-based Vulnerability Detection in Node.js Applications: How far are we?

With one of the largest available collection of reusable packages, the J...
research
01/25/2023

Beware of the Unexpected: Bimodal Taint Analysis

Static analysis is a powerful tool for detecting security vulnerabilitie...

Please sign up or login with your details

Forgot password? Click here to reset