Charliecloud's layer-free, Git-based container build cache

08/31/2023
by   Reid Priedhorsky, et al.
0

A popular approach to deploying scientific applications in high performance computing (HPC) is Linux containers, which package an application and all its dependencies as a single unit. This image is built by interpreting instructions in a machine-readable recipe, which is faster with a build cache that stores instruction results for re-use. The standard approach (used e.g. by Docker and Podman) is a many-layered union filesystem, encoding differences between layers as tar archives. Our experiments show this performs similarly to layered caches on both build time and disk usage, with a considerable advantage for many-instruction recipes. Our approach also has structural advantages: better diff format, lower cache overhead, and better file de-duplication. These results show that a Git-based cache for layer-free container implementations is not only possible but may outperform the layered approach on important dimensions.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset