Cooperative Cache Optimization for HPC Using Binary Tree Overlay with Linux FUSE

Session Number

CMPS 02

Advisor(s)

Kevin Harms, Argonne National Laboratory

Discipline

Computer Science

Start Date

17-4-2024 10:45 AM

End Date

17-4-2024 11:00 AM

Abstract

High-performance computing (HPC) applications often suffer performance degradation from contention on storage servers when many compute nodes access the same small files. This work proposes a cooperative cache layer that alleviates this bottleneck by funneling I/O requests through a specialized service on a single node, which then distributes the results to the other nodes via a binary tree overlay network.
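The distribution pattern described above can be illustrated with a minimal sketch. It is not the project's implementation: the heap-style rank numbering (children of rank `i` at `2i + 1` and `2i + 2`), the function names `children`, `parent`, and `distribute`, and the in-memory `caches` dictionary are all assumptions made for illustration. The sketch only shows that one node reads from storage once and every other node receives the data from its parent in the tree.

```python
from collections import deque


def children(rank, n_nodes):
    """Ranks of the (up to two) children of `rank` in a heap-ordered binary tree."""
    return [c for c in (2 * rank + 1, 2 * rank + 2) if c < n_nodes]


def parent(rank):
    """Rank of the parent node, or None for the root (rank 0)."""
    return None if rank == 0 else (rank - 1) // 2


def distribute(path, n_nodes, read_from_storage):
    """Simulate one cooperative read (hypothetical helper).

    Only the root (rank 0) calls `read_from_storage`; every other rank
    receives the data from its parent. Returns the per-rank caches and
    the number of forwarding hops needed to reach the deepest rank.
    """
    data = read_from_storage(path)  # the single storage access
    caches = {0: data}
    depth = {0: 0}
    queue = deque([0])
    while queue:
        rank = queue.popleft()
        for child in children(rank, n_nodes):
            caches[child] = caches[rank]  # stands in for a network send
            depth[child] = depth[rank] + 1
            queue.append(child)
    return caches, max(depth.values())


if __name__ == "__main__":
    storage_reads = []

    def fake_read(path):
        # Stand-in for a read against the shared storage server.
        storage_reads.append(path)
        return b"file contents"

    caches, hops = distribute("/shared/lib/module.py", 8, fake_read)
    print(len(storage_reads), len(caches), hops)
```

In this sketch the storage server sees one read regardless of the node count, and the number of forwarding hops grows with the depth of the tree rather than with the number of nodes.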

The objectives are to reduce load on storage servers, increase the cache hit ratio, and minimize network traffic. The study outlines the implementation, an evaluation using benchmarks and real-world applications, and a comparison with alternative approaches. Progress so far includes a successful adaptation of Linux FUSE, development of the CuFUSE translation program, and initial development of the caching program. Future work involves testing to refine efficiency and performance. Despite challenges, the study aims to optimize HPC storage systems, with implications for a range of applications, including loading Python modules in high-performance environments.

