Technical Report MSC-2019-15

TR#: MSC-2019-15
Class: MSC
Title: An OS page cache for heterogeneous systems
Authors: Tanya Brokhman
Supervisors: Mark Silberstein
PDF: Currently accessible only within the Technion network
Abstract: Efficient access to files from GPUs is of growing importance in data-intensive applications. Unfortunately, current OS design cannot provide core system services to GPU kernels, such as efficient access to memory-mapped files, nor can it optimize I/O performance for CPU applications sharing files with GPUs. To mitigate these limitations, much tighter integration of GPU memory into the OS page cache and file I/O mechanisms is required. Achieving such integration is one of the primary goals of this thesis.

We propose a principled approach to integrating GPU memory with an OS page cache. GAIA extends the CPU OS page cache to the physical memory of accelerators, enabling the CPU OS to seamlessly manage a distributed page cache that spans CPU and GPU memories. We adopt a variation of the CPU-managed lazy release consistency shared memory model while maintaining compatibility with unmodified CPU programs. We highlight the main hardware and software interfaces needed to support this architecture, and show a number of optimizations, such as tight integration with the OS prefetcher, to achieve efficient peer-to-peer caching of file contents. GAIA enables the standard mmap system call to map files into the GPU address space, thereby enabling data-dependent GPU accesses to large files and efficient write-sharing between the CPU and GPUs. Under the hood, GAIA:
1. Integrates lazy release consistency among physical memories into the OS page cache while maintaining backward compatibility with CPU processes and unmodified GPU kernels.
2. Improves CPU I/O performance by using data cached in GPU memory.
3. Optimizes the readahead prefetcher to support accesses to caches in GPUs.
We prototype GAIA in Linux and evaluate it on NVIDIA Pascal GPUs. We show up to 3× speedup in CPU file I/O and up to 8× in unmodified realistic workloads such as Gunrock GPU-accelerated graph processing, image collage, and microscopy image stitching.
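
The abstract's mention of mmap and data-dependent accesses to large files can be illustrated with a minimal sketch. It runs entirely on the CPU using Python's standard mmap module and does not show GAIA's GPU-side interface, which the abstract does not describe. The names RECORD, write_demo_file, and follow_chain are hypothetical and exist only for this illustration. The point is the access pattern: each read decides which offset is read next, so only the parts of the file that are actually touched need to be brought into the page cache.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: one little-endian 32-bit index per record.
RECORD = struct.Struct("<I")


def write_demo_file(next_indices):
    """Write a small file in which record i stores the index of the next record."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for idx in next_indices:
            f.write(RECORD.pack(idx))
    return path


def follow_chain(path, start, max_hops):
    """Follow indices through a memory-mapped file (a data-dependent access pattern)."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
        visited = [start]
        current = start
        for _ in range(max_hops):
            # The offset of the next read depends on the value just read.
            (current,) = RECORD.unpack_from(view, current * RECORD.size)
            visited.append(current)
        return visited


if __name__ == "__main__":
    demo_path = write_demo_file([2, 0, 1])
    try:
        print(follow_chain(demo_path, 0, 3))
    finally:
        os.remove(demo_path)
```

With an ordinary read-based interface the program would have to issue an explicit read for each hop or load the whole file up front; with a mapping, the OS page cache serves the accesses on demand. The abstract states that GAIA makes this same mmap model available to GPU code.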

Copyright: The above paper is copyright by the Technion, Author(s), or others. Please contact the author(s) for more information.

Remark: Any link to this technical report should be to this page (http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-info.cgi/2019/MSC/MSC-2019-15), rather than to the URL of the PDF files directly. The latter URLs may change without notice.

Computer science department, Technion