Wednesday, 29.5.2019, 11:30
Electrical Eng. Building 861
Graph processing is typically considered a memory-bound rather than a compute-bound problem. A common line of thought is that more available memory bandwidth translates into better graph processing performance. In this talk, however, I will show that this is not necessarily the case. I will demonstrate that the key factor in how well graph algorithms utilize the memory system is not the raw bandwidth, or even the latency of memory requests, but rather the number of memory channels available to handle small data transfers with low locality.
This work was done in collaboration with James, Dr. Jeff Young, Dr. Jun Shirako, and Prof. David Bader.
Using several widely used graph frameworks, including Gunrock (on the GPU) and GAPBS & Ligra (for CPUs), we characterize two very distinct memory hierarchies with respect to key graph analytics kernels. Our results show that the differences in peak bandwidth among several Pascal-generation GPU memory subsystems are not reflected in the performance of various analytics. Furthermore, our experiments on CPU and Xeon Phi systems show that the number of memory channels utilized can be a decisive factor in performance across several different applications. On CPU systems with smaller thread counts, the memory channels can be underutilized, whereas systems with high thread counts can oversaturate the memory subsystem, which limits performance. Lastly, we model the performance of providing more channels with narrower access widths than those found in existing memory subsystems, and we analyze the trade-offs in terms of the two most prominent types of memory accesses found in graph algorithms: streaming and random accesses.
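To give a feel for the channel-count versus access-width trade-off, here is a minimal back-of-the-envelope sketch in Python. It is not the model presented in the talk. The function name `useful_bandwidth`, the fixed per-transfer time, and the two example configurations (8 channels of 64 bytes and 32 channels of 16 bytes) are all illustrative assumptions, chosen so that both configurations have the same peak bandwidth.

```python
from math import ceil


def useful_bandwidth(channels, width_bytes, request_bytes, transfer_ns=1.0):
    """Useful bytes delivered per nanosecond in a toy memory model.

    Assumptions (illustrative only, not from the talk):
      - every channel completes one transfer of `width_bytes` per `transfer_ns`
      - all channels are kept busy in parallel
      - each request needs `request_bytes` of useful data, and a transfer
        cannot be shared between two requests
    """
    # A request narrower than the channel still occupies one full transfer.
    transfers_per_request = ceil(request_bytes / width_bytes)
    requests_per_ns = channels / (transfers_per_request * transfer_ns)
    return requests_per_ns * request_bytes


# Two hypothetical configurations with identical peak bandwidth (512 B/ns).
wide = dict(channels=8, width_bytes=64)
narrow = dict(channels=32, width_bytes=16)

# Streaming access: long contiguous reads (4096 bytes per request).
stream_wide = useful_bandwidth(request_bytes=4096, **wide)
stream_narrow = useful_bandwidth(request_bytes=4096, **narrow)

# Random access: small reads with no locality (8 bytes per request).
random_wide = useful_bandwidth(request_bytes=8, **wide)
random_narrow = useful_bandwidth(request_bytes=8, **narrow)

print(f"streaming: wide={stream_wide} B/ns, narrow={stream_narrow} B/ns")
print(f"random:    wide={random_wide} B/ns, narrow={random_narrow} B/ns")
```

Under these assumptions, both configurations stream at the same rate, while the configuration with more, narrower channels delivers four times as many useful bytes for small random reads, because less of each transfer is wasted. Real memory systems add effects this sketch ignores, such as row-buffer behavior, queuing, and per-channel overheads.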
Dr. Oded Green is a Senior Graph Software Engineer with NVIDIA's AI infrastructure. Oded is also currently an Adjunct Research Scientist in Computational Sciences and Engineering at the Georgia Institute of Technology (Georgia Tech), where he also received his PhD. Oded received both his MSc in electrical engineering and his BSc in computer engineering from Technion – Israel Institute of Technology.
Oded's research primarily focuses on improving the performance and scalability of large-scale data analytics, with an emphasis on static & dynamic graph analytics. In recent years, Oded has also worked on designing and implementing efficient sorting algorithms for a wide range of accelerators, including GPUs. Oded is also very interested in architecture-algorithm codesign.