Abstract:
The use of clusters and grids as high-capability and high-capacity computers
is rapidly growing in industry, academia, and government. This
growth is accompanied by fast-paced progress in cluster-aware hardware,
and in particular in interconnection technology. Contemporary networks
offer not only excellent performance as expressed by latency and
bandwidth, but also advanced architectural features, such as
programmable network interface cards, hardware support for collective
communication operations, and support for modern communication standards
and mechanisms such as MPI and RDMA.
These network mechanisms pave the way to advances in system software for
large-scale clusters and grids. Such machines are typically composed of
loosely-coupled independent compute nodes, each running a local
operating system such as Linux. Such node-local software is inadequate for many
large-scale system tasks, such as resource management, job scheduling,
and fault tolerance.
Our research at Los Alamos National Laboratory has focused on leveraging
the features of modern interconnects to address these issues from a
global, cohesive view. As part of this work, we have implemented two
novel job scheduling algorithms that make use of advanced collective
communication capabilities. We have also implemented some of the more
traditional job scheduling algorithms, and compared the performance of
these algorithms in several scenarios and cluster architectures. This
talk presents an overview of these job scheduling algorithms and the
main experimental results. In particular, we show how issues such as
load-imbalance and resource overlapping can be addressed by novel
job-scheduling techniques.
Joint work with Dror Feitelson (HUJI) and Fabrizio Petrini (LANL).