Tuesday, 27.3.2018, 10:30
Large-scale storage systems lie at the heart of the big data revolution.
As these systems grow in scale and capacity, their complexity grows
accordingly, building on new storage media, hybrid memory hierarchies,
and distributed architectures. Numerous layers of abstraction hide this
complexity from the applications, but also hide valuable information
that could improve the system's performance considerably.

I will demonstrate how to bridge this semantic gap in the context of
erasure codes, which are used to guarantee data availability and
durability. Current theoretical research efforts focus on codes that
will reduce the storage, network, and compute overheads of the systems
that use them, without sacrificing their reliability. However, the
semantic gap makes it difficult to observe the theoretical benefit of
the resulting codes in real implementations. I will follow the example
of regenerating and locally recoverable codes, showing the key
challenges in applying optimal erasure codes to real systems, and how
they can be addressed. This part is based on joint work with Matan
Liram, Oleg Kolosov, Eitan Yaakobi, Itzhak Tamo and Alexander Barg.

I will then briefly describe the challenges introduced by the semantic
gap in other layers of the "storage stack", and my experience in
addressing them. I will refer to the memory hierarchy, flash-based
solid-state drives, workload analysis, and aspects of data security.

Gala Yadgar is a senior researcher in the Computer Science Department at
the Technion, where she received her Ph.D. in 2012, and a researcher in
the Department of Systems in the School of Electrical Engineering of Tel
Aviv University. She is an associate editor of ACM Transactions on
Storage, and serves on the program committees of SYSTOR, MSST, FAST, and
USENIX ATC. Her research is directed at methods for improving
performance and reliability of storage in large-scale data centers,
focusing on complex hierarchies and enhanced storage interfaces.