Abstract:
Kurzweil says computers will enable people to live forever, and doctors
will be backing up your memories by late 2030. This talk is not
about that, yet.
Instead, the remarkable drop in disk costs makes it possible and
attractive to retain past application states and store them for a long
time for mining or auditing.
A still-open question is how best to organize past-state storage.
Split snapshots are a recent approach to past state storage that is
attractive for several reasons. Split snapshots are persistent, can be
taken at high frequency, and are transactionally consistent.
Unmodified database code can run against them.
Unlike any other past-state storage approach, they provide low-cost
discriminated garbage collection of snapshots, a useful capability in
long-lived systems, since indiscriminately keeping all snapshots
accessible becomes impractical over time even if raw disk storage is cheap:
administering such large-volume storage is expensive over long
durations.
A number of novel techniques underlie split snapshots.
A new in-memory data structure creates consistent copy-on-write snapshots
without blocking, a new persistent data structure provides
high-performance versioned metadata, and a new snapshot storage organization
allows selected copy-on-write snapshots to be garbage collected gradually
without creating disk fragmentation and without copying.
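For readers unfamiliar with copy-on-write snapshots, the general idea can be sketched in a few lines of Python. This is a toy in-memory illustration under stated assumptions, not the split snapshot design presented in the talk: the class `PageStore` and its methods are hypothetical names, pages are plain strings, and persistence, transactions, and non-blocking behavior are all left out.

```python
class PageStore:
    """Toy page store with copy-on-write snapshots (illustrative only)."""

    def __init__(self, pages):
        self.pages = dict(pages)   # current state: page id -> contents
        self.snapshots = []        # per snapshot: page id -> saved pre-state

    def declare_snapshot(self):
        """Declare a snapshot; no page is copied at this point."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, page_id, data):
        """Update a page, saving its pre-state the first time it is
        modified after the most recent snapshot."""
        if self.snapshots:
            latest = self.snapshots[-1]
            if page_id not in latest:
                latest[page_id] = self.pages.get(page_id)
        self.pages[page_id] = data

    def read_snapshot(self, snap_id, page_id):
        """Read a page as of snapshot snap_id: use the first pre-state
        saved at or after that snapshot, else the current page."""
        for saved in self.snapshots[snap_id:]:
            if page_id in saved:
                return saved[page_id]
        return self.pages.get(page_id)
```

Note that in this naive scheme an older snapshot may depend on pre-states saved under a later one, so a snapshot cannot simply be dropped without affecting its predecessors. That dependency is one reason selective garbage collection of snapshots is nontrivial.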
Measurements of a split snapshot prototype system indicate that the new
techniques are efficient and scalable, imposing a minimal ($4\%$) performance
penalty on a storage system under expected common workloads.
Bio:
Liuba Shrira is a Professor of Computer Science at Brandeis University and
doubles as a Research Affiliate at MIT/CSAIL.
She received her B.Sc./M.Sc./D.Sc. from the Computer Science Department at
the Technion, what seems like yesterday,
and is always delighted to visit her alma mater.