Yuval Shimron, M.Sc. Thesis Seminar
Wednesday, 15.6.2011, 14:00
In dealing with the container bloat problem, we identify five memory compaction techniques,
which can be used to reduce the footprint of the small objects that make these containers.
Using these techniques, we describe two alternative methods for more efficient encoding of
the JRE's ubiquitous HashMap data structure, and present a mathematical model in which the
footprint of this can be analyzed.
First of this is our fused-hashing encoding method, which reduces memory overhead by 20%-40%
on a 32-bit environment and 45%-65% on a 64-bit environment. This encoding guarantees
these figures as lower bound regardless of the distribution of keys in hash buckets.
Second, our more opportunistic squashed-hashing, achieves expected savings of 25%-70% on a
32-bit environment and 30%-75% on a 64-bit environments, but these savings can degrade and
are not guaranteed against bad (and unlikely) distribution of keys to buckets.
Timing results indicate that squashed hashing generally improves the SPECjbb2005 benchmark,
but some of the operations, particularly removals and iterations are slow.
We still find squashed hashing is faster then the baseline implementation for large tables
indexed by Integers.
We note that with the use of our compaction techniques, there is merit in an implementation
of HashSet which is distinct from that of HashMap.
For TreeMap we show two encodings which reduces the overhead of tree nodes by ~43%, ~46%
on a 32-bit environment and ~55%, ~73% on a 64-bit environment. These compactions also
give a reason to separating the implementation of TreeSet from that of TreeMap.
A separete implementation is expected to increase the footprint reduction to ~59%, ~54%
on a 32-bit environment and ~61%, ~75% on a 64-bit environment.