Elad Gidron, M.Sc. Thesis Seminar
Emerging computer architectures pose many new challenges for software development. First, as the number
of computing elements constantly increases, the scalability of parallel programs becomes increasingly
important. Second, memory access has become the principal bottleneck, and multi-CPU systems
are based on NUMA architectures, where memory access from different chips is asymmetric. Therefore,
it is important to design software with local data access, cache-friendliness, and reduced contention
on shared memory locations, especially across chips.
In our work we focus on two problems:
1. We design and implement a scalable and highly efficient non-blocking producer-consumer task pool,
with lightweight synchronization-free operations in the common case. Its data allocation scheme is
cache-friendly and highly suitable for NUMA environments. Moreover, our pool is robust in the face
of imbalanced loads and unexpected thread stalls.
2. We study how to improve metadata locality in word-based STMs. To this end, we evaluate
a locality-conscious approach for maintaining versioned locks in TL2. The improved algorithm
achieves speedups of up to 100% on STAMP benchmarks. We show that this speedup stems from the
following factors: 1) improved spatial and temporal locality, 2) reduced false sharing, and 3) less