Skip to content (access key 's')
Logo of Technion
Logo of CS Department
Logo of CS4People
Events

The Taub Faculty of Computer Science Events and Talks

Exploring Advanced Cache Algorithms for the TLB
event speaker icon
Ella Sheory (M.Sc. Thesis Seminar)
event date icon
Tuesday, 04.04.2023, 14:30
event location icon
Taub 301
event speaker icon
Advisor: Prof. Dan Tsafrir
The translation lookaside buffer (``TLB’’) is a small cache that accelerates virtual to physical address translation, which processors typically manage with variants of the least recently used (``LRU’’) algorithm. Although LRU is simple, it is suboptimal for some workloads. Our analysis shows that if the processor uses the optimal---but impractical---Belady algorithm instead of LRU, runtime improves by up to 15% (and 5% on average) over LRU in single-thread (``ST’’) mode. Runtime further improves by up to 23% (yet still 5% on average) in simultaneous multithreading (``SMT’’) mode, where the TLB is competitively shared between two threads. Given this background, we observe that while past research developed various practical caching algorithms to outperform LRU, such research was typically evaluated in the context of regular data caches rather than the TLB. Our goal in this study is therefore to investigate whether advanced practical caching algorithms improve TLB performance and, if so, to what extent they can approach Belady performance. To this end, we classify existing caching algorithms into three groups based on their additional storage requirements: small (less than 10% of the LRU-managed TLB), medium (between 10% and 50%), and high (more than 50%). From each group, we select a representative algorithm. We find that the representatives of the small and medium groups improve performance by only 1%, on average (and no more than 11%). We also find that the third algorithm's complexity and sophistication are unjustified, as we can achieve the aforementioned optimal Belady improvement by handing the associated additional storage to the simpler LRU baseline. We thus conclude that no existing practical caching algorithm can meaningfully improve LRU TLB performance.