Radix page tables, as implemented in the x86-64 micro-architecture, incur a
penalty of four memory references on each TLB miss. The problem is aggravated in virtualized environments with nested page tables, where every page walk requires 24 memory references. The overhead of virtual memory on guest performance can approach 90\% in server or scientific applications.
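The figure of 24 references can be reconstructed with a short count, assuming four-level page tables in both the guest and the host (the symbols $n$ and $m$ are introduced here only for illustration). Each of the $n$ guest levels needs a host walk of $m$ references to locate the guest entry plus one reference to read it, and the final guest-physical address needs one more host walk of $m$ references:
\[
  n(m+1) + m \;=\; nm + n + m \;=\; 4\cdot 4 + 4 + 4 \;=\; 24 .
\]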
To mitigate the cost of TLB misses, hardware vendors have added MMU caches that store partial translations. Current MMU caches exploit the reuse of page table entries to accelerate native address translation. Extending those caches to support 2D page walks in virtualized systems would make the hardware more complex and more power-hungry.
We propose using hashed page tables for both native and virtualized systems. A recent study has concluded that hashed page tables increase the number of DRAM accesses per walk by over 400\%. However, we show that properly designed hashed page tables are superior even to radix page tables augmented with MMU caches. Our results indicate that hash-based page tables are particularly effective for virtualized systems and nested virtualization.