Arthur Kiyanovski, M.Sc. Thesis Seminar
Wednesday, 28.6.2017, 13:00
Machine virtualization has grown in popularity in recent years with the growth of cloud
computing. Virtual machines use virtual I/O devices to perform their I/O. Nowadays,
paravirtual I/O devices are the most popular type of virtual I/O device due to their
high performance and interposition capabilities. However, paravirtual I/O devices also
have disadvantages. Users need to install device drivers for paravirtual devices whenever
they switch hypervisors, and hypervisor providers need to implement device drivers for
all operating systems.
Emulated I/O devices also allow interposition, and do not have the disadvantages
of paravirtual I/O devices as they are designed to work with the device drivers of the
physical devices they emulate. These device drivers come preinstalled in all major
operating systems, which makes the task of switching hypervisors much easier for the
users. And since the device drivers have already been written for the physical devices
being emulated, hypervisor providers need not implement device drivers for emulated
devices. However, emulated I/O devices achieve substantially lower performance than
paravirtual ones, which makes them unusable in many real-world scenarios.
Previous work states that the main reason for the performance difference between
paravirtual and emulated I/O devices is the larger number of exits caused by the latter.
To test this claim we created a model that estimates the maximum possible throughput
that can be achieved by QEMU’s emulated e1000 NIC, when taking the throughput
of the paravirtual virtio-net NIC and adding the overhead of e1000’s extra exits. This
model predicts a throughput difference of only 1.13X in favor of virtio-net, far
from the 20X throughput difference observed in practice. This result led us to
search for reasons other than exits that could explain this difference.
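
The reasoning behind such a model can be sketched roughly as follows. This is
only an illustration under our own assumptions, not the model used in the
thesis: it assumes that every e1000 packet costs the same time as a virtio-net
packet plus a fixed cost for each additional exit. The function names,
parameters and the example values are hypothetical and are not measurements
from this work.

```python
def predicted_e1000_throughput(virtio_pps, extra_exits_per_packet, exit_cost_s):
    """Upper bound on e1000 throughput (packets/s) under the stated assumption.

    virtio_pps             -- measured virtio-net throughput in packets/s
    extra_exits_per_packet -- exits e1000 causes beyond virtio-net, per packet
    exit_cost_s            -- assumed cost of a single exit, in seconds
    """
    per_packet_virtio = 1.0 / virtio_pps
    # Assumption: extra exits add a fixed, serialized delay to every packet.
    per_packet_e1000 = per_packet_virtio + extra_exits_per_packet * exit_cost_s
    return 1.0 / per_packet_e1000


def throughput_ratio(virtio_pps, extra_exits_per_packet, exit_cost_s):
    """Predicted virtio-net / e1000 throughput ratio (1.0 means no gap)."""
    e1000_pps = predicted_e1000_throughput(
        virtio_pps, extra_exits_per_packet, exit_cost_s
    )
    return virtio_pps / e1000_pps


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the formula.
    print(throughput_ratio(200_000.0, 2.0, 1.25e-6))
```

Under this assumption the ratio simplifies to 1 + (extra exits per packet) x
(exit cost) x (virtio-net throughput), so a gap much larger than the one such a
model predicts points to costs other than exits.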
In this work we present differences between QEMU’s virtio-net and e1000, other than
exits, that we found to contribute to the throughput gap between the two. For each
difference we propose an improvement to e1000, inspired by virtio-net’s implementation.
We then use the sidecore paradigm to eliminate some of the exits caused by e1000 and
further improve its throughput. We were able to reduce the throughput gap between
virtio-net and e1000 to 1.2X when the guest runs on a single core and to 1.25X on
a dual core with our sidecore. Our results show that emulated I/O devices can achieve
performance that is close to that of paravirtual ones, which might make emulated devices
the better choice when flexibility is more important than peak performance.