This article was first published at http://oliveryang.net. Any reuse of the content must include the original link.
Like other virtualization technologies, Hyper-V creates a virtual platform for its virtual machines. This article gives an overview of the virtual platform provided by Hyper-V and of how a Linux OS works in a guest VM. The content here is limited to Hyper-V, but some of the basic virtualization concepts may be similar or identical for other hypervisors.
Every hypervisor effectively defines a clear specification of the virtual hardware inventory provisioned for a guest VM. For Hyper-V, the Emulated and synthetic hardware specification for Windows Server 2012 Hyper-V is such a specification. To better understand it, a few key concepts need to be clarified first.
Today, most hypervisors provide two virtualization types:

Full Virtualization

If a VM works in this mode, an OS without any virtualization knowledge can run in the VM very well, just as it would on bare-metal hardware. With this virtualization type, applications running on a legacy OS such as old Linux, Windows, or DOS can easily be migrated to a VM.

Para Virtualization

When a VM works in this mode, the OS must have virtualization knowledge, which means the OS is modified to cooperate with the underlying hypervisor. The mechanism that allows a guest OS to cooperate with the hypervisor has a special name: the hypercall.
In the early days, Para Virtualization usually provided better performance than Full Virtualization on the x86 platform. However, after Intel and AMD introduced hardware-assisted virtualization extensions such as Intel VT-x and AMD-V, a Full Virtualization VM could perform better than a Para Virtualization VM for CPU- or memory-bound workloads. Today, the major performance overheads in Full Virtualization VMs come from IO-bound workloads.
For this reason, most guest OSes take a hybrid approach to get better performance: hardware-assisted Full Virtualization for CPU and memory, and Para Virtualization (PV drivers) for IO.
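As a quick illustration of this awareness, a modern Linux guest detects Hyper-V early at boot. The kernel log line below is only illustrative; the exact wording can vary by kernel version:

```bash
# Check whether the running kernel detected a hypervisor at boot (output is illustrative).
$ dmesg | grep -i hypervisor
[    0.000000] Hypervisor detected: Microsoft Hyper-V
```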
Hyper-V supports three types of devices in a guest VM:
Emulated Devices
These devices are emulated by Hyper-V and act like real hardware. That means drivers in the guest OS drive them exactly the same way they drive physical devices. A legacy OS without any Hyper-V knowledge would not work if emulated devices were not supported.
Emulated devices can be probed or detected by low-level tools and firmware based on well-known hardware/bus standards. For example, if a device is emulated as a PCI device, it will be detected by the lspci command in a Linux guest OS.
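For instance, in a Generation 1 Hyper-V guest the emulated IDE controller and the emulated NIC appear as ordinary PCI devices. The slot numbers and exact strings below are illustrative:

```bash
# Emulated devices show up on the PCI bus like real hardware (output is illustrative).
$ lspci
00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE
00:0a.0 Ethernet controller: Digital Equipment Corporation DECchip 21140 [FasterNet]
```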
Emulated devices usually have the worst performance of the three device types, because hardware emulation introduces a lot of overhead in the data path. When a physical driver accesses emulated device registers, each access results in a VM_Exit into the hypervisor's device emulation code. Because a VM_Exit costs more than a hypercall, and unmodified physical drivers do nothing to avoid these exits, emulated device performance can never be better than that of the other two device types.
Synthetic Devices
These are not emulated devices; no such hardware devices exist in the real world. Instead, they are created by Hyper-V and can only be driven by PV drivers in the guest OS.
There is no way to probe synthetic devices without a virtual bus driver in the guest OS. The virtual bus driver must provide an OS-independent, standardized protocol to probe the devices on the bus.
Synthetic devices usually perform better than emulated devices, because the virtual drivers in the guest can cooperate with the hypervisor to avoid much of the overhead of device emulation. When PV drivers operate synthetic devices, they use hypercalls to cooperate with the hypervisor, and a hypercall has less overhead than a VM_Exit, as mentioned above. Another important point is that PV drivers are purpose-built for virtualization and are fully optimized to avoid unnecessary hypercalls. These are the major reasons they outperform emulated devices.
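On a Linux guest with the VMbus driver loaded, these synthetic devices can be seen under sysfs rather than on a hardware bus such as PCI. The paths below assume a mainline kernel, and the exact attribute names can vary by kernel version:

```bash
# Synthetic devices hang off the VMbus, not PCI, so lspci does not see them.
$ ls /sys/bus/vmbus/devices/
# Each VMbus device is identified by GUID-style class and device IDs.
$ cat /sys/bus/vmbus/devices/*/class_id
```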
Direct Pass-through Devices
These are real hardware devices directly assigned to a VM. That means device drivers in the guest OS can drive the hardware the same way they would in a bare-metal environment. For good isolation and stability, only modern hardware with IOMMU support should be assigned to a VM; otherwise, buggy hardware or drivers could corrupt the memory of other VMs or of the hypervisor. For example, Hyper-V supports assigning an SR-IOV VF (Virtual Function) to a VM.
Direct pass-through devices have the best performance of the three device types, which is a major reason users want them. In this case, both VM_Exits and hypercalls are avoided in the main IO data path, so performance is as good as native device performance on a bare-metal OS. However, using direct pass-through devices in a VM can make live migration difficult, so users need to trade off VM performance against flexibility.
In July 2009, Microsoft submitted its Hyper-V drivers to the Linux kernel, and they were merged in Linux 2.6.32. That means that prior to Linux 2.6.32, Linux could only run in a Hyper-V VM in Full Virtualization mode, with poor performance. After Linux 2.6.32, Linux could run with Hyper-V PV (Para Virtualization) drivers, which deliver better performance.
Microsoft calls its PV drivers LIS (Linux Integration Services) drivers. The latest LIS driver code is always available in the Linux kernel mainline. However, Microsoft also makes sure that necessary LIS driver features and bug fixes are back-ported to older Linux distributions (for example, RHEL releases). The LIS backporting work is hosted as a separate open source project on GitHub.
In a Hyper-V VM, Linux matches physical drivers to devices exactly the same way it does on a bare-metal platform. Physical drivers are used for both emulated devices and direct pass-through devices.
The Emulated and synthetic hardware specification gives the full hardware inventory of a Hyper-V VM, so it is quite easy to find out which Linux physical drivers drive the emulated devices in a Linux guest.
For example, on Windows Server 2012, Hyper-V provides the following emulated devices for guest VMs, and their physical drivers can easily be identified by PCI vendor and device IDs:
| Emulated Device | Linux Physical Drivers |
|---|---|
| 4 IDE controllers (Intel 82371AB/EB/MB PIIX4 IDE) | ide-generic (legacy) driver or ata_piix (PATA/SATA) driver |
| Multiport DEC 21140 10/100TX 100 MB Ethernet network adapter | tulip driver |
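To confirm these bindings from inside the guest, lspci -nnk prints the PCI vendor:device IDs together with the kernel driver in use. The IDs below are the standard ones for these parts; the rest of the output is illustrative:

```bash
# -nn adds numeric vendor:device IDs, -k shows which kernel driver claimed each device.
$ lspci -nnk
00:07.1 IDE interface [0101]: Intel Corporation 82371AB/EB/MB PIIX4 IDE [8086:7111]
        Kernel driver in use: ata_piix
00:0a.0 Ethernet controller [0200]: Digital Equipment Corporation DECchip 21140 [FasterNet] [1011:0009]
        Kernel driver in use: tulip
```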
Please note that Linux has two different physical IDE drivers that can support the emulated IDE controller:
Legacy IDE driver (ide-generic)
This driver is not recommended on Hyper-V. When it is loaded, the IDE disks are named /dev/hd[a|b|c|d].
Intel PATA/SATA driver (ata_piix)
This driver is recommended if you don't want to use the PV driver. When it is loaded, the IDE disks are named /dev/sd[a|b|c|d]. ata_piix uses the same device names as SCSI disks because the libata library is part of the Linux SCSI stack, and it eventually works through the SCSI mid layer and the sd driver.
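A simple way to see which of the two ended up in control on a given guest is sketched below. The module names are the mainline ones, and on kernels with the Hyper-V storage driver the disks may actually be claimed by hv_storvsc instead:

```bash
# Check whether the legacy IDE driver or the libata ata_piix driver is loaded.
$ lsmod | grep -E 'ide_generic|ata_piix'
# The disk names tell the same story: /dev/hd* for the legacy driver, /dev/sd* for ata_piix.
$ ls /dev/hd* /dev/sd* 2>/dev/null
```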
PCI device pass-through support for Hyper-V guests was introduced in Linux 4.6. In this case, the driver binding for the passed-through PCI devices is exactly the same as on a bare-metal platform.
However, per this patch, PCI direct pass-through also needs PCI front-end driver support in the Linux guest OS. The PCI front-end driver creates the virtual PCI bus for the guest VM and works with the PCI back-end driver to handle IRQ and IOMMU mappings on the hypervisor side.
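On the guest side, this front-end support corresponds to the mainline pci-hyperv driver (CONFIG_PCI_HYPERV). A rough check of whether a given kernel has it might look like this:

```bash
# See whether this kernel was built with the Hyper-V PCI front-end driver.
$ grep CONFIG_PCI_HYPERV /boot/config-$(uname -r)
# If built as a module, it is loaded once a PCI device is passed through.
$ lsmod | grep pci_hyperv
```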
PV drivers are used to drive synthetic devices. The LIS project contributed a set of Hyper-V PV drivers to the Linux kernel. All PV drivers are actually children of the VMbus driver, which is responsible for providing a communication mechanism between the VSC (Virtual Service Client) in the guest VM and the VSP (Virtual Service Provider) in the hypervisor. This mechanism includes the event channel and ring-buffer APIs used by the other PV drivers. The VMbus driver also covers the Synthetic Interrupt Controller and TSC clock source handling.
Today, Linux is well optimized for Hyper-V, and the PV drivers play an important role in providing both performance and functionality for Linux guests. The table below gives a high-level overview of the LIS drivers and their basic functionality:
| PV Driver Name | Basic Functionality |
|---|---|
| Hyper-V VMbus driver | Communication mechanism between the VSC (Virtual Service Client) in the guest VM and the VSP (Virtual Service Provider) in the hypervisor |
| Hyper-V virtual storage driver | Guest-native IDE, SCSI, and FC storage protocol support |
| Hyper-V virtual network driver | Virtual NIC driver with offload features (Jumbo Frames, VLAN tagging and trunking) |
| Hyper-V PCI Frontend driver | PCI pass-through support: virtual PCI fabric setup in the guest VM, IRQ and IOMMU setup on the hypervisor side |
| Hyper-V Balloon driver | Memory ballooning: reclaims memory from Hyper-V guest VMs |
| Hyper-V Utilities driver | Heartbeat, time sync, KVP (Key Value Pair) exchange, live VM backup (VSS), file copy service, user/kernel transport |
| Hyper-V Synthetic Video Frame Buffer driver | Synthetic video device support for better graphics performance |
| Hyper-V Keyboard driver | Full keyboard support for the Linux guest, better keyboard mapping |
| Hyper-V mouse driver | Full mouse support for the Linux guest |
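On a running Linux guest, these drivers appear as kernel modules; in mainline they are named hv_vmbus, hv_storvsc, hv_netvsc, hv_balloon, hv_utils, hyperv_keyboard, hid_hyperv, and hyperv_fb, although the set actually loaded depends on the distribution and kernel configuration:

```bash
# List the Hyper-V PV driver modules currently loaded in the guest.
$ lsmod | grep -E 'hv_|hyperv'
```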
Overall, the major differences between physical drivers and PV drivers are:
Physical drivers drive emulated devices and direct pass-through devices, whereas the PV drivers drive synthetic devices. PV drivers for synthetic devices always have better performance and features than physical drivers for emulated devices.
Physical drivers are children of a physical hardware bus (IDE, PCI, etc.), whereas PV drivers are children of the VMbus, which is presented by the Hyper-V VMbus driver. PV drivers leverage the VMbus driver to communicate with the hypervisor from the Linux guest OS.
Because of these differences in the virtual hardware mechanisms, devices in Hyper-V VMs have some hard limitations.
Hyper-V on Windows Server 2012 only supports booting from IDE disks, not from SCSI disks or external SAN storage.
The major reason for this limitation is that Hyper-V on Windows Server 2012 doesn't implement an emulated SCSI controller for its guest VMs. That means that during the BIOS loading phase at VM boot, the BIOS has no way to probe SCSI disks, because they are not emulated devices; SCSI controllers are implemented as synthetic devices instead. At BIOS runtime, only IDE disks can be probed, using the standard IDE probing protocol.
After the Linux OS takes control from the BIOS, the Hyper-V VMbus driver and the virtual storage driver (hv_storvsc) get loaded. The hv_storvsc driver then has full knowledge to drive IDE disks, SCSI disks, and FC storage with their respective protocols.
For this reason, on Hyper-V for Windows Server, SCSI disks and FC storage can only be used for data, not for the OS rootfs. However, Hyper-V for Virtual Server supports booting from SCSI disks via an emulated SCSI controller, so this limitation does not apply there.
Another problem: in Linux there are three drivers that can control IDE disks, so how are conflicts avoided?
First of all, the legacy ide-generic and ata_piix drivers are controlled by different kernel Kconfig options at build time, or by boot options in grub. Most new Linux distributions disable the legacy IDE driver and use the ata_piix driver by default.
Second, the ata_piix code was changed to avoid conflicts when the Hyper-V virtual storage driver is enabled: on Hyper-V, ata_piix defers the emulated IDE disks to hv_storvsc by default.
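A rough way to see how a given guest kernel is set up is sketched below. The config file location and the prefer_ms_hyperv parameter name assume a mainline-style kernel that carries the ata_piix change; on other kernels the parameter may not exist:

```bash
# Check which IDE/ATA and Hyper-V storage options this kernel was built with.
$ grep -E 'CONFIG_(IDE|ATA_PIIX|HYPERV_STORAGE)=' /boot/config-$(uname -r)
# On kernels with the ata_piix change, this parameter controls whether
# emulated IDE disks are deferred to the Hyper-V PV storage driver.
$ cat /sys/module/ata_piix/parameters/prefer_ms_hyperv
```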
The last concern is IDE disk capacity. Hyper-V's IDE disk is 48-bit LBA capable, which allows you to connect large VHDs of up to 2040 GB.
PXE boot is a similar case to the boot disk. Hyper-V only supports one type of emulated NIC, and because PXE boot runs during the BIOS phase, Hyper-V can only PXE boot from a very old emulated NIC: the DEC 21140 10/100TX 100 MB Ethernet network adapter.
The Hyper-V virtual network driver can't be used here, as the VM BIOS runtime phase has no PV driver support.