[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

However, if you pass through a device to a virtual machine, you can no longer
use that device on the host or in any other VM.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is a feature which also needs hardware support, there are
some requirements to check and preparations to be done to make it work.

Hardware
^^^^^^^^

Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping; this includes the CPU and the mainboard.

Generally, Intel systems with VT-d, and AMD systems with AMD-Vi support this.
But it is not guaranteed that everything will work out of the box, due
to bad hardware implementations and missing or low-quality drivers.

Further, server-grade hardware often has better support than consumer-grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.

Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need
to do some configuration to enable PCI(e) passthrough.

.IOMMU
The IOMMU has to be activated on the
xref:sysboot_edit_kernel_cmdline[kernel commandline].

The command line parameters are:

* for Intel CPUs:
+
----
intel_iommu=on
----
* for AMD CPUs:
+
----
amd_iommu=on
----

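
As a minimal sketch of how such a parameter could be appended on a host that
boots via GRUB, the helper below edits the `GRUB_CMDLINE_LINUX_DEFAULT` line of
a GRUB defaults file. The function name `add_kernel_param` is hypothetical, and
the sketch assumes GNU `sed` and a single double-quoted
`GRUB_CMDLINE_LINUX_DEFAULT` line. Hosts that do not boot via GRUB keep their
command line elsewhere; see the section linked above.

----
# Hypothetical helper: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in the given file, unless it is already present.
add_kernel_param() {
    file=$1
    param=$2
    # Already present as a separate word inside the quoted value? Then stop.
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=\"([^\"]* )?${param}( [^\"]*)?\"" "$file"; then
        return 0
    fi
    # Insert the parameter right before the closing quote (GNU sed assumed).
    sed -i -E "s/^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"/\1 ${param}\"/" "$file"
}
----

It could be called as `add_kernel_param /etc/default/grub intel_iommu=on`,
followed by `update-grub` so that the change is picked up on the next boot.
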
.Kernel Modules
You have to make sure the following modules are loaded. This can be achieved by
adding them to `/etc/modules`:

----
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
----

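
One possible way to add these entries without creating duplicates is sketched
below. The helper name `ensure_modules` is hypothetical; it appends each
module name to the given file only if no line with exactly that name exists
yet.

----
# Hypothetical helper: make sure every listed module name appears as its own
# line in the given file (for example /etc/modules).
ensure_modules() {
    file=$1
    shift
    for module in "$@"; do
        # -x matches whole lines only, -F treats the name as a fixed string.
        grep -qxF "$module" "$file" 2>/dev/null || echo "$module" >> "$file"
    done
}
----

For the list above this would be called as
`ensure_modules /etc/modules vfio vfio_iommu_type1 vfio_pci vfio_virqfd`.
Running it a second time leaves the file unchanged.
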
[[qm_pci_passthrough_update_initramfs]]
After changing anything related to kernel modules, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----

.Finish Configuration
Finally, reboot to bring the changes into effect and check that the IOMMU is
indeed enabled:

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

The output should show that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled; the exact message can vary depending on hardware and kernel.

It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with:

----
# find /sys/kernel/iommu_groups/ -type l
----

It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.

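
The raw `find` output can be hard to read on hosts with many devices. The
following sketch prints one line per device, sorted by group number. The helper
name `list_iommu_groups` is hypothetical, and `sort -V` assumes GNU coreutils.

----
# Hypothetical helper: print "group <N>: <PCI address>" for every device link
# below the given iommu_groups directory (default: the real sysfs path).
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob did not match: no groups present
        group=${dev#"$root"/}          # strip the leading directory
        group=${group%%/*}             # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done | sort -V
}
----

A device that shares its group only with its own functions shows up as
consecutive lines with the same group number. To see what a listed address
refers to, `lspci -nns ADDRESS` can be used.
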
.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card in another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
For this, add the following line to a file ending with `.conf` in
*/etc/modprobe.d/*:

----
options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

GPU Passthrough Notes
^^^^^^^^^^^^^^^^^^^^^

It is not possible to display the frame buffer of the GPU via NoVNC or SPICE on
the {pve} web interface.

When passing through a whole GPU or a vGPU and graphical output is wanted, one
has to either physically connect a monitor to the card, or configure remote
desktop software (for example, VNC or RDP) inside the guest.

If you want to use the GPU as a hardware accelerator, for example, for
programs using OpenCL or CUDA, this is not required.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most used variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.

Host Configuration
^^^^^^^^^^^^^^^^^^

In this case, the host must not use the card. There are two methods to achieve
this:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/* where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----

* blacklist the driver completely on the host, ensuring that it is free to bind
for passthrough, with
+
----
blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/*.

For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.

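
For the first method, the vendor and device IDs can be collected from the
`lspci -nn` output by hand, or with a small filter such as the sketch below.
The helper name `vfio_ids_line` is hypothetical. It assumes that the
vendor:device pair is the last bracketed `xxxx:xxxx` group on each line, which
is how `lspci -nn` normally prints it.

----
# Hypothetical helper: read `lspci -nn` lines on stdin and print a matching
# "options vfio-pci ids=..." line for a modprobe .conf file.
vfio_ids_line() {
    # The greedy ".*" makes sed capture the LAST [xxxx:xxxx] group per line,
    # skipping class codes such as [0300] that contain no colon.
    ids=$(sed -nE 's/.*\[([0-9a-f]{4}:[0-9a-f]{4})\].*/\1/p' | LC_ALL=C sort -u | paste -sd, -)
    if [ -n "$ids" ]; then
        printf 'options vfio-pci ids=%s\n' "$ids"
    fi
}
----

For a card at the placeholder address `01:00`, the result could be reviewed with
`lspci -nn -s 01:00 | vfio_ids_line` and then written to a .conf file in
*/etc/modprobe.d/*.
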
.Verify Configuration
To check if your changes were successful, you can use

----
# lspci -nnk
----

and check your device entry. If it says

----
Kernel driver in use: vfio-pci
----

or the 'in use' line is missing entirely, the device is ready to be used for
passthrough.

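
The same check can be scripted. The sketch below reads the `lspci -nnk` output
for a single device and applies the rule described above; the helper name
`passthrough_ready` is hypothetical.

----
# Hypothetical helper: read `lspci -nnk` output for ONE device on stdin and
# report whether the device looks ready for passthrough.
passthrough_ready() {
    driver=$(awk -F': ' '/Kernel driver in use/ { print $2; exit }')
    case "$driver" in
        ''|vfio-pci)
            echo "ready"                  # unbound, or bound to vfio-pci
            ;;
        *)
            echo "in use by $driver"      # a host driver still owns the device
            return 1
            ;;
    esac
}
----

With the placeholder address `01:00.0`, it could be used as
`lspci -nnk -s 01:00.0 | passthrough_ready`.
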
[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^

To pass through the device you need to set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

If your device has multiple functions (e.g., `00:02.0` and `00:02.1`),
you can pass them through all together with the shortened syntax `00:02`.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example
An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----

Other considerations
^^^^^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have an EFI capable ROM, otherwise use SeaBIOS instead.

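
As an illustration only, a VM configuration that follows these recommendations
might contain entries like the ones below. The `machine` and `bios` keys are
assumed here as the VM options that select 'q35' and 'OVMF', and the PCI
address is a placeholder. An 'OVMF' guest usually also needs an EFI disk, which
is left out of this sketch.

----
machine: q35
bios: ovmf
hostpci0: 02:00,pcie=on,x-vga=on
----
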
SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VFs' (**V**irtual **F**unctions) to the
system. Each of those 'VFs' can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use case for this is NICs (**N**etwork
**I**nterface **C**ards) with SR-IOV support, which can provide multiple VFs per
physical port. This allows features such as checksum offloading to be used
inside a VM, reducing the (host) CPU overhead.

Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* sometimes there is an option for the driver module, e.g. for some
Intel drivers
+
----
max_vfs=4
----
+
which could be put in a file with the '.conf' ending under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using the `sysfs`.
If a device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0 execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent you can use the `sysfsutils` Debian package.
After installation configure it via */etc/sysfs.conf* or a `FILE.conf` in
*/etc/sysfs.d/*.

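
The `sysfs` approach can be wrapped in a small guard, sketched below under two
assumptions: the device exposes `sriov_totalvfs` next to `sriov_numvfs`, and
the VF count has to be reset to 0 before a different non-zero value is
accepted. The helper name `set_numvfs` is hypothetical.

----
# Hypothetical helper: set the number of VFs for a PCI device directory such as
# /sys/bus/pci/devices/0000:01:00.0, refusing values above sriov_totalvfs.
set_numvfs() {
    dev_dir=$1
    want=$2
    total=$(cat "$dev_dir/sriov_totalvfs") || return 1
    if [ "$want" -gt "$total" ]; then
        echo "device supports at most $total VFs" >&2
        return 1
    fi
    # Reset to 0 first, then write the requested count.
    echo 0 > "$dev_dir/sriov_numvfs"
    echo "$want" > "$dev_dir/sriov_numvfs"
}
----

For the example device this would be
`set_numvfs /sys/bus/pci/devices/0000:01:00.0 4`. A persistent `sysfsutils`
entry for the same setting should look like
`bus/pci/devices/0000:01:00.0/sriov_numvfs = 4`, with the path given relative
to */sys*.
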
VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
listing them with `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].

Other considerations
^^^^^^^^^^^^^^^^^^^^

For this feature, platform support is especially important. It may be necessary
to enable this feature in the BIOS/EFI first, or to use a specific PCI(e) port
for it to work. When in doubt, consult the manual of the platform or contact its
vendor.

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are most commonly found in
virtualized GPU setups such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and are as such only suited for use in virtual machines.

Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support that feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core Processors, as well as E3 v4, E3
v5 and E3 v6 Xeon Processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to enable it on the
xref:sysboot_edit_kernel_cmdline[kernel commandline] by adding the following
parameter:

----
i915.enable_gvt=1
----

After that remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`],
and reboot your host.

VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported devices via the 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0' you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files:

* 'available_instances' contains the number of instances of this type that are
still available; each 'mdev' in use by a VM reduces this.
* 'description' contains a short description of the capabilities of the type.
* 'create' is the endpoint to create such a device. {pve} does this
automatically for you if a 'hostpciX' option with `mdev` is configured.

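
To get an overview of all types at once, the files described above can be read
in a loop. The following is a minimal sketch; the helper name
`list_mdev_types` is hypothetical, and it only reads 'available_instances'.

----
# Hypothetical helper: print each mdev type below the given
# mdev_supported_types directory together with its free instance count.
list_mdev_types() {
    base=$1
    for type_dir in "$base"/*/; do
        [ -d "$type_dir" ] || continue     # glob did not match: no mdev types
        name=$(basename "$type_dir")
        avail=$(cat "$type_dir/available_instances" 2>/dev/null || echo '?')
        printf '%s\tavailable=%s\n' "$name" "$avail"
    done
}
----

For the device used above it would be called as
`list_mdev_types /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types`.
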
Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.

ifdef::wiki[]

See Also
~~~~~~~~

* link:/wiki/Pci_passthrough[PCI Passthrough Examples]

endif::wiki[]