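This article summarizes how to configure NIC passthrough and SR-IOV when CentOS KVM is the host and a CentOS guest runs VPP.
Configuring NIC passthrough
1. On the host, edit the GRUB configuration to add intel_iommu=on to the kernel command line (the GRUB_CMDLINE_LINUX line in /etc/default/grub), regenerate grub.cfg, then reboot the host.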
[root@kvm-02 ~]# vi /etc/default/grub
[root@kvm-02 ~]# grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-3.10.0-693.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-693.el7.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-20fe28cd4f4b4fa1b140c6a72d10ae05
Found initrd image: /boot/initramfs-0-rescue-20fe28cd4f4b4fa1b140c6a72d10ae05.img
done
(Note: which grub.cfg to regenerate depends on the boot mode: /boot/grub2/grub.cfg for legacy BIOS, /boot/efi/EFI/centos/grub.cfg for UEFI.)
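The choice between the two grub.cfg paths can be scripted. The sketch below assumes that /sys/firmware/efi exists only on a UEFI boot; the function name grub_cfg_path and its optional directory argument are my own and exist only so the logic can be tried without rebooting.

```shell
# Print the grub.cfg path that matches the current boot mode.
# Assumption: the kernel exposes /sys/firmware/efi only when booted via UEFI.
# The optional argument is a hypothetical stand-in for that directory.
grub_cfg_path() (
    efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "$efi_dir" ]; then
        echo /boot/efi/EFI/centos/grub.cfg
    else
        echo /boot/grub2/grub.cfg
    fi
)

# Usage (as root, after editing /etc/default/grub):
#   grub2-mkconfig -o "$(grub_cfg_path)"
```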
[root@kvm-02 ~]# reboot
After the host has come back up, check the kernel command line:
[root@kvm-02 ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-693.el7.x86_64 root=/dev/mapper/centos-root ro crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap intel_iommu=on isolcpus=20-23 nohz_full=20-23 rcu_nocbs=20-23 nmi_watchdog=0 selinux=0 intel_pstate=disable nosoftlockup rhgb quiet
[root@kvm-02 ~]#
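Instead of reading the long command line by eye, the check can be done with a small helper. This is only a sketch; has_kernel_param is a name I made up, and the second argument is there so a saved command line can be tested as well as the running one.

```shell
# Succeed if a kernel command line contains the given parameter as a
# whole word. The second argument defaults to the running kernel's
# command line from /proc/cmdline.
has_kernel_param() (
    set -f                                   # no glob expansion while splitting
    param="$1"
    cmdline="${2:-$(cat /proc/cmdline)}"
    for word in $cmdline; do
        [ "$word" = "$param" ] && exit 0
    done
    exit 1
)

# Usage on the host:
#   has_kernel_param intel_iommu=on && echo "IOMMU parameter is present"
```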
2. Detach the PCI devices from the host with virsh
[root@kvm-02 ~]# lspci -nn | grep net
09:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection X722 for 10GbE SFP+ [8086:37d3] (rev 09)
09:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Connection X722 for 10GbE SFP+ [8086:37d3] (rev 09)
09:00.2 Ethernet controller [0200]: Intel Corporation Ethernet Connection X722 for 10GbE SFP+ [8086:37d3] (rev 09)
09:00.3 Ethernet controller [0200]: Intel Corporation Ethernet Connection X722 for 10GbE SFP+ [8086:37d3] (rev 09)
2f:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 01)
2f:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 01)
31:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 [8086:158b] (rev 02)
31:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 [8086:158b] (rev 02)
58:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 01)
58:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 01)
86:00.0 Ethernet controller [0200]: Broadcom Limited NetXtreme BCM5720 Gigabit Ethernet PCIe [14e4:165f]
86:00.1 Ethernet controller [0200]: Broadcom Limited NetXtreme BCM5720 Gigabit Ethernet PCIe [14e4:165f]
af:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 01)
af:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 01)
virsh nodedev names PCI devices in a slightly different format, so list them again with virsh:
[root@kvm-02 ~]# virsh nodedev-list --tree | grep 09
| +- pci_0000_09_00_0
| +- pci_0000_09_00_1
| +- pci_0000_09_00_2
| +- pci_0000_09_00_3
+- pci_0000_05_09_0
+- pci_0000_05_09_1
+- pci_0000_05_09_2
+- pci_0000_05_09_3
+- pci_0000_05_09_4
+- pci_0000_05_09_5
+- pci_0000_05_09_6
+- pci_0000_05_09_7
+- pci_0000_2e_09_0
| | +- block_sdc_MTFDDAK480TBY_1AR1ZA_01PE061D7A09450LEN_1CC00A37
| | +- block_sdd_MTFDDAK480TBY_1AR1ZA_01PE061D7A09450LEN_1CFD6740
+- pci_0000_85_09_0
+- pci_0000_85_09_1
+- pci_0000_85_09_2
+- pci_0000_85_09_3
+- pci_0000_85_09_4
+- pci_0000_85_09_5
+- pci_0000_85_09_6
+- pci_0000_85_09_7
+- pci_0000_ae_09_0
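Note that grep 09 also matches unrelated devices such as pci_0000_05_09_0. The mapping from an lspci address to the virsh name is mechanical, so it can be computed directly. A minimal sketch, assuming PCI domain 0000 when lspci does not print one (pci_to_nodedev is a hypothetical helper name):

```shell
# Convert an lspci address (BB:SS.F or DDDD:BB:SS.F) to the libvirt
# node-device name, e.g. 09:00.0 -> pci_0000_09_00_0.
pci_to_nodedev() (
    addr="$1"
    case "$addr" in
        *:*:*) ;;                 # domain already present
        *) addr="0000:$addr" ;;   # assumption: PCI domain 0000
    esac
    echo "pci_$(echo "$addr" | tr ':.' '__')"
)

# Usage on the host:
#   virsh nodedev-dettach "$(pci_to_nodedev 09:00.0)"
```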
Detach the devices from the host:
[root@kvm-02 ~]# virsh nodedev-dettach pci_0000_09_00_0
Device pci_0000_09_00_0 detached
[root@kvm-02 ~]# virsh nodedev-dettach pci_0000_09_00_1
Device pci_0000_09_00_1 detached
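3. Use virsh edit to modify the guest's XML configuration and attach the PCI devices to it, then start the guest.
For example, run virsh edit vm115_vnf and add hostdev entries under devices (the bus, slot and function values must match the device's PCI address):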
<devices>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x000' bus='0x09' slot='0x00' function='0x0' />
    </source>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x000' bus='0x09' slot='0x00' function='0x1' />
    </source>
  </hostdev>
</devices>
Save and exit, then run virsh start vm115_vnf.
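Because the bus, slot and function values are just the pieces of the PCI address, the hostdev element can be generated rather than typed. The sketch below is one way to do it; hostdev_xml is a hypothetical helper, and it assumes PCI domain 0000 when none is given.

```shell
# Print a libvirt <hostdev> element for a PCI address such as 09:00.1
# or 0000:09:00.1.
hostdev_xml() (
    addr="$1"
    case "$addr" in
        *:*:*) ;;
        *) addr="0000:$addr" ;;   # assumption: PCI domain 0000
    esac
    domain=${addr%%:*}            # 0000
    rest=${addr#*:}               # 09:00.1
    bus=${rest%%:*}               # 09
    devfn=${rest#*:}              # 00.1
    slot=${devfn%.*}              # 00
    func=${devfn#*.}              # 1
    printf "<hostdev mode='subsystem' type='pci' managed='yes'>\n"
    printf "  <source>\n"
    printf "    <address domain='0x%s' bus='0x%s' slot='0x%s' function='0x%s'/>\n" \
        "$domain" "$bus" "$slot" "$func"
    printf "  </source>\n"
    printf "</hostdev>\n"
)

# Usage: paste the output into virsh edit, or, as an alternative to
# editing by hand, attach it to the persistent configuration:
#   hostdev_xml 09:00.0 > /tmp/hostdev.xml
#   virsh attach-device vm115_vnf /tmp/hostdev.xml --config
```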
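4. Inside the guest, lspci shows the newly attached PCI devices; once a driver is bound to them, VPP shows the corresponding interfaces.
In the listing below, 00:0d.0 and 00:12.0 are the newly attached devices: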
[root@vnf1-0 ~]# lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
00:04.0 Ethernet controller: Red Hat, Inc Virtio network device
00:05.0 Ethernet controller: Red Hat, Inc Virtio network device
00:06.0 Ethernet controller: Red Hat, Inc Virtio network device
00:07.0 Ethernet controller: Red Hat, Inc Virtio network device
00:08.0 Ethernet controller: Red Hat, Inc Virtio network device
00:09.0 Ethernet controller: Red Hat, Inc Virtio network device
00:0a.0 Ethernet controller: Red Hat, Inc Virtio network device
00:0b.0 Ethernet controller: Red Hat, Inc Virtio network device
00:0c.0 Ethernet controller: Red Hat, Inc Virtio network device
00:0d.0 Ethernet controller: Intel Corporation Device 37d3 (rev 09)
00:0e.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 03)
00:0e.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 03)
00:0e.2 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 03)
00:0e.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 03)
00:0f.0 Communication controller: Red Hat, Inc Virtio console
00:10.0 SCSI storage controller: Red Hat, Inc Virtio block device
00:11.0 Unclassified device [00ff]: Red Hat, Inc Virtio memory balloon
00:12.0 Ethernet controller: Intel Corporation Device 37d3 (rev 09)
After the igb_uio driver is bound to the new PCI devices, DPDK can detect the interfaces, and VPP can then see them:
modprobe uio
insmod /home/dpdk-stable-18.02.2/x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
/home/dpdk-stable-18.02.2/usertools/dpdk-devbind.py --bind=igb_uio 0000:00:12.0 0000:00:0d.0 0000:00:0c.0 0000:00:0b.0 0000:00:0a.0 0000:00:09.0 0000:00:08.0 0000:00:07.0 0000:00:06.0 0000:00:05.0 0000:00:04.0
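The address list for dpdk-devbind.py can be derived from the lspci output instead of being typed by hand. This is only a sketch: pci_addrs_matching is a helper name I made up, it matches a fixed string rather than a regular expression, and it assumes PCI domain 0000 in the guest.

```shell
# Read lspci-style lines on stdin and print the full PCI address
# (0000 domain assumed) of every line containing the given fixed string.
pci_addrs_matching() (
    awk -v pat="$1" 'index($0, pat) { print "0000:" $1 }'
)

# Usage inside the guest (paths as in the commands above):
#   lspci | pci_addrs_matching 'Intel Corporation Device 37d3'
#   /home/dpdk-stable-18.02.2/usertools/dpdk-devbind.py --status
```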
5. The virsh nodedev-reattach command returns a detached PCI device to the host:
[root@kvm-02 net]# virsh nodedev-reattach pci_0000_09_00_0
Device pci_0000_09_00_0 re-attached
[root@kvm-02 net]# virsh nodedev-reattach pci_0000_09_00_1
Device pci_0000_09_00_1 re-attached
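Configuring SR-IOV
1. Enable intel_iommu in the Linux boot parameters
intel_iommu=on
2. Create the VFs
The procedure differs between Linux kernels older than 3.8 and kernels 3.8 or later. On older kernels you unload the driver module and reload it with the max_vfs parameter; many guides online describe that method. For 3.8 and later, proceed as follows.
First check the kernel version: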
[root@kvm-02 rc.d]# cat /proc/version
Linux version 3.10.0-862.11.6.el7.x86_64 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC) ) #1 SMP Tue Aug 14 21:49:04 UTC 2018
[root@kvm-02 rc.d]#
Use ifconfig to find the name of the NIC that should provide the VFs,
or look under /sys/class/net directly:
[root@kvm-02 ~]# cd /sys/class/net
[root@kvm-02 net]# ls
br0 br1 br10 br11 br12 br2 br3 br4 br5 br6 br7 br8 br9 eno1 eno2 eno3 eno4 enp0s20f0u1u6 enp134s0f0 enp134s0f1 enp175s0f0 enp175s0f1 enp47s0f0 enp47s0f1 enp49s0f0 enp49s0f1 enp88s0f0 enp88s0f1 lo virbr0 virbr0-nic
I want to use NICs eno1 and eno2. (The 4 below means create 4 VFs; the maximum number of VFs a device supports can be read with: cat /sys/class/net/<device name>/device/sriov_totalvfs)
echo 4 > /sys/class/net/eno1/device/sriov_numvfs
echo 4 > /sys/class/net/eno2/device/sriov_numvfs
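The two sysfs files can be combined into one guarded step. The sketch below refuses a count above sriov_totalvfs and resets the count to 0 before writing the new value, because the kernel does not accept changing one non-zero VF count directly to another. set_numvfs and its third argument (an alternative sysfs root for dry runs) are my own inventions, not part of any tool.

```shell
# Set the number of VFs on a PF, refusing values above sriov_totalvfs.
# $3 is a hypothetical hook for dry runs; it defaults to /sys/class/net.
set_numvfs() (
    ifname="$1"
    want="$2"
    dev="${3:-/sys/class/net}/$ifname/device"
    max=$(cat "$dev/sriov_totalvfs") || exit 1
    if [ "$want" -gt "$max" ]; then
        echo "$ifname supports at most $max VFs" >&2
        exit 1
    fi
    # Reset first: this removes any existing VFs on the PF.
    echo 0 > "$dev/sriov_numvfs"
    echo "$want" > "$dev/sriov_numvfs"
)

# Usage (as root):
#   set_numvfs eno1 4
#   set_numvfs eno2 4
```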
List the PCI devices to see the newly created VFs (virtual functions):
[root@kvm-02 rc.d]# lspci | grep Eth
09:00.0 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GbE SFP+ (rev 09)
09:00.1 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GbE SFP+ (rev 09)
09:00.2 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GbE SFP+ (rev 09)
09:00.3 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GbE SFP+ (rev 09)
09:02.0 Ethernet controller: Intel Corporation X722 Virtual Function (rev 09)
09:02.1 Ethernet controller: Intel Corporation X722 Virtual Function (rev 09)
09:02.2 Ethernet controller: Intel Corporation X722 Virtual Function (rev 09)
09:02.3 Ethernet controller: Intel Corporation X722 Virtual Function (rev 09)
09:06.0 Ethernet controller: Intel Corporation X722 Virtual Function (rev 09)
09:06.1 Ethernet controller: Intel Corporation X722 Virtual Function (rev 09)
09:06.2 Ethernet controller: Intel Corporation X722 Virtual Function (rev 09)
09:06.3 Ethernet controller: Intel Corporation X722 Virtual Function (rev 09)
2f:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
2f:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
31:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)
31:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)
58:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
58:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
86:00.0 Ethernet controller: Broadcom Limited NetXtreme BCM5720 Gigabit Ethernet PCIe
86:00.1 Ethernet controller: Broadcom Limited NetXtreme BCM5720 Gigabit Ethernet PCIe
af:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
af:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
[root@kvm-02 rc.d]#
3. Assign MAC addresses to the VFs on the host
[root@kvm-02 /]# ip link set eno1 vf 0 mac 00:A0:00:00:01:00
[root@kvm-02 /]# ip link set eno2 vf 0 mac 00:A0:00:00:02:00
[root@kvm-02 /]# ip link show eno1
3: eno1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq master br1 portid 7cd30a5bdb98 state DOWN mode DEFAULT qlen 1000
link/ether 7c:d3:0a:5b:db:98 brd ff:ff:ff:ff:ff:ff
vf 0 MAC 00:a0:00:00:01:00, spoof checking on, link-state auto, trust off
vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
vf 3 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
[root@kvm-02 /]# ip link show eno2
5: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq master br2 portid 7cd30a5bdb99 state DOWN mode DEFAULT qlen 1000
link/ether 7c:d3:0a:5b:db:99 brd ff:ff:ff:ff:ff:ff
vf 0 MAC 00:a0:00:00:02:00, spoof checking on, link-state auto, trust off
vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
vf 3 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
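Only VF 0 of each port gets a MAC address above; the other VFs stay at 00:00:00:00:00:00. If all of them are needed, the addresses can be generated. The numbering scheme below (fifth byte = port number, sixth byte = VF index) is my guess at the pattern behind 00:A0:00:00:01:00 and 00:A0:00:00:02:00, and vf_mac is a hypothetical helper.

```shell
# Build a MAC address of the form 00:A0:00:00:<port>:<vf>.
# Assumed scheme: port 1 = eno1, port 2 = eno2; vf is the VF index.
vf_mac() (
    port="$1"
    vf="$2"
    printf '00:A0:00:00:%02X:%02X\n' "$port" "$vf"
)

# Usage (as root on the host):
#   for vf in 0 1 2 3; do
#       ip link set eno1 vf "$vf" mac "$(vf_mac 1 "$vf")"
#   done
```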
4. Detach the VFs from the host
First list the PCI devices:
[root@kvm-02 rc.d]# virsh nodedev-list --tree | grep 09
| +- pci_0000_09_00_0
| +- pci_0000_09_00_1
| +- pci_0000_09_00_2
| +- pci_0000_09_00_3
| +- pci_0000_09_02_0
| +- pci_0000_09_02_1
| +- pci_0000_09_02_2
| +- pci_0000_09_02_3
| +- pci_0000_09_06_0
| +- pci_0000_09_06_1
| +- pci_0000_09_06_2
| +- pci_0000_09_06_3
+- pci_0000_05_09_0
+- pci_0000_05_09_1
+- pci_0000_05_09_2
+- pci_0000_05_09_3
+- pci_0000_05_09_4
+- pci_0000_05_09_5
+- pci_0000_05_09_6
+- pci_0000_05_09_7
+- pci_0000_2e_09_0
| | +- block_sdc_MTFDDAK480TBY_1AR1ZA_01PE061D7A09450LEN_1CC00A37
| | +- block_sdd_MTFDDAK480TBY_1AR1ZA_01PE061D7A09450LEN_1CFD6740
+- pci_0000_85_09_0
+- pci_0000_85_09_1
+- pci_0000_85_09_2
+- pci_0000_85_09_3
+- pci_0000_85_09_4
+- pci_0000_85_09_5
+- pci_0000_85_09_6
+- pci_0000_85_09_7
+- pci_0000_ae_09_0
[root@kvm-02 rc.d]# virsh nodedev-dettach pci_0000_09_02_0
Device pci_0000_09_02_0 detached
[root@kvm-02 rc.d]# virsh nodedev-dettach pci_0000_09_06_0
Device pci_0000_09_06_0 detached
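The listing does not say which VF belongs to which port. sysfs does: each PF has virtfnN symlinks under its device directory that point at the VF's PCI device. A sketch for looking one up (vf_pci_addr is a hypothetical name, and the third argument is only a test hook for an alternative sysfs root):

```shell
# Print the PCI address of VF number $2 on interface $1 by resolving
# the virtfnN symlink under the PF's sysfs device directory.
vf_pci_addr() (
    ifname="$1"
    vf="$2"
    link="${3:-/sys/class/net}/$ifname/device/virtfn$vf"
    target=$(readlink "$link") || exit 1
    basename "$target"
)

# Usage on the host:
#   vf_pci_addr eno1 0
#   vf_pci_addr eno2 0
```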
5. Add the detached VFs to the guest by adding hostdev entries with virsh edit:
<devices>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x000' bus='0x09' slot='0x02' function='0x0' />
    </source>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x000' bus='0x09' slot='0x06' function='0x0' />
    </source>
  </hostdev>
</devices>
6. Start the guest and check the new PCI devices inside it
[root@vnf1-0 ~]# lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
00:04.0 Ethernet controller: Red Hat, Inc Virtio network device
00:05.0 Ethernet controller: Red Hat, Inc Virtio network device
00:06.0 Ethernet controller: Red Hat, Inc Virtio network device
00:07.0 Ethernet controller: Red Hat, Inc Virtio network device
00:08.0 Ethernet controller: Red Hat, Inc Virtio network device
00:09.0 Ethernet controller: Red Hat, Inc Virtio network device
00:0a.0 Ethernet controller: Red Hat, Inc Virtio network device
00:0b.0 Ethernet controller: Red Hat, Inc Virtio network device
00:0c.0 Ethernet controller: Red Hat, Inc Virtio network device
00:0d.0 Ethernet controller: Intel Corporation Device 37cd (rev 09)
00:0e.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 03)
00:0e.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 03)
00:0e.2 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 03)
00:0e.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 03)
00:0f.0 Communication controller: Red Hat, Inc Virtio console
00:10.0 SCSI storage controller: Red Hat, Inc Virtio block device
00:11.0 Unclassified device [00ff]: Red Hat, Inc Virtio memory balloon
00:12.0 Ethernet controller: Intel Corporation Device 37cd (rev 09)
[root@vnf1-0 ~]#
7. Bind a driver to the new PCI devices so that VPP can detect them and create interfaces
modprobe uio
insmod /home/dpdk-stable-18.02.2/x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
/home/dpdk-stable-18.02.2/usertools/dpdk-devbind.py --bind=igb_uio 0000:00:0d.0 0000:00:12.0
After restarting VPP, enter the VPP CLI; show interface lists the new VF interfaces.
[root@vnf1-0 ~]# systemctl restart vpp
[root@vnf1-0 ~]# vppctl
vpp# show interface
              Name               Idx       State          Counter          Count
VirtualFunctionEthernet0/12/0     2        down
VirtualFunctionEthernet0/d/0      1        down
local0                            0        down
vpp#
The VFs are now ready for testing.
Appendix: an open issue
To make the VFs persist across reboots, the VF-creation commands can be put in rc.d:
[root@kvm-02 net]# cd /etc/rc.d
[root@kvm-02 rc.d]# touch /var/lock/subsys/local
[root@kvm-02 rc.d]# echo 4 > /sys/class/net/eno1/device/sriov_numvfs
[root@kvm-02 rc.d]# echo 4 > /sys/class/net/eno2/device/sriov_numvfs
However, this did not work when I tried it: after the host rebooted the VFs were gone and had to be created again.
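One possible explanation, which I have not verified on this host: the transcript runs the echo commands at the prompt instead of writing them into /etc/rc.d/rc.local, and on CentOS 7 that file is not executable by default, so it is skipped at boot until chmod +x is applied. The sketch below appends a command to such a file once and marks the file executable; add_boot_cmd is a hypothetical helper. It may also be that rc.local runs before the NIC driver is ready, in which case this alone would not be enough.

```shell
# Append a command to an rc.local-style file (only if that exact line is
# not already present) and make the file executable.
add_boot_cmd() (
    file="$1"
    cmd="$2"
    grep -qxF -- "$cmd" "$file" 2>/dev/null || printf '%s\n' "$cmd" >> "$file"
    chmod +x "$file"
)

# Usage (as root):
#   add_boot_cmd /etc/rc.d/rc.local 'echo 4 > /sys/class/net/eno1/device/sriov_numvfs'
#   add_boot_cmd /etc/rc.d/rc.local 'echo 4 > /sys/class/net/eno2/device/sriov_numvfs'
```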