OpenStack Networking Implementation: Linux Virtual Networking Basics

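While studying OpenStack recently, I found Neutron particularly interesting. Running ifconfig on a host shows many network devices with names containing tap, br, and so on, which led me to dig into the basics of Linux virtual networking.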

tap

A tap is a virtual network device that operates at layer 2 of the OSI model, the data link layer.

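The main data link layer protocols are:

Point-to-Point Protocol (PPP)
Ethernet
High-Level Data Link Control (HDLC)
Frame Relay
Asynchronous Transfer Mode (ATM)

Of these, tap corresponds only to Ethernet, which is why a tap is also called a virtual Ethernet device.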

Linux implements both tun and tap in the tun kernel module.

The following commands work with tap devices:

Check whether the tun module is available

# modinfo tun

filename:       /lib/modules/3.10.0-229.el7.x86_64/kernel/drivers/net/tun.ko

alias:          devname:net/tun

alias:          char-major-10-200

......

Check whether the tun module is loaded

# lsmod | grep tun

tun                    27183  0

If the command above prints nothing, the tun module is not loaded; load it with

# modprobe tun

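With the tun module in place, the tunctl tool is used to manage tap/tun devices (iproute2 offers an alternative that needs no extra package: ip tuntap add dev tap_test mode tap)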

# yum install tunctl -y

Create a tap device

# tunctl -t tap_test

Set 'tap_test' persistent and owned by uid 0

View the newly created tap device

# ip link list

...

17: tap_test: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 500

link/ether 72:3f:cd:fd:0b:ee brd ff:ff:ff:ff:ff:ff

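Assign an IP address to the tap device (192.168.12.0 is the network address of this /24; the kernel accepts it, as the output further down shows, but a host address such as 192.168.12.1 is the more usual choice)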

# ip addr add local 192.168.12.0/24 dev tap_test

Or:

# ifconfig tap_test 192.168.12.0/24

Check that the IP address was assigned

# ifconfig -a

...

tap_test: flags=4098<BROADCAST,MULTICAST>  mtu 1500

        inet 192.168.12.0  netmask 255.255.255.0  broadcast 0.0.0.0

        ether 72:3f:cd:fd:0b:ee  txqueuelen 500  (Ethernet)

        RX packets 0  bytes 0 (0.0 B)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 0  bytes 0 (0.0 B)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

namespace

A namespace isolates resources on Linux; here we focus on the isolation of network resources.

The command for working with network namespaces is ip netns.

View the command's help:

# ip netns help

Usage: ip netns list

       ip netns add NAME

       ip netns delete NAME

       ip netns identify PID

       ip netns pids NAME

       ip netns exec NAME cmd ...

       ip netns monitor

List the namespaces

# ip netns list

Create a namespace named ns_test

# ip netns add ns_test

List again

# ip netns list

ns_test

Next, move the tap device created above, tap_test, into this namespace:

# ip link set tap_test netns ns_test

After the move, running ip link list on the host shows that tap_test is gone.

Working with devices inside a namespace

ip [-all] netns exec [NAME] cmd ...     // cmd is the command line to run

Run ip link list inside the namespace

# ip netns exec ns_test ip link list

1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

17: tap_test: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 500

    link/ether 72:3f:cd:fd:0b:ee brd ff:ff:ff:ff:ff:ff

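Assign an IP address inside the namespace (moving a device into another namespace drops the addresses it had, so the address is set again here)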

# ip netns exec ns_test ifconfig tap_test 192.168.12.1/24 up

View the IP address

# ip netns exec ns_test ifconfig -a

...

tap_test: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500

        inet 192.168.12.1  netmask 255.255.255.0  broadcast 192.168.12.255

        ether 72:3f:cd:fd:0b:ee  txqueuelen 500  (Ethernet)

        RX packets 0  bytes 0 (0.0 B)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 0  bytes 0 (0.0 B)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth pair

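A veth pair is not a single device but a pair of devices, used to connect two virtual Ethernet ports.

A veth pair is normally used together with namespaces; on its own it serves little purpose.

Consider the test case shown in the figure below:

Two namespaces, ns1 and ns2, each hold one end of a veth pair (the ends are named tap1 and tap2). Their IP addresses are as shown in the figure, and the two addresses ping each other.

Implementation:

Create the veth pair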

# ip link add tap1 type veth peer name tap2

Create the namespaces ns1 and ns2

# ip netns add ns1

# ip netns add ns2

Move each tap into its namespace

# ip link set tap1 netns ns1

# ip link set tap2 netns ns2

Assign an IP address to each tap

# ip netns exec ns1 ip addr add local 192.168.12.1/24 dev tap1

# ip netns exec ns2 ip addr add local 192.168.12.2/24 dev tap2

Bring both taps UP

# ip netns exec ns1 ifconfig tap1 up

# ip netns exec ns2 ifconfig tap2 up

Ping test

# ip netns exec ns2 ping 192.168.12.1

PING 192.168.12.1 (192.168.12.1) 56(84) bytes of data.

64 bytes from 192.168.12.1: icmp_seq=1 ttl=64 time=0.052 ms

...



# ip netns exec ns1 ping 192.168.12.2

PING 192.168.12.2 (192.168.12.2) 56(84) bytes of data.

64 bytes from 192.168.12.2: icmp_seq=1 ttl=64 time=0.206 ms

...

The test above shows how a veth pair connects two namespaces.
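The later sections reuse the names ns1, ns2, tap1 and tap2, and both ip netns add and ip link add fail when the name already exists, so the test needs to be torn down before moving on. The following is a minimal sketch of the steps above plus a cleanup. The function names veth_setup, veth_teardown and run, and the DRY_RUN switch, are hypothetical helpers introduced for illustration; with DRY_RUN=1 (the default assumed here) the commands are only printed.

```shell
# Sketch: build and tear down the two-namespace veth test.
# DRY_RUN=1 prints the commands; DRY_RUN=0 executes them (requires root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

veth_setup() {
    run ip link add tap1 type veth peer name tap2
    run ip netns add ns1
    run ip netns add ns2
    run ip link set tap1 netns ns1
    run ip link set tap2 netns ns2
    run ip netns exec ns1 ip addr add 192.168.12.1/24 dev tap1
    run ip netns exec ns2 ip addr add 192.168.12.2/24 dev tap2
    run ip netns exec ns1 ip link set tap1 up
    run ip netns exec ns2 ip link set tap2 up
}

veth_teardown() {
    # Deleting a namespace destroys the veth end inside it,
    # and destroying one end of a veth pair removes its peer.
    run ip netns delete ns1
    run ip netns delete ns2
}
```

Calling veth_setup prints the nine commands in order; running the script as root with DRY_RUN=0 applies them, and veth_teardown leaves the host clean for the next section.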

But what if three namespaces, or more, need to reach one another?

A veth pair has only two ends and cannot handle this; a bridge/switch is needed.

Bridge

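In Linux, a bridge and a switch both provide layer 2 forwarding and are closely related concepts, so no distinction is made between them here.

Linux implements bridging in the kernel's bridge module; brctl is the user-space tool used to manage it.

Install brctl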

# yum install -y bridge-utils

View the help first

# brctl help

...

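Test case diagram:

The diagram has four namespaces. Each namespace holds a tap that forms a veth pair with a tap port on the virtual bridge vb (created as br1 in the commands below).

The four namespaces are thus interconnected through the veth pairs and the bridge.

Implementation:

Create the veth pairs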

# ip link add tap1 type veth peer name tap1_peer

# ip link add tap2 type veth peer name tap2_peer

# ip link add tap3 type veth peer name tap3_peer

# ip link add tap4 type veth peer name tap4_peer

Create the namespaces

# ip netns add ns1

# ip netns add ns2

# ip netns add ns3

# ip netns add ns4

Move the tap devices into their namespaces

# ip link set tap1 netns ns1

# ip link set tap2 netns ns2

# ip link set tap3 netns ns3

# ip link set tap4 netns ns4

Create the bridge

# brctl addbr br1

Add the peer taps to the bridge

# brctl addif br1 tap1_peer

# brctl addif br1 tap2_peer

# brctl addif br1 tap3_peer

# brctl addif br1 tap4_peer

View the bridge

# brctl show

bridge name	bridge id		STP enabled	interfaces
br1		8000.3601df1a8177	no		tap1_peer
							tap2_peer
							tap3_peer
							tap4_peer

Configure the IP addresses of the taps

# ip netns exec ns1 ip addr add local 192.168.12.1/24 dev tap1

# ip netns exec ns2 ip addr add local 192.168.12.2/24 dev tap2

# ip netns exec ns3 ip addr add local 192.168.12.3/24 dev tap3

# ip netns exec ns4 ip addr add local 192.168.12.4/24 dev tap4

Bring the bridge and all taps UP

# ip link set br1 up

# ip link set tap1_peer up

# ip link set tap2_peer up

# ip link set tap3_peer up

# ip link set tap4_peer up

# ip netns exec ns1 ip link set tap1 up

# ip netns exec ns2 ip link set tap2 up

# ip netns exec ns3 ip link set tap3 up

# ip netns exec ns4 ip link set tap4 up

Ping test

# ip netns exec ns1 ping 192.168.12.2

PING 192.168.12.2 (192.168.12.2) 56(84) bytes of data.

64 bytes from 192.168.12.2: icmp_seq=1 ttl=64 time=0.143 ms

...



# ip netns exec ns3 ping 192.168.12.1

PING 192.168.12.1 (192.168.12.1) 56(84) bytes of data.

64 bytes from 192.168.12.1: icmp_seq=1 ttl=64 time=0.134 ms

...
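The four-namespace setup repeats the same steps per namespace, so it can be written as a loop. The sketch below also swaps brctl for the equivalent iproute2 commands (ip link add ... type bridge and ip link set ... master), which avoids the bridge-utils dependency; this is a substitution, not what the walkthrough above uses. The names bridge_setup and run and the DRY_RUN switch are hypothetical helpers; with DRY_RUN=1 (the default assumed here) the commands are only printed.

```shell
# Sketch: the bridge test as a loop, using iproute2 instead of brctl.
# DRY_RUN=1 prints the commands; DRY_RUN=0 executes them (requires root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

bridge_setup() {
    # Equivalent of: brctl addbr br1
    run ip link add br1 type bridge
    run ip link set br1 up
    for i in 1 2 3 4; do
        run ip netns add "ns$i"
        run ip link add "tap$i" type veth peer name "tap${i}_peer"
        run ip link set "tap$i" netns "ns$i"
        # Equivalent of: brctl addif br1 tapN_peer
        run ip link set "tap${i}_peer" master br1
        run ip link set "tap${i}_peer" up
        run ip netns exec "ns$i" ip addr add "192.168.12.$i/24" dev "tap$i"
        run ip netns exec "ns$i" ip link set "tap$i" up
    done
}
```

Adding a fifth namespace is then a matter of extending the loop list, provided the address 192.168.12.5 is free.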

Router

A Linux server can itself act as a router; only IP forwarding needs to be enabled.

Check the forwarding setting

# cat /proc/sys/net/ipv4/ip_forward

0

Enable forwarding temporarily

# echo "1" > /proc/sys/net/ipv4/ip_forward

Check again

# cat /proc/sys/net/ipv4/ip_forward

1
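Writing to /proc only lasts until the next reboot. sysctl -w net.ipv4.ip_forward=1 is the equivalent one-off command, and adding the line net.ipv4.ip_forward = 1 to /etc/sysctl.conf followed by sysctl -p makes the setting persistent. For scripts that depend on forwarding, a small guard can help; the function name forwarding_enabled and its optional path argument are hypothetical, introduced only for this sketch.

```shell
# forwarding_enabled [PATH]
# Succeeds when the given file (default: the IPv4 forwarding switch) holds "1".
forwarding_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}
```

A script could then start with something like: forwarding_enabled || echo "IP forwarding is off" >&2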

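Test case diagram:

In the diagram, ns1/tap1 and ns2/tap2 are on different subnets, so traffic between them must be forwarded by a router.

The router in the diagram is only schematic; it is simply the Linux host with forwarding enabled.

Implementation:

Create the veth pairs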

# ip link add tap1 type veth peer name tap1_pair

# ip link add tap2 type veth peer name tap2_pair

Create the namespaces

# ip netns add ns1

# ip netns add ns2

Move the taps into the namespaces

# ip link set tap1 netns ns1

# ip link set tap2 netns ns2

Configure the IP addresses of the taps

# ip addr add local 192.168.12.1/24 dev tap1_pair

# ip addr add local 192.168.22.1/24 dev tap2_pair

# ip netns exec ns1 ip addr add local 192.168.12.2/24 dev tap1

# ip netns exec ns2 ip addr add local 192.168.22.2/24 dev tap2

Bring the taps UP

# ip link set tap1_pair up

# ip link set tap2_pair up

# ip netns exec ns1 ip link set tap1 up

# ip netns exec ns2 ip link set tap2 up

View the routing table

# route -n

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

0.0.0.0         192.168.1.1     0.0.0.0         UG    100    0        0 eno50332208

192.168.12.0    0.0.0.0         255.255.255.0   U     0      0        0 tap1_pair

192.168.22.0    0.0.0.0         255.255.255.0   U     0      0        0 tap2_pair

As shown, once a tap device has an IP address and is up, Linux automatically creates the directly connected route for it.
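The destination of each connected route is the interface address ANDed with its netmask: 192.168.12.1/24 on tap1_pair yields 192.168.12.0, and 192.168.22.1/24 on tap2_pair yields 192.168.22.0. A minimal sketch of that calculation in plain shell arithmetic follows; the function name network_of is hypothetical and the input is assumed to be a well-formed IPv4 address without leading zeros.

```shell
# network_of ADDRESS PREFIX_LENGTH
# Prints the IPv4 network address, i.e. ADDRESS with its host bits cleared.
network_of() (
    prefix=$2
    # Split the dotted quad into four positional parameters.
    IFS=.
    set -- $1
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    if [ "$prefix" -eq 0 ]; then
        mask=0
    else
        mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    fi
    net=$(( addr & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))"
)
```

For example, network_of 192.168.12.1 24 prints 192.168.12.0, matching the tap1_pair row in the routing table above.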

Ping test

# ip netns exec ns1 ping 192.168.22.2

connect: Network is unreachable

The ping fails, so look at the routing table in ns1

# ip netns exec ns1 route -n

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

192.168.12.0    0.0.0.0         255.255.255.0   U     0      0        0 tap1

ns1 has no route to 192.168.22.0/24, so one must be added manually.

Add a static route in both ns1 and ns2, each pointing to the other's subnet

# ip netns exec ns1 route add -net 192.168.22.0 netmask 255.255.255.0 gw 192.168.12.1

# ip netns exec ns2 route add -net 192.168.12.0 netmask 255.255.255.0 gw 192.168.22.1

View the ns1 routing table again

# ip netns exec ns1 route -n

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

192.168.12.0    0.0.0.0         255.255.255.0   U     0      0        0 tap1

192.168.22.0    192.168.12.1    255.255.255.0   UG    0      0        0 tap1
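The route command takes a dotted netmask, while its iproute2 counterpart takes a prefix length: the first static route above can equally be written as ip netns exec ns1 ip route add 192.168.22.0/24 via 192.168.12.1. A small sketch for converting between the two notations follows; the function name mask_to_prefix is hypothetical, and it assumes a contiguous netmask without checking for one.

```shell
# mask_to_prefix NETMASK
# Prints the prefix length for a dotted IPv4 netmask.
# Assumes the mask is contiguous (e.g. 255.255.255.0); this is not verified.
mask_to_prefix() (
    IFS=.
    set -- $1
    bits=0
    for octet in "$@"; do
        case $octet in
            255) bits=$((bits + 8)) ;;
            254) bits=$((bits + 7)) ;;
            252) bits=$((bits + 6)) ;;
            248) bits=$((bits + 5)) ;;
            240) bits=$((bits + 4)) ;;
            224) bits=$((bits + 3)) ;;
            192) bits=$((bits + 2)) ;;
            128) bits=$((bits + 1)) ;;
            0) ;;
            *) echo "invalid netmask octet: $octet" >&2; exit 1 ;;
        esac
    done
    echo "$bits"
)
```

mask_to_prefix 255.255.255.0 prints 24, which is the /24 used in the ip route form.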

Ping again

# ip netns exec ns1 ping 192.168.22.2

PING 192.168.22.2 (192.168.22.2) 56(84) bytes of data.

64 bytes from 192.168.22.2: icmp_seq=1 ttl=63 time=0.364 ms

...



# ip netns exec ns2 ping 192.168.12.2

PING 192.168.12.2 (192.168.12.2) 56(84) bytes of data.

64 bytes from 192.168.12.2: icmp_seq=1 ttl=63 time=0.204 ms

...

tun

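A tun operates at layer 3 of the OSI model. It is a point-to-point device that provides IP-layer tunneling.

The layer 3 tunnels that Linux supports natively can be listed with ip tunnel help.

That output shows five natively supported layer 3 tunnel types:

Tunnel	Description
ipip	IP in IP: wraps an IPv4 packet in another IPv4 header (IPv4 in IPv4).
gre	Generic Routing Encapsulation: defines how to encapsulate one network layer protocol inside another (IPv4/IPv6 over IPv4).
sit	Similar to ipip, but wraps an IPv6 packet in an IPv4 header (IPv6 over IPv4).
isatap	Intra-Site Automatic Tunnel Addressing Protocol: generally used for communication between IPv6/IPv4 nodes across an IPv4 network.
vti	Virtual Tunnel Interface: provides a routable interface type for IPsec tunnels.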

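Test case diagram:

Assume tap1 and tap2 can already reach each other, as set up in the Router section.

How can tun1 and tun2 be made to reach each other?

Implementation:

An ipip tunnel is used as the example here; it is provided by the ipip kernel module.

Check whether the ipip module is already loaded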

# lsmod | grep ipip

Load the ipip module

# modprobe ipip

# lsmod | grep ipip

ipip                   13472  0

tunnel4                13252  1 ipip

ip_tunnel              23760  1 ipip

Create tun1 as an ipip tunnel in ns1

# ip netns exec ns1 ip tunnel add tun1 mode ipip remote 192.168.22.2 local 192.168.12.2 ttl 255

# ip netns exec ns1 ip link set tun1 up

# ip netns exec ns1 ip addr add 192.168.88.8 peer 192.168.99.9 dev tun1

Create tun2 as an ipip tunnel in ns2

# ip netns exec ns2 ip tunnel add tun2 mode ipip remote 192.168.12.2 local 192.168.22.2 ttl 255

# ip netns exec ns2 ip link set tun2 up

# ip netns exec ns2 ip addr add 192.168.99.9 peer 192.168.88.8 dev tun2

Ping test (neither direction works; I have not found the cause, so if you have solved this, please leave a comment)

# ip netns exec ns1 ping 192.168.99.9

PING 192.168.99.9 (192.168.99.9) 56(84) bytes of data.

^C

--- 192.168.99.9 ping statistics ---

6 packets transmitted, 0 received, 100% packet loss, time 5030ms



# ip netns exec ns2 ping 192.168.88.8

PING 192.168.88.8 (192.168.88.8) 56(84) bytes of data.

^C

--- 192.168.88.8 ping statistics ---

4 packets transmitted, 0 received, 100% packet loss, time 3012ms

View the tun device information

# ip netns exec ns1 ifconfig -a

...

tun1: flags=209<UP,POINTOPOINT,RUNNING,NOARP>  mtu 1480

        inet 192.168.88.8  netmask 255.255.255.255  destination 192.168.99.9

        tunnel   txqueuelen 0  (IPIP Tunnel)

        RX packets 0  bytes 0 (0.0 B)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 347  bytes 29148 (28.4 KiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

As shown, tun1 is one endpoint of an ipip tunnel; its IP is 192.168.88.8 and its peer IP is 192.168.99.9.

Now look at the routing table

# ip netns exec ns1 route -n

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

192.168.12.0    0.0.0.0         255.255.255.0   U     0      0        0 tap1

192.168.22.0    192.168.12.1    255.255.255.0   UG    0      0        0 tap1

192.168.99.9    0.0.0.0         255.255.255.255 UH    0      0        0 tun1

The routing table shows that the route to 192.168.99.9 is a directly connected route that goes straight out through tun1.
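Since tun1 counts transmitted packets but receives none, the encapsulated packets (IP protocol 4) are probably being lost somewhere between ns1 and ns2. The cause was not identified in this walkthrough; one plausible but unverified suspect is a firewall rule in the host's FORWARD chain that lets ICMP through but rejects other protocols. The sketch below only lists checks that could narrow this down. The names tunnel_checks and run and the DRY_RUN switch are hypothetical helpers; with DRY_RUN=1 (the default assumed here) the commands are only printed.

```shell
# Sketch: checks for a silent ipip tunnel. Nothing here is a confirmed fix.
# DRY_RUN=1 prints the commands; DRY_RUN=0 executes them (requires root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

tunnel_checks() {
    # Are the local/remote endpoints of the tunnel the expected ones?
    run ip netns exec ns1 ip tunnel show
    # Do protocol 4 packets arrive at, and leave, the host?
    # tcpdump waits for packets, so keep the ping running in another terminal.
    run tcpdump -c 5 -n -i tap1_pair ip proto 4
    run tcpdump -c 5 -n -i tap2_pair ip proto 4
    # Does the host's FORWARD chain accept protocol 4, or only ICMP?
    run iptables -L FORWARD -n -v
}
```

If packets show up on tap1_pair but never on tap2_pair, the host is dropping them while forwarding, which would point at the firewall rules rather than at the tunnel configuration.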
