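Isolation and Data Flow in OpenStack VLAN Mode
1. Isolation
Computer networks are built in layers, with different protocols working at different layers. The OSI model defines seven layers, and when we talk about isolation we usually mean layer 2, the data link layer. A packet at the data link layer is called a frame, and the familiar MAC address of a network card is the address that frames use. MAC stands for media access control, which is a sublayer of the data link layer.
Why isolate at layer 2? Some layer-2 frames are addressed to the broadcast address, and every device on the same layer-2 segment can, and must, receive them. Switches, which are generally regarded as layer-2 devices, also have to forward these broadcast frames, so a layer-2 network is usually called a "broadcast domain".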
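To make the idea of a broadcast frame concrete, here is a minimal Python sketch that classifies a destination MAC address. The function names are illustrative only and the sample addresses are made up; the only facts relied on are that the broadcast address is all ones and that group (multicast or broadcast) addresses have the lowest bit of the first octet set.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def normalize(mac: str) -> str:
    """Lower-case the address and use ':' as the separator."""
    return mac.lower().replace("-", ":")


def is_broadcast(mac: str) -> bool:
    """True if the destination is the layer-2 broadcast address."""
    return normalize(mac) == BROADCAST_MAC


def is_group_address(mac: str) -> bool:
    """True if the lowest bit of the first octet is set (multicast or broadcast)."""
    first_octet = int(normalize(mac).split(":")[0], 16)
    return bool(first_octet & 0x01)


if __name__ == "__main__":
    for dst in ("FF:FF:FF:FF:FF:FF", "01:00:5e:00:00:01", "fa:16:3e:00:00:01"):
        print(dst, is_broadcast(dst), is_group_address(dst))
```

Every device in the same broadcast domain receives frames for which `is_broadcast` is true, which is why the size of that domain matters for isolation.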
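2. VLAN
At its core, OpenStack Neutron abstracts and manages the layer-2 physical network. It supports several network isolation technologies to keep tenants' networks separated from one another, and VLAN is the main scheme used; it is also a layer-2 isolation technique that switches use widely. The approach has limitations, though. First, it is relatively cumbersome to manage, because the physical switches have to be configured to match. Second, the number of available VLANs is limited: there are only a little over four thousand VLAN IDs, so if each tenant is assigned one VLAN, at most a little over four thousand tenants can be supported.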
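The "little over four thousand" figure follows from the 12-bit VLAN ID field of an IEEE 802.1Q tag. The short Python sketch below does the arithmetic; the one-VLAN-per-tenant assumption is the same one made in the text, and the function name is only illustrative.

```python
VID_BITS = 12                    # width of the VLAN ID field in an 802.1Q tag
RESERVED_IDS = {0, 4095}         # 0 and 4095 are reserved by the standard

total_ids = 2 ** VID_BITS
usable_ids = [vid for vid in range(total_ids) if vid not in RESERVED_IDS]


def max_tenants(vlans_per_tenant: int = 1) -> int:
    """Upper bound on tenants when each one consumes a fixed number of VLANs."""
    return len(usable_ids) // vlans_per_tenant


if __name__ == "__main__":
    print(total_ids, len(usable_ids), max_tenants())
```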
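3. Virtual network devices
Figure 1: Virtual network topology of a compute node in VLAN mode
3.1 In VLAN network mode, the virtual network devices on a compute node are as follows:
(1) tapxxx device
Put simply, this is the virtual NIC provided to the virtual machine, that is, the vNIC behind the VM's network port. A VM's networking is provided by its vNICs, and the hypervisor can create one or more vNICs for each VM.
(2) qbrxxx device
A Linux bridge that, put simply, exists to serve security groups. Because network ACL rules cannot be configured on the tap device, this Linux bridge is added so that security group policies can be implemented with iptables.
(3) qvmxxx device
qvm mainly tags packets coming out of the VM with a VLAN tag.
(4) plyxxx device
An OVS bridge whose main function is to filter out unicast frames whose MAC address does not belong to this host.
(5) pvixxx and pvoxxx devices
ply is the policy bridge. ply and br-int are connected by a pair of patch ports: the end on ply is the pvi port, and the end on br-int is the pvo port.
(6) br-int device
br-int is the integration bridge; its main job is forwarding frames.
(7) int-brcps and phy-brcps devices
Mainly responsible for translating the VLAN ID in frames forwarded by br-int.
(8) brcps and trunk0 devices
brcps is an OVS bridge, and trunk0 is a bond (active-backup mode) made up of eth0 and eth1. For packets to reach the physical network they still have to go through the real physical NIC trunk0 (eth2 and eth4), so trunk0 is bridged onto br-1 to complete the whole path.
(9) tapuuu device
The port on which the DHCP service listens.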
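The per-port devices in the list above share a suffix such as `7fc1e7d0-0c`. The sketch below assumes, as stock Neutron does for its tap and qbr devices, that the suffix is the first 11 characters of the Neutron port UUID; the full UUID used here is hypothetical (only its first 11 characters appear in this article), and the helper name is illustrative.

```python
DEVICE_PREFIXES = ("tap", "qbr", "qvm", "ply", "pvi", "pvo")
SUFFIX_LEN = 11          # assumption: first 11 characters of the port UUID
MAX_IFNAME_LEN = 15      # Linux interface names are limited to 15 characters


def device_names(port_id: str) -> dict:
    """Map each device prefix to the device name derived from a port UUID."""
    suffix = port_id[:SUFFIX_LEN]
    names = {prefix: prefix + suffix for prefix in DEVICE_PREFIXES}
    # The truncation keeps every name within the kernel's limit.
    assert all(len(name) <= MAX_IFNAME_LEN for name in names.values())
    return names


if __name__ == "__main__":
    # Hypothetical UUID; only the leading "7fc1e7d0-0c" comes from the article.
    for prefix, name in device_names("7fc1e7d0-0c00-0000-0000-000000000000").items():
        print(prefix, name)
```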
3.2 Network device information on the compute node
3.2.1 Linux bridge information
- Compute153:~ # virsh list
- Id Name State
- ----------------------------------------------------
- 1 instance-00000583 running
- 2 instance-000005df running
- 3 instance-00000603 running
- 4 instance-00000654 running
- 5 instance-0000068f running
- 6 instance-000006d7 running
- 7 instance-0000070d running
- 9 instance-00000769 running
- 10 instance-0000090d running
- 11 instance-00000a37 running
Compute node Compute153 is running 10 virtual machines.
- Compute153:~ # brctl show
- bridge name bridge id STP enabled interfaces
- qbr7fc1e7d0-0c 8000.bee6e69f9457 no qvm7fc1e7d0-0c
- tap7fc1e7d0-0c
- qbr931641ad-4b 8000.eaa1a27fffcb no qvm931641ad-4b
- tap931641ad-4b
- qbr963c4b38-70 8000.7635674ec1fc no qvm963c4b38-70
- tap963c4b38-70
- qbr9df6f9f9-42 8000.2e1eba67aca5 no qvm9df6f9f9-42
- tap9df6f9f9-42
- qbrb9dd9478-0f 8000.2e954943421c no qvmb9dd9478-0f
- tapb9dd9478-0f
- qbrc24f2999-b9 8000.427df7c7a333 no qvmc24f2999-b9
- tapc24f2999-b9
- qbrc3833757-af 8000.7e6eb025950b no qvmc3833757-af
- tapc3833757-af
- qbrc78917be-9c 8000.1a67a8814d03 no qvmc78917be-9c
- tapc78917be-9c
- qbrd5cbf3b0-ef 8000.f6de8391f526 no qvmd5cbf3b0-ef
- tapd5cbf3b0-ef
- qbrfe79631b-85 8000.c2d425903a69 no qvmfe79631b-85
- tapfe79631b-85
There are 10 qbr bridges: every NIC of every VM has its own qbr. Each qbr has a matching tap and qvm, which correspond to the northbound and southbound interfaces of qbr in the figure.
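The pairing described above can be checked mechanically. The following Python sketch parses `brctl show` style text and verifies that each qbr bridge holds the tap and qvm devices with the same suffix. The sample is a two-bridge excerpt of the listing above with the column spacing restored by hand, and the function names are illustrative.

```python
SAMPLE = """\
bridge name     bridge id               STP enabled     interfaces
qbr7fc1e7d0-0c  8000.bee6e69f9457       no              qvm7fc1e7d0-0c
                                                        tap7fc1e7d0-0c
qbr931641ad-4b  8000.eaa1a27fffcb       no              qvm931641ad-4b
                                                        tap931641ad-4b
"""


def parse_brctl_show(text: str) -> dict:
    """Return {bridge name: [interfaces]} from `brctl show` output."""
    bridges = {}
    current = None
    for line in text.splitlines()[1:]:      # skip the header row
        fields = line.split()
        if not fields:
            continue
        if len(fields) >= 3:                # bridge row: name, id, STP, [interface]
            current = fields[0]
            bridges[current] = fields[3:]
        elif current is not None:           # continuation row: one more interface
            bridges[current].extend(fields)
    return bridges


def has_matching_pair(bridge: str, interfaces: list) -> bool:
    """True if the bridge holds both qvm<suffix> and tap<suffix>."""
    suffix = bridge[len("qbr"):]
    return {"qvm" + suffix, "tap" + suffix} <= set(interfaces)


if __name__ == "__main__":
    for name, ifaces in parse_brctl_show(SAMPLE).items():
        print(name, ifaces, has_matching_pair(name, ifaces))
```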
3.2.2 OVS bridge information
ovs-vsctl (for example, ovs-vsctl show) lists the OVS bridges that exist on the host and the ports in each of them.
- Bridge "plyc24f2999-b9"
- Port "qvmc24f2999-b9"
- Interface "qvmc24f2999-b9"
- type: internal
- Port "plyc24f2999-b9"
- Interface "plyc24f2999-b9"
- type: internal
- Port "pvic24f2999-b9"
- Interface "pvic24f2999-b9"
- type: patch
- options: {peer="pvoc24f2999-b9"}
This shows the ply bridge: it connects upward to qvm and downward to br-int. ply and br-int are joined by a pair of patch ports, pvi and pvo.
- Bridge br-int
- fail_mode: secure
- Port "pvoc78917be-9c"
- tag: 5
- Interface "pvoc78917be-9c"
- type: patch
- options: {peer="pvic78917be-9c"}
- Port "pvob9dd9478-0f"
- tag: 9
- Interface "pvob9dd9478-0f"
- type: patch
- options: {peer="pvib9dd9478-0f"}
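Note that the pvo ports carry a tag. This is an internal (local) VLAN tag that br-int uses to tell apart traffic belonging to different networks on this host. Tags are assigned sequentially, and ports on the same network share one, as tag 5 does in the br-int listings in this section.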
- Bridge br-int
- fail_mode: secure
- Port "pvoc78917be-9c"
- tag: 5
- Interface "pvoc78917be-9c"
- type: patch
- options: {peer="pvic78917be-9c"}
- Port "pvob9dd9478-0f"
- tag: 9
- Interface "pvob9dd9478-0f"
- type: patch
- options: {peer="pvib9dd9478-0f"}
- Port br-int
- tag: 4095
- Interface br-int
- type: internal
- Port int-brcps
- Interface int-brcps
- type: patch
- options: {peer=phy-brcps}
- Port "pvo9df6f9f9-42"
- tag: 5
- Interface "pvo9df6f9f9-42"
- type: patch
- options: {peer="pvi9df6f9f9-42"}
This shows all the ports on the br-int bridge: the pvo ports face upward, and the int-brcps port faces downward and connects to the brcps bridge.
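To see which pvo ports sit in the same local VLAN, the ports can be grouped by their tag. The Python sketch below does this for an excerpt of the br-int listing above; the indentation in the sample is restored by hand to resemble `ovs-vsctl show` output, and the function name is illustrative.

```python
from collections import defaultdict

SAMPLE = """\
Bridge br-int
    fail_mode: secure
    Port "pvoc78917be-9c"
        tag: 5
    Port "pvob9dd9478-0f"
        tag: 9
    Port int-brcps
    Port "pvo9df6f9f9-42"
        tag: 5
"""


def ports_by_tag(text: str) -> dict:
    """Return {tag: [port names]} for every tagged port in the listing."""
    groups = defaultdict(list)
    port = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Port "):
            port = line[len("Port "):].strip('"')
        elif line.startswith("tag:") and port is not None:
            groups[int(line.split(":", 1)[1])].append(port)
    return dict(groups)


if __name__ == "__main__":
    for tag, ports in sorted(ports_by_tag(SAMPLE).items()):
        print(tag, ports)
```

Untagged ports such as int-brcps do not appear in the result, which matches their role as a trunk between the two bridges.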
- Bridge br-int
- fail_mode: secure
- Port "pvoc78917be-9c"
- tag: 5
- Interface "pvoc78917be-9c"
- type: patch
- options: {peer="pvic78917be-9c"}
- Port int-brcps
- Interface int-brcps
- type: patch
- options: {peer=phy-brcps}
int-brcps and phy-brcps are the interfaces that connect br-int and brcps: querying br-int shows the int-brcps port, and querying the brcps bridge shows the phy-brcps interface attached to it.
- Bridge brcps
- Port external_om
- tag: 1405
- Interface external_om
- type: internal
- Port "trunk0"
- Interface "trunk0"
- Port phy-brcps
- Interface phy-brcps
- type: patch
- options: {peer=int-brcps}
- Port external_api
- tag: 1400
- Interface external_api
- type: internal
- Port brcps
- tag: 0
- Interface brcps
- type: internal
- Port "om-physnet1"
- tag: 1089
- Interface "om-physnet1"
- type: internal
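The management-plane bridge brcps has several ports: external_om and external_api, which are tagged and provide connectivity to the outside; the upward-facing phy-brcps interface; the local interface brcps; the om-physnet1 port; and, most importantly, trunk0, the physical interface that actually carries the data.
These are all the ports that VM traffic passes through in VLAN mode. The data flow is as follows:
1) A data frame leaves the VM through the virtual network port (vNIC) provided by the tap device, passes the security checks on the Linux bridge qbr, and reaches qvm, where it is given an internal VLAN tag that serves as the local ID inside the host. This ID separates traffic inside the host: ports that share a tag are in the same local VLAN;
2) The frame continues south to ply. According to the documentation, ply filters out frames with non-local MAC addresses, mainly to make it easy to reach other VMs on the same host: if the destination is on the same host it is reached directly, without being forwarded by br-int;
3) The frame then reaches br-int, which forwards it toward the destination host. As it moves south through the patch port, the internal VLAN ID (the local ID) added earlier is removed from the frame and replaced with the external VLAN ID;
4) Finally the frame reaches the physical NIC that connects to the external physical switch and is sent to its destination.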
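The four steps above can be written down as an ordered list of hops. The Python sketch below builds that list for one port suffix taken from the listings in this article; the split into per-port and shared devices follows the device list in section 3.1, and the function name is illustrative.

```python
# Devices created once per VM NIC, in the order an outgoing frame meets them.
PER_PORT_PREFIXES = ("tap", "qbr", "qvm", "ply", "pvi", "pvo")
# Devices shared by all VMs on the compute node.
SHARED_DEVICES = ("br-int", "int-brcps", "phy-brcps", "brcps", "trunk0")


def egress_hops(port_suffix: str) -> list:
    """Ordered devices a frame passes on its way from the VM to the wire."""
    per_port = [prefix + port_suffix for prefix in PER_PORT_PREFIXES]
    return per_port + list(SHARED_DEVICES)


if __name__ == "__main__":
    print(" -> ".join(egress_hops("c78917be-9c")))
```

Reversing the list gives the order for traffic arriving from the physical network.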
3.2.2 br-int dump-flows information
br-int translates the VLAN tag of traffic arriving from brcps (through port int-brcps). In the example below, external VLAN ID 1013 is translated to internal VLAN ID 2.
- Compute153:~ # ovs-ofctl dump-flows br-int
- NXST_FLOW reply (xid=0x4):
- cookie=0xaf3ffaad56834ff8, duration=8767986.702s, table=0, n_packets=138635866, n_bytes=49130127982, idle_age=0, hard_age=65534, priority=3,in_port=1,dl_vlan=1013 actions=mod_vlan_vid:2,NORMAL
- cookie=0xaf3ffaad56834ff8, duration=8759690.249s, table=0, n_packets=902894466, n_bytes=111008267998, idle_age=0, hard_age=65534, priority=3,in_port=1,dl_vlan=1014 actions=mod_vlan_vid:3,NORMAL
- cookie=0xaf3ffaad56834ff8, duration=8606291.966s, table=0, n_packets=75523546, n_bytes=7721353259, idle_age=0, hard_age=65534, priority=3,in_port=1,dl_vlan=503 actions=mod_vlan_vid:4,NORMAL
- cookie=0xaf3ffaad56834ff8, duration=7943259.828s, table=0, n_packets=27312770, n_bytes=4039091682, idle_age=0, hard_age=65534, priority=3,in_port=1,dl_vlan=1011 actions=mod_vlan_vid:5,NORMAL
- cookie=0xaf3ffaad56834ff8, duration=7248098.099s, table=0, n_packets=17132221, n_bytes=1590164809, idle_age=0, hard_age=65534, priority=3,in_port=1,dl_vlan=504 actions=mod_vlan_vid:6,NORMAL
- cookie=0xaf3ffaad56834ff8, duration=5730798.970s, table=0, n_packets=35859018, n_bytes=4389953008, idle_age=0, hard_age=65534, priority=3,in_port=1,dl_vlan=1012 actions=mod_vlan_vid:7,NORMAL
- cookie=0xaf3ffaad56834ff8, duration=583874.187s, table=0, n_packets=2041814, n_bytes=433205117, idle_age=0, hard_age=65534, priority=3,in_port=1,dl_vlan=1015 actions=mod_vlan_vid:8,NORMAL
- cookie=0xaf3ffaad56834ff8, duration=146306.053s, table=0, n_packets=435169, n_bytes=31391505, idle_age=0, hard_age=65534, priority=3,in_port=1,dl_vlan=1016 actions=mod_vlan_vid:9,NORMAL
- cookie=0xaf3ffaad56834ff8, duration=9233017.369s, table=0, n_packets=8966890076, n_bytes=2799828872226, idle_age=0, hard_age=65534, priority=2,in_port=1 actions=drop
- cookie=0xaf3ffaad56834ff8, duration=9233016.732s, table=0, n_packets=1106708092, n_bytes=190560627712, idle_age=0, hard_age=65534, priority=0 actions=NORMAL
- cookie=0xaf3ffaad56834ff8, duration=9233019.667s, table=23, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=0 actions=drop
- cookie=0xaf3ffaad56834ff8, duration=9233019.551s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=0 actions=drop
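The external-to-local mapping can be pulled out of such a dump with a small parser. The Python sketch below works on a shortened copy of the flows above (cookie and counters removed by hand); the regular expression and function name are illustrative and only handle the `dl_vlan=... actions=mod_vlan_vid:...` shape seen here.

```python
import re

FLOWS = """\
priority=3,in_port=1,dl_vlan=1013 actions=mod_vlan_vid:2,NORMAL
priority=3,in_port=1,dl_vlan=1014 actions=mod_vlan_vid:3,NORMAL
priority=3,in_port=1,dl_vlan=503 actions=mod_vlan_vid:4,NORMAL
priority=2,in_port=1 actions=drop
priority=0 actions=NORMAL
"""

# Matches a VLAN match field followed by a VLAN rewrite action.
REWRITE = re.compile(r"dl_vlan=(\d+)\s*actions=mod_vlan_vid:(\d+)")


def vlan_rewrites(text: str) -> dict:
    """Return {matched VLAN ID: rewritten VLAN ID} for every rewrite flow."""
    mapping = {}
    for line in text.splitlines():
        match = REWRITE.search(line)
        if match:
            mapping[int(match.group(1))] = int(match.group(2))
    return mapping


if __name__ == "__main__":
    for external, local in vlan_rewrites(FLOWS).items():
        print(f"external {external} -> local {local}")
```

Flows without a rewrite, such as the priority=2 drop rule that discards traffic from int-brcps on any other VLAN, are simply skipped.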
3.2.3 brcps dump-flows information
brcps handles traffic arriving from br-int (through port phy-brcps) and translates the local VLAN to the external VLAN. In the example below, internal VLAN ID 2 is translated to external VLAN ID 1013.
- Compute153:~ # ovs-ofctl dump-flows brcps
- NXST_FLOW reply (xid=0x4):
- cookie=0xaaf94399aad7707e, duration=8768079.505s, table=0, n_packets=4610859, n_bytes=723908441, idle_age=6, hard_age=65534, priority=4,in_port=5,dl_vlan=2 actions=mod_vlan_vid:1013,NORMAL
- cookie=0xaaf94399aad7707e, duration=8759783.046s, table=0, n_packets=1061625441, n_bytes=180117774176, idle_age=0, hard_age=65534, priority=4,in_port=5,dl_vlan=3 actions=mod_vlan_vid:1014,NORMAL
- cookie=0xaaf94399aad7707e, duration=8606384.765s, table=0, n_packets=12135266, n_bytes=3806123480, idle_age=32, hard_age=65534, priority=4,in_port=5,dl_vlan=4 actions=mod_vlan_vid:503,NORMAL
- cookie=0xaaf94399aad7707e, duration=7943352.621s, table=0, n_packets=8783552, n_bytes=1513703385, idle_age=0, hard_age=65534, priority=4,in_port=5,dl_vlan=5 actions=mod_vlan_vid:1011,NORMAL
- cookie=0xaaf94399aad7707e, duration=7248190.902s, table=0, n_packets=2559355, n_bytes=510785011, idle_age=16, hard_age=65534, priority=4,in_port=5,dl_vlan=6 actions=mod_vlan_vid:504,NORMAL
- cookie=0xaaf94399aad7707e, duration=5730891.771s, table=0, n_packets=16831749, n_bytes=3864947698, idle_age=0, hard_age=65534, priority=4,in_port=5,dl_vlan=7 actions=mod_vlan_vid:1012,NORMAL
- cookie=0xaaf94399aad7707e, duration=583966.979s, table=0, n_packets=169878, n_bytes=24055409, idle_age=29, hard_age=65534, priority=4,in_port=5,dl_vlan=8 actions=mod_vlan_vid:1015,NORMAL
- cookie=0xaaf94399aad7707e, duration=146398.874s, table=0, n_packets=1541, n_bytes=157171, idle_age=132, hard_age=65534, priority=4,in_port=5,dl_vlan=9 actions=mod_vlan_vid:1016,NORMAL
- cookie=0xaaf94399aad7707e, duration=9233110.012s, table=0, n_packets=78, n_bytes=6780, idle_age=65534, hard_age=65534, priority=2,in_port=5 actions=drop
- cookie=0xaaf94399aad7707e, duration=9233111.393s, table=0, n_packets=10761364888, n_bytes=3180091314185, idle_age=0, hard_age=65534, priority=0 actions=
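The rewrites on the two bridges should undo each other. The Python sketch below copies the VLAN pairs by hand from the br-int and brcps flow dumps in this article and checks that one table is the inverse of the other; the dictionary and function names are illustrative.

```python
# br-int, traffic from int-brcps: external VLAN -> local VLAN
BR_INT_INBOUND = {1013: 2, 1014: 3, 503: 4, 1011: 5, 504: 6, 1012: 7, 1015: 8, 1016: 9}
# brcps, traffic from phy-brcps: local VLAN -> external VLAN
BRCPS_OUTBOUND = {2: 1013, 3: 1014, 4: 503, 5: 1011, 6: 504, 7: 1012, 8: 1015, 9: 1016}


def is_inverse(inbound: dict, outbound: dict) -> bool:
    """True if swapping keys and values of `inbound` yields `outbound`."""
    return {local: external for external, local in inbound.items()} == outbound


def round_trip(external_vlan: int) -> int:
    """Translate an external VLAN to local and back again."""
    return BRCPS_OUTBOUND[BR_INT_INBOUND[external_vlan]]


if __name__ == "__main__":
    print(is_inverse(BR_INT_INBOUND, BRCPS_OUTBOUND))
    print(round_trip(1013))
```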
3.2.4 iptables security groups
Each VM's tap device has two chains (out and in). In the listings below, the out chain lets the VM's DHCP client requests through, and the in chain admits the replies from the DHCP servers:
- Compute153:~ # iptables -L neutron-openvswi-sg-chain
- Chain neutron-openvswi-sg-chain (20 references)
- target prot opt source destination
- neutron-openvswi-i7fc1e7d0-0 all -- anywhere anywhere PHYSDEV match --physdev-out tap7fc1e7d0-0c --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-o7fc1e7d0-0 all -- anywhere anywhere PHYSDEV match --physdev-in tap7fc1e7d0-0c --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-i931641ad-4 all -- anywhere anywhere PHYSDEV match --physdev-out tap931641ad-4b --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-o931641ad-4 all -- anywhere anywhere PHYSDEV match --physdev-in tap931641ad-4b --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-i963c4b38-7 all -- anywhere anywhere PHYSDEV match --physdev-out tap963c4b38-70 --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-o963c4b38-7 all -- anywhere anywhere PHYSDEV match --physdev-in tap963c4b38-70 --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-i9df6f9f9-4 all -- anywhere anywhere PHYSDEV match --physdev-out tap9df6f9f9-42 --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-o9df6f9f9-4 all -- anywhere anywhere PHYSDEV match --physdev-in tap9df6f9f9-42 --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-ib9dd9478-0 all -- anywhere anywhere PHYSDEV match --physdev-out tapb9dd9478-0f --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-ob9dd9478-0 all -- anywhere anywhere PHYSDEV match --physdev-in tapb9dd9478-0f --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-ic24f2999-b all -- anywhere anywhere PHYSDEV match --physdev-out tapc24f2999-b9 --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-oc24f2999-b all -- anywhere anywhere PHYSDEV match --physdev-in tapc24f2999-b9 --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-ic3833757-a all -- anywhere anywhere PHYSDEV match --physdev-out tapc3833757-af --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-oc3833757-a all -- anywhere anywhere PHYSDEV match --physdev-in tapc3833757-af --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-ic78917be-9 all -- anywhere anywhere PHYSDEV match --physdev-out tapc78917be-9c --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-oc78917be-9 all -- anywhere anywhere PHYSDEV match --physdev-in tapc78917be-9c --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-id5cbf3b0-e all -- anywhere anywhere PHYSDEV match --physdev-out tapd5cbf3b0-ef --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-od5cbf3b0-e all -- anywhere anywhere PHYSDEV match --physdev-in tapd5cbf3b0-ef --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-ife79631b-8 all -- anywhere anywhere PHYSDEV match --physdev-out tapfe79631b-85 --physdev-is-bridged /* Jump to the VM specific chain. */
- neutron-openvswi-ofe79631b-8 all -- anywhere anywhere PHYSDEV match --physdev-in tapfe79631b-85 --physdev-is-bridged /* Jump to the VM specific chain. */
- ACCEPT all -- anywhere anywhere
- Compute153:~ # iptables -L neutron-openvswi-oc78917be-9
- Chain neutron-openvswi-oc78917be-9 (2 references)
- target prot opt source destination
- RETURN udp -- default 255.255.255.255 udp spt:bootpc dpt:bootps /* Allow DHCP client traffic. */
- neutron-openvswi-sc78917be-9 all -- anywhere anywhere
- RETURN udp -- anywhere anywhere udp spt:bootpc dpt:bootps /* Allow DHCP client traffic. */
- DROP udp -- anywhere anywhere udp spt:bootps udp dpt:bootpc /* Prevent DHCP Spoofing by VM. */
- RETURN all -- anywhere anywhere state RELATED,ESTABLISHED /* Direct packets associated with a known session to the RETURN chain. */
- RETURN all -- anywhere anywhere
- DROP all -- anywhere anywhere state INVALID /* Drop packets that appear related to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in conntrack. */
- neutron-openvswi-sg-fallback all -- anywhere anywhere /* Send unmatched traffic to the fallback chain. */
- Compute153:~ # iptables -L neutron-openvswi-ic78917be-9
- Chain neutron-openvswi-ic78917be-9 (1 references)
- target prot opt source destination
- RETURN all -- anywhere anywhere state RELATED,ESTABLISHED /* Direct packets associated with a known session to the RETURN chain. */
- RETURN udp -- xxx.xxx.xxx.xxx anywhere udp spt:bootps udp dpt:bootpc ---DHCP
- RETURN udp -- xxx.xxx.xxx.xxx anywhere udp spt:bootps udp dpt:bootpc ---DHCP
- RETURN all -- anywhere anywhere
- DROP all -- anywhere anywhere state INVALID /* Drop packets that appear related to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in conntrack. */
- neutron-openvswi-sg-fallback all -- anywhere anywhere /* Send unmatched traffic to the fallback chain. */
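The chain names in the listings above follow a visible pattern: a fixed prefix, a one-letter kind (i, o or s), and the first 10 characters of the tap device's suffix. The Python sketch below reproduces that pattern as inferred from these listings only; the function name is illustrative, and the reading of i and o as "towards the VM" and "from the VM" rests on the `--physdev-out` and `--physdev-in` matches shown in neutron-openvswi-sg-chain.

```python
CHAIN_PREFIX = "neutron-openvswi-"
ID_LEN = 10   # characters of the port suffix kept in the chain name (as observed)


def sg_chains(tap_name: str) -> dict:
    """Return the per-port chain names for a tap device such as 'tapc78917be-9c'."""
    port_part = tap_name[len("tap"):][:ID_LEN]
    return {
        "ingress": f"{CHAIN_PREFIX}i{port_part}",   # matched with --physdev-out tap...
        "egress": f"{CHAIN_PREFIX}o{port_part}",    # matched with --physdev-in tap...
        "source": f"{CHAIN_PREFIX}s{port_part}",    # the s chain referenced from the o chain
    }


if __name__ == "__main__":
    for kind, chain in sg_chains("tapc78917be-9c").items():
        print(kind, chain)
```

The resulting names can be passed to `iptables -L <chain>` to inspect one VM port at a time, as the listings above do.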
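3.3 VLAN isolation on the compute node
In VLAN mode, each VLAN network forms its own layer-2 (bridge) domain, which is how VLAN-based isolation is achieved. VLAN tag translation requires br-int and brcps to work together: br-int translates packets arriving from int-brcps (carrying the external VLAN) to the internal VLAN, and brcps translates packets arriving from phy-brcps (carrying the internal VLAN) to the external VLAN. Tenant traffic is also isolated by VLAN, so two kinds of VLAN are involved: the local VLAN carried by VM traffic inside the compute node, and the VLAN that separates tenants on the physical network outside the compute node. The physical switch port connected to the eth NIC is set to trunk mode, so that traffic from several different VLANs can pass over the same physical NIC.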
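Because the switch port facing the compute node is a trunk, it has to carry every external VLAN this node uses. As a rough illustration, the Python sketch below takes the external VLAN IDs from the brcps flow table in this article and compresses them into the range notation commonly used when listing VLANs allowed on a trunk; the function name and the output format are assumptions, not a specific switch vendor's syntax.

```python
# External VLAN IDs copied by hand from the brcps flow dump in this article.
EXTERNAL_VLANS = [1013, 1014, 503, 1011, 504, 1012, 1015, 1016]


def compress_ranges(vlans) -> str:
    """Render VLAN IDs as comma-separated values and ranges, e.g. '1-3,7'."""
    ids = sorted(set(vlans))
    if not ids:
        return ""
    ranges = []
    start = prev = ids[0]
    for vid in ids[1:]:
        if vid == prev + 1:          # still inside a consecutive run
            prev = vid
            continue
        ranges.append((start, prev))  # close the finished run
        start = prev = vid
    ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


if __name__ == "__main__":
    print(compress_ranges(EXTERNAL_VLANS))
```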