Cloud Native in Twenty Articles | Docker Networking
This article introduces Docker networking, including bridge and overlay networks.
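Part 1: Docker Networking
Docker networking has to handle connections between containers, and between containers and external networks and VLANs. It was relatively complex to set up in the early days. As containerization matured, Docker's network architecture adopted the Container Network Model (CNM), which supports pluggable drivers to provide network topologies.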
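1. Details
(1) CNM
CNM is the design specification for Docker's network architecture. It defines three basic building blocks:
Sandbox: an isolated network stack, including Ethernet interfaces, ports, routing tables, and DNS configuration
Endpoint (EP): a virtual network interface responsible for making connections; it attaches a sandbox to a network
Network: a software implementation of a bridge (a virtual switch) that groups the endpoints able to talk to each other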
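As a rough illustration of how the three CNM building blocks relate, the following Python sketch models them as plain data classes. The class, field, and function names are hypothetical and chosen for illustration only; they are not Libnetwork's API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps identity-based comparison, which avoids recursive
# comparisons between objects that reference each other.
@dataclass(eq=False)
class Network:
    """A CNM network: a group of endpoints that can talk to each other."""
    name: str
    endpoints: List["Endpoint"] = field(default_factory=list)


@dataclass(eq=False)
class Endpoint:
    """A virtual interface that joins one sandbox to one network."""
    name: str
    network: Network
    sandbox: Optional["Sandbox"] = None


@dataclass(eq=False)
class Sandbox:
    """An isolated network stack (interfaces, routes, DNS) for one container."""
    container: str
    endpoints: List[Endpoint] = field(default_factory=list)


def connect(sandbox: Sandbox, network: Network, ep_name: str) -> Endpoint:
    """Create an endpoint and attach the sandbox to the network through it."""
    ep = Endpoint(name=ep_name, network=network, sandbox=sandbox)
    network.endpoints.append(ep)
    sandbox.endpoints.append(ep)
    return ep


def peers(sandbox: Sandbox) -> set:
    """Names of containers that share at least one network with this sandbox."""
    found = set()
    for ep in sandbox.endpoints:
        for other in ep.network.endpoints:
            if other.sandbox is not sandbox:
                found.add(other.sandbox.container)
    return found
```

A sandbox with no endpoints has no peers, which corresponds to the none mode described later; two sandboxes connected to the same network see each other.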
(2) Libnetwork
Libnetwork is the canonical implementation of CNM. It is cross-platform and implements the three CNM components, along with service discovery, ingress-based container load balancing, and the network control plane and management plane.
(3) Network modes
Bridge: Docker's default container network driver. A container connects to the docker0 bridge through a veth pair, and Docker dynamically assigns the container an IP address and configures routes, firewall rules, and so on. See Part 2 for details;
Host: the container shares the host's network namespace, and therefore the same network stack, routing table, and iptables rules. Run docker run --net=host centos:7 python -m SimpleHTTPServer 8081, then check the listening sockets with netstat -tunpl:
[root@VM-16-16-centos ~]# netstat -tunpl
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:8081 0.0.0.0:* LISTEN 1409899/python
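In host mode this is no different from opening a port directly on the host, and no port mapping is performed, so the ports used by services started this way must not conflict with ports already in use on the host;
Overlay: the multi-host overlay network is Docker's native cross-host, multi-subnet networking solution. It is built mainly on Linux bridges and VXLAN tunnels. Legacy (pre-Swarm-mode) setups synchronize state across hosts through an external KV store such as etcd or Consul, while Swarm mode, used in Part 2, keeps this state in the cluster itself. See Part 2 for details;
Remote: the hook for Docker network plugins; with Libnetwork you can implement your own network plugin;
None: the simplest network mode. The container is completely isolated and cannot reach external networks. In none mode the container is not assigned an IP address and cannot communicate with other containers or the host. Try running docker run --net=none centos:7 python -m SimpleHTTPServer 8081; a curl xxx.com from inside the container should then fail.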
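The port conflict in host mode is the ordinary "address already in use" behavior of sockets. The sketch below shows it with two plain sockets on the loopback address; it does not involve Docker, and the helper name try_bind is hypothetical.

```python
import socket


def try_bind(port: int, host: str = "127.0.0.1"):
    """Try to listen on host:port; return the socket, or None if the port is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock
    except OSError:
        sock.close()
        return None


# Port 0 asks the kernel for any free port, so the sketch does not
# depend on a specific port such as 8081 being free.
first = try_bind(0)
port = first.getsockname()[1]

# A second service asking for the same port is refused, just as two
# --net=host containers (or a container and a host process) would be.
second = try_bind(port)
print("second bind succeeded:", second is not None)
first.close()
```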
Part 2: Bridge and Overlay in Detail
The two most commonly used Docker networks are bridge and overlay. Bridge handles communication among containers on a single host, and overlay handles cross-host, multi-subnet networking. The sections below look at both in detail.
1. Bridge
What is a bridge? Like tap/tun devices and veth pairs, a bridge is a virtual network device, so it has all the properties of one: for example, it can be configured with an IP address and a MAC address. Beyond that, a bridge is also a layer-2 switch and provides the functions of a switch.
(1) Creation
When the Docker daemon starts, it creates a Linux bridge on the host (docker0 by default). When a container starts, Docker creates a veth pair (a pair of virtual network interfaces). veth devices always come in pairs, and data entering one end comes out of the other. Docker attaches one end to the docker0 bridge and places the other end in the container's network namespace, which is how the container and the host communicate.
(2) Listing networks
Run docker network ls. Output:
[root@VM-16-16-centos ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
839c78d16e66 bridge bridge local
7865e8dc7489 host host local
e904b639a46d k3d-k3d-private bridge local
e6e4904ea322 none null local
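(3) Inspecting the bridge
First run docker run -d --name busybox-1 busybox echo "1" and docker run -d --name busybox-2 busybox echo "2", then run docker inspect bridge. The output includes each container's IPv4Address, MacAddress, and EndpointID. (A container is listed under Containers only while it is running, so if echo has already exited, start the containers with a long-running command such as sleep 3600 instead.)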
"Containers": {
"bbd7d0775081dd9a9d026ca4c8e3ec2e1a4b19bead122eac94cd58f1fa118827": {
"Name": "busybox-2",
"EndpointID": "a82be8a01e25f5267fd6286c10eb1c72a1dd1c1933dcc84a82b286162767923c",
"MacAddress": "02:42:ac:11:00:03",
"IPv4Address": "172.17.0.3/16",
"IPv6Address": ""
},
"fa14fa3e167d17922a94153c0e0eb83e244ef7b20f9fc04d05db2589828e747c": {
"Name": "busybox-1",
"EndpointID": "90f614cc4b2e4c5d2baa75facfa8e493d287cbb9ae39edaecb3ec67915d2df2b",
"MacAddress": "02:42:ac:11:00:02",
"IPv4Address": "172.17.0.2/16",
"IPv6Address": ""
}
}
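In the sample output, each MacAddress is the prefix 02:42 followed by the container's IPv4 address written in hexadecimal (172.17.0.3 becomes ac:11:00:03). The sketch below reproduces that pattern. It is an observation about this output only; other Docker versions or explicitly assigned MAC addresses may differ, and the function name is hypothetical.

```python
import ipaddress


def mac_from_ipv4(ip: str) -> str:
    """Build a MAC as the prefix 02:42 followed by the four IPv4 octets in hex."""
    octets = ipaddress.IPv4Address(ip).packed
    return ":".join(f"{b:02x}" for b in bytes([0x02, 0x42]) + octets)


# The two containers from the inspect output above.
for ip in ("172.17.0.2", "172.17.0.3"):
    print(ip, "->", mac_from_ipv4(ip))
```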
(4) Testing connectivity over the bridge
Enter the busybox-2 container and run ping 172.17.0.2. The output shows that the two containers can reach each other:
PING 172.17.0.2 (172.17.0.2): 56 data bytes
64 bytes from 172.17.0.2: seq=0 ttl=64 time=0.115 ms
64 bytes from 172.17.0.2: seq=1 ttl=64 time=0.079 ms
64 bytes from 172.17.0.2: seq=2 ttl=64 time=0.051 ms
64 bytes from 172.17.0.2: seq=3 ttl=64 time=0.066 ms
64 bytes from 172.17.0.2: seq=4 ttl=64 time=0.051 ms
^C
--- 172.17.0.2 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 0.051/0.072/0.115 ms
(5) Port mapping
Now that container-to-container communication is clear, how does Docker port mapping work? First run docker run -d -p 8000:8000 centos:7 python -m SimpleHTTPServer to create the mapping, then inspect the NAT table with iptables -t nat -nvL:
[root@VM-16-16-centos ~]# iptables -t nat -nvL
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
203K 7590K DOCKER all -- * * 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 MASQUERADE all -- * !docker0 172.17.0.0/16 0.0.0.0/0
26 1680 MASQUERADE all -- * !br-e904b639a46d 172.18.0.0/16 0.0.0.0/0
0 0 MASQUERADE tcp -- * * 172.18.0.2 172.18.0.2 tcp dpt:6443
0 0 MASQUERADE tcp -- * * 172.17.0.5 172.17.0.5 tcp dpt:8000
Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 DOCKER all -- * * 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL
Chain DOCKER (2 references)
pkts bytes target prot opt in out source destination
0 0 RETURN all -- docker0 * 0.0.0.0/0 0.0.0.0/0
0 0 RETURN all -- br-e904b639a46d * 0.0.0.0/0 0.0.0.0/0
0 0 DNAT tcp -- !br-e904b639a46d * 0.0.0.0/0 0.0.0.0/0 tcp dpt:37721 to:172.18.0.2:6443
0 0 DNAT tcp -- !docker0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8000 to:172.17.0.5:8000
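The output shows that any TCP packet arriving on an interface other than docker0 (for example, on eth0) with destination port 8000 is rewritten (DNAT) to 172.17.0.5:8000. In other words, port mapping is implemented with iptables.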
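The matching logic of the DOCKER chain can be sketched in a few lines of Python. This is a simplified model of the rules printed above (two RETURN rules followed by two DNAT rules), not a reimplementation of iptables; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DnatRule:
    """One DNAT line from the DOCKER chain, reduced to the fields used here."""
    not_in_iface: str       # rule applies to packets NOT arriving on this interface
    dport: int              # TCP destination port to match
    to: Tuple[str, int]     # (container IP, container port) to rewrite to


# RETURN rules: traffic that enters from a Docker bridge leaves the chain untouched.
RETURN_IFACES = ("docker0", "br-e904b639a46d")

# The two DNAT rules from the output above, in order.
DNAT_RULES = (
    DnatRule("br-e904b639a46d", 37721, ("172.18.0.2", 6443)),
    DnatRule("docker0", 8000, ("172.17.0.5", 8000)),
)


def dnat(in_iface: str, dport: int) -> Optional[Tuple[str, int]]:
    """Return the rewritten destination, or None if the packet is left as is."""
    if in_iface in RETURN_IFACES:
        return None
    for rule in DNAT_RULES:
        if in_iface != rule.not_in_iface and dport == rule.dport:
            return rule.to
    return None


print(dnat("eth0", 8000))      # packet from outside, published port
print(dnat("docker0", 8000))   # packet from the bridge itself
```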
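(6) Docker network flow in bridge mode
- Container-to-container communication relies on three building blocks (network namespaces, the bridge, and veth pairs) to form a simple layer-2 network. Separate namespaces isolate the containers' networks and give each container its own IP, and the veth pairs attached to the docker0 bridge connect the containers to each other and to the host;
- Communication between a container and the outside world or the host through a mapped port relies on iptables: the packet is routed to docker0, the bridge finds the destination's MAC address by looking up its CAM table or, if the address is unknown, through an ARP broadcast, and the packet is then delivered through the veth pair attached to docker0 to the eth0 interface inside the container;
- The limitation of port mapping is that the same host port cannot be mapped more than once on one host;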
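The CAM table lookup mentioned in the list above can be illustrated with a toy learning bridge. This is a minimal sketch of generic layer-2 switch behavior, not Docker or kernel code; the port names veth1 to veth3 are hypothetical.

```python
class LearningBridge:
    """Toy layer-2 bridge: learns source MACs, then forwards or floods frames."""

    def __init__(self, ports):
        self.ports = list(ports)   # e.g. the veth ends attached to the bridge
        self.cam = {}              # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the list of ports it is sent out of."""
        self.cam[src_mac] = in_port                      # learn the sender
        if dst_mac in self.cam:
            out = self.cam[dst_mac]
            return [] if out == in_port else [out]       # known: single port
        return [p for p in self.ports if p != in_port]   # unknown: flood


bridge = LearningBridge(["veth1", "veth2", "veth3"])
# First frame: the destination is unknown, so it is flooded to all other ports.
print(bridge.receive("veth1", "02:42:ac:11:00:02", "02:42:ac:11:00:03"))
# The reply teaches the bridge where the second MAC lives.
print(bridge.receive("veth2", "02:42:ac:11:00:03", "02:42:ac:11:00:02"))
```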
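2. Overlay
Cluster communication is essential in cloud-native environments. Docker offers several options for it, including Macvlan for attaching to VLAN networks, and overlay networks. So what is an overlay? It is a network built on top of the physical network: a tunneling technique is used to construct a logical network layered over the physical one.
(1) How it works
VXLAN
Before going into how overlay works, it helps to understand VXLAN. VXLAN stands for Virtual eXtensible Local Area Network. It is essentially a tunneling technique: it encapsulates L2 Ethernet frames in L4 UDP datagrams and carries them across an L3 network. The effect is as if the L2 frames were travelling within a single broadcast domain; they actually cross an L3 network, but the endpoints are unaware of it. So what exactly happens when container B sends a request (a ping) to container A?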
[Figure: VXLAN packet flow from container B to container A, steps 1 to 8]
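- 1. Container B runs ping, and the traffic leaves through the veth interface attached to BridgeB. At this point BridgeB does not know where to send it (BridgeB has no mapping between container A's IP and a MAC address), so the ARP request is resolved through the VTEP. Once the MAC and IP are known, the actual packet is forwarded to the VTEP, addressed to the VTEP's MAC address
- 2. VTEP-B receives the packet and learns from the Swarm cluster's network information that the target IP belongs to container A
- 3. VTEP-B encapsulates the packet in VXLAN format (the VXLAN header carries the VXLAN ID, which records which overlay network the packet belongs to)
- 4. At the lower level, VTEP-B wraps the VXLAN packet in a UDP datagram and sends it out over host B's physical network
- 5-6. The packet travels through the tunnel (UDP port 4789) and reaches VTEP-A, which parses it and reads the VXLAN ID to decide which bridge it belongs to
- 7. VTEP-A decapsulates the packet: it extracts the payload from the UDP datagram, restores the original frame, and sends it to BridgeA
- 8. BridgeA receives the frame and delivers it to container A through the veth pair. The reply follows the same path in reverse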
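The encapsulation in step 3 and the decapsulation in steps 5 to 7 can be sketched by building the 8-byte VXLAN header by hand. The header layout follows RFC 7348 (8 flag bits, 24 reserved bits, a 24-bit VXLAN ID, 8 reserved bits); the function names are hypothetical, and the outer UDP and IP headers, which the kernel adds in practice, are left out.

```python
import struct

VXLAN_PORT = 4789        # UDP port used by the tunnel
FLAG_VNI_VALID = 0x08    # the "I" flag: the VXLAN ID field is valid


def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    """Prepend the 8-byte VXLAN header to an Ethernet frame."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits.
    # Word 2: VNI (24 bits) + 8 reserved bits.
    header = struct.pack("!II", FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decap(payload: bytes):
    """Split a UDP payload into (vni, inner_frame)."""
    if len(payload) < 8:
        raise ValueError("payload shorter than a VXLAN header")
    word1, word2 = struct.unpack("!II", payload[:8])
    if not (word1 >> 24) & FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return word2 >> 8, payload[8:]


packet = vxlan_encap(4097, b"inner ethernet frame")
print(packet[:8].hex())
print(vxlan_decap(packet))
```

The receiving VTEP uses the VXLAN ID returned by the decapsulation step to pick the bridge that the inner frame is handed to.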
(2) Creating an overlay network
Run docker swarm init, then create test-net (docker network create --subnet=10.1.1.0/24 --subnet=11.1.1.0/24 -d overlay test-net) and check that the network exists:
[root@VM-16-16-centos ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
839c78d16e66 bridge bridge local
d35cd7f611a6 docker_gwbridge bridge local
7865e8dc7489 host host local
kxda014niohv ingress overlay swarm
e904b639a46d k3d-k3d-private bridge local
e6e4904ea322 none null local
20miz5lia741 test-net overlay swarm
The last line shows that test-net was created successfully. Next, create a service with replicas set to 2 to see how the network behaves (docker service create --name test --network test-net --replicas 2 centos:7 sleep infinity). Since the cluster has two physical machines, check the containers and networks on each:
# First physical machine
[root@VM-16-16-centos ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
32e4ada62916 centos:7 "sleep infinity" 3 minutes ago Up 3 minutes test.2.5j5bm8m0g96enm3ltf7172rt4
[root@VM-16-16-centos ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
839c78d16e66 bridge bridge local
d35cd7f611a6 docker_gwbridge bridge local
7865e8dc7489 host host local
kxda014niohv ingress overlay swarm
e904b639a46d k3d-k3d-private bridge local
e6e4904ea322 none null local
20miz5lia741 test-net overlay swarm
# Second physical machine
[root@VM-0-11-centos ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a59a6f6dd333 centos@sha256:be65f488b7764ad3638f236b7b515b3678369a5124c47b8d32916d6487418ea4 "sleep infinity" 4 minutes ago Up 4 minutes test.1.braoj968z1jm5bc22e2k63he1
[root@VM-0-11-centos ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
d5d11ce155e2 bridge bridge local
f4c92d6c36ad docker_gwbridge bridge local
e6a370238ef2 host host local
828150052a2a mongodb_default bridge local
71347f42b9a6 none null local
20miz5lia741 test-net overlay swarm
(3) Inspecting the network and testing
After the service is created, inspect the network with docker network inspect test-net. Output:
# First physical machine
[
{
"Name": "test-net",
"Id": "20miz5lia7413mzkyhjokwu1h",
"Created": "2023-09-09T11:45:32.325811853+08:00",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "11.1.1.0/24",
"Gateway": "11.1.1.1"
},
{
"Subnet": "10.1.1.0/24",
"Gateway": "10.1.1.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"32e4ada62916b7d1070ae3c01f4306959ca5d093d60956f827210ca875e932d9": {
"Name": "test.2.5j5bm8m0g96enm3ltf7172rt4",
"EndpointID": "3f8071a94d60c6efc5d3505e73c65abe5a282d291362ea3e3986d4b78505a41f",
"MacAddress": "02:42:0a:01:01:07",
"IPv4Address": "10.1.1.7/24",
"IPv6Address": ""
},
"lb-test-net": {
"Name": "test-net-endpoint",
"EndpointID": "0a20f5b5b756b8b50d319fa86fe870f5064fef22fc4583f2779f540718d22e4e",
"MacAddress": "02:42:0b:01:01:0e",
"IPv4Address": "11.1.1.14/24",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "4097,4098"
},
"Labels": {},
"Peers": [
{
"Name": "VM-0-11-centos-7305e151739f",
"IP": "172.27.0.11"
},
{
"Name": "2bced4fe04a3",
"IP": "172.27.16.16"
}
]
}
]
# Second physical machine
[
{
"Name": "test-net",
"Id": "20miz5lia7413mzkyhjokwu1h",
"Created": "2023-09-09T11:39:30.639389025+08:00",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "11.1.1.0/24",
"Gateway": "11.1.1.1"
},
{
"Subnet": "10.1.1.0/24",
"Gateway": "10.1.1.1"
}
]
},
"Internal": false,
"Attachable": false,
"Containers": {
"a59a6f6dd3330a618898548b147c901c1fb9c38d86ac2308e8a89de52bf60825": {
"Name": "test.1.braoj968z1jm5bc22e2k63he1",
"EndpointID": "b46b9f436dd04aa0effddf1a11b093589c9583a8f42086323dbc0d5bea28083e",
"MacAddress": "02:42:0b:01:01:0c",
"IPv4Address": "11.1.1.12/24",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "4097,4098"
},
"Labels": {},
"Peers": [
{
"Name": "VM-0-11-centos-7305e151739f",
"IP": "172.27.0.11"
},
{
"Name": "2bced4fe04a3",
"IP": "172.27.16.16"
}
]
}
]
The two container addresses (the values under Containers) are 10.1.1.7 and 11.1.1.12. Log in to the container on the first machine (10.1.1.7) and run ping 11.1.1.12 -c 1, which succeeds. Then capture packets in the container on the second machine (11.1.1.12):
sh-4.2# tcpdump -i any
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
04:12:03.060571 IP test.2.5j5bm8m0g96enm3ltf7172rt4.test-net > a59a6f6dd333: ICMP echo request, id 10, seq 1, length 64
04:12:03.060600 IP a59a6f6dd333 > test.2.5j5bm8m0g96enm3ltf7172rt4.test-net: ICMP echo reply, id 10, seq 1, length 64
04:12:03.060966 IP localhost.48641 > 127.0.0.11.32849: UDP, length 39
04:12:03.061104 IP 127.0.0.11.domain > localhost.48641: 1204 1/0/0 PTR test.2.5j5bm8m0g96enm3ltf7172rt4.test-net. (115)
04:12:03.061219 IP localhost.60318 > 127.0.0.11.32849: UDP, length 41
04:12:03.061371 IP a59a6f6dd333.47196 > 183.60.82.98.domain: 32335+ PTR? 11.0.0.127.in-addr.arpa. (41)
04:12:03.061647 IP 183.60.82.98.domain > a59a6f6dd333.47196: 32335 NXDomain 0/1/0 (100)
04:12:03.061712 IP 127.0.0.11.domain > localhost.60318: 32335 NXDomain 0/1/0 (100)
04:12:03.062483 IP localhost.55382 > 127.0.0.11.32849: UDP, length 43
04:12:03.062616 IP a59a6f6dd333.60943 > 183.60.82.98.domain: 27860+ PTR? 98.82.60.183.in-addr.arpa. (43)
04:12:03.062783 IP 183.60.82.98.domain > a59a6f6dd333.60943: 27860 NXDomain 0/1/0 (107)
04:12:03.062830 IP 127.0.0.11.domain > localhost.55382: 27860 NXDomain 0/1/0 (107)
04:12:03.062996 IP localhost.35418 > 127.0.0.11.32849: UDP, length 41
04:12:03.063132 IP a59a6f6dd333.44914 > 183.60.82.98.domain: 62145+ PTR? 2.0.19.172.in-addr.arpa. (41)
04:12:03.063304 IP 183.60.82.98.domain > a59a6f6dd333.44914: 62145 NXDomain 0/1/0 (100)
The capture shows that test.2.5j5bm8m0g96enm3ltf7172rt4.test-net sent an ICMP echo request to this container and received a reply.
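The two containers sit in different subnets of the same overlay network, which can be cross-checked with Python's ipaddress module. The subnets and addresses below are taken from the inspect output in this section; the helper names are hypothetical.

```python
import ipaddress

# Subnets configured for test-net.
SUBNETS = [
    ipaddress.ip_network("10.1.1.0/24"),
    ipaddress.ip_network("11.1.1.0/24"),
]


def subnet_of(ip: str):
    """Return the test-net subnet that contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    return next((net for net in SUBNETS if addr in net), None)


def first_host(net) -> str:
    """First usable address of a subnet, which the inspect output lists as the gateway."""
    return str(next(net.hosts()))


for ip in ("10.1.1.7", "11.1.1.12"):
    net = subnet_of(ip)
    print(ip, "is in", net, "with gateway", first_host(net))
```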
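(4) Verifying that data travels through the VXLAN tunnel
To verify the principle described in section (1), that traffic is carried through a VXLAN tunnel, capture packets on the host. First start python -m SimpleHTTPServer in the 11.1.1.12 container, then send a request from 10.1.1.7 with curl '11.1.1.12:8000'. At the same time, capture UDP port 4789 on host 2, where the 11.1.1.12 container runs, with tcpdump -i any port 4789. Output: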
[root@VM-0-11-centos ~]# tcpdump -i any port 4789
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
12:33:43.574034 IP 172.27.16.16.43908 > VM-0-11-centos.4789: VXLAN, flags [I] (0x08), vni 4097
IP 10.1.1.7.34786 > 11.1.1.12.irdmi: Flags [S], seq 2425839577, win 28200, options [mss 1410,sackOK,TS val 2424960080 ecr 0,nop,wscale 7], length 0
12:33:43.574142 IP VM-0-11-centos.49343 > 172.27.16.16.4789: VXLAN, flags [I] (0x08), vni 4098
IP 11.1.1.12.irdmi > 10.1.1.7.34786: Flags [S.], seq 841191230, ack 2425839578, win 27960, options [mss 1410,sackOK,TS val 3343949171 ecr 2424960080,nop,wscale 7], length 0
12:33:43.575033 IP 172.27.16.16.43908 > VM-0-11-centos.4789: VXLAN, flags [I] (0x08), vni 4097
IP 10.1.1.7.34786 > 11.1.1.12.irdmi: Flags [.], ack 1, win 221, options [nop,nop,TS val 2424960081 ecr 3343949171], length 0
12:33:43.575064 IP 172.27.16.16.43908 > VM-0-11-centos.4789: VXLAN, flags [I] (0x08), vni 4097
IP 10.1.1.7.34786 > 11.1.1.12.irdmi: Flags [P.], seq 1:79, ack 1, win 221, options [nop,nop,TS val 2424960081 ecr 3343949171], length 78
12:33:43.575084 IP VM-0-11-centos.49199 > 172.27.16.16.4789: VXLAN, flags [I] (0x08), vni 4098
IP 11.1.1.12.irdmi > 10.1.1.7.34786: Flags [.], ack 79, win 219, options [nop,nop,TS val 3343949172 ecr 2424960081], length 0
12:33:43.575732 IP VM-0-11-centos.49199 > 172.27.16.16.4789: VXLAN, flags [I] (0x08), vni 4098
IP 11.1.1.12.irdmi > 10.1.1.7.34786: Flags [P.], seq 1:18, ack 79, win 219, options [nop,nop,TS val 3343949173 ecr 2424960081], length 17
12:33:43.575822 IP VM-0-11-centos.49199 > 172.27.16.16.4789: VXLAN, flags [I] (0x08), vni 4098
IP 11.1.1.12.irdmi > 10.1.1.7.34786: Flags [FP.], seq 18:956, ack 79, win 219, options [nop,nop,TS val 3343949173 ecr 2424960081], length 938
12:33:43.576483 IP 172.27.16.16.43908 > VM-0-11-centos.4789: VXLAN, flags [I] (0x08), vni 4097
IP 10.1.1.7.34786 > 11.1.1.12.irdmi: Flags [.], ack 18, win 221, options [nop,nop,TS val 2424960083 ecr 3343949173], length 0
12:33:43.576555 IP 172.27.16.16.43908 > VM-0-11-centos.4789: VXLAN, flags [I] (0x08), vni 4097
IP 10.1.1.7.34786 > 11.1.1.12.irdmi: Flags [.], ack 957, win 235, options [nop,nop,TS val 2424960083 ecr 3343949173], length 0
12:33:43.576629 IP 172.27.16.16.43908 > VM-0-11-centos.4789: VXLAN, flags [I] (0x08), vni 4097
IP 10.1.1.7.34786 > 11.1.1.12.irdmi: Flags [F.], seq 79, ack 957, win 235, options [nop,nop,TS val 2424960083 ecr 3343949173], length 0
12:33:43.576645 IP VM-0-11-centos.49343 > 172.27.16.16.4789: VXLAN, flags [I] (0x08), vni 4098
IP 11.1.1.12.irdmi > 10.1.1.7.34786: Flags [.], ack 80, win 219, options [nop,nop,TS val 3343949174 ecr 2424960083], length 0
The capture confirms that the traffic is carried over UDP port 4789 using VXLAN.
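To summarize a capture like the one above, the outer VXLAN lines can be parsed with a regular expression. This is a small sketch that assumes the tcpdump line format shown in this article; the pattern and function name are hypothetical, and the two sample lines are copied from the capture.

```python
import re

# Matches the outer (tunnel) line that tcpdump prints for a VXLAN packet.
VXLAN_LINE = re.compile(
    r"IP (\S+)\.(\d+) > (\S+)\.(\d+): VXLAN, flags \[\w*\] \(0x[0-9a-f]+\), vni (\d+)"
)


def parse_outer(line: str):
    """Return src, dst, destination port and VNI of a VXLAN line, else None."""
    match = VXLAN_LINE.search(line)
    if match is None:
        return None
    src, _sport, dst, dport, vni = match.groups()
    return {"src": src, "dst": dst, "dport": int(dport), "vni": int(vni)}


SAMPLE = [
    "12:33:43.574034 IP 172.27.16.16.43908 > VM-0-11-centos.4789: VXLAN, flags [I] (0x08), vni 4097",
    "12:33:43.574142 IP VM-0-11-centos.49343 > 172.27.16.16.4789: VXLAN, flags [I] (0x08), vni 4098",
]

for line in SAMPLE:
    print(parse_outer(line))
```

In this capture the request direction carries vni 4097 and the reply direction carries vni 4098, and both directions use destination port 4789.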
Part 3: Service Discovery and Ingress
1. Service discovery
Docker supports custom DNS configuration for service discovery. Run docker run -it --name test1 --dns=8.8.8.8 --dns-search=dockercerts.com centos:7 sh. Output:
[root@VM-16-16-centos ~]# docker run -it --name test1 --dns=8.8.8.8 --dns-search=dockercerts.com centos:7 sh
sh-4.2# cat /etc/resolv.conf
search dockercerts.com
nameserver 8.8.8.8
As the output shows, configuring DNS simply changes /etc/resolv.conf inside the container.
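The effect of the two flags can be checked by reading the file back. The sketch below parses resolv.conf-style text into its nameserver and search entries; the function name is hypothetical, and only the two directives shown above are handled.

```python
def parse_resolv_conf(text: str) -> dict:
    """Collect nameserver and search entries from resolv.conf-style text."""
    config = {"nameserver": [], "search": []}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        key, *values = line.split()
        if key == "nameserver" and values:
            config["nameserver"].append(values[0])
        elif key == "search":
            config["search"] = values         # a later search line replaces an earlier one
    return config


# Content of /etc/resolv.conf from the container above.
sample = "search dockercerts.com\nnameserver 8.8.8.8\n"
print(parse_resolv_conf(sample))
```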
2. Ingress
For clusters, Docker Swarm provides an ingress mode similar to the one in K8S: a service can be reached through any host node in the Swarm cluster. For example, run docker service create --name test --replicas 2 -p 5000:80 nginx. Port 5000 then shows up on each host in the Swarm cluster:
[root@VM-16-16-centos ~]# netstat -tunpl
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1042/sshd
tcp 0 0 0.0.0.0:37721 0.0.0.0:* LISTEN 2700/docker-proxy
tcp 0 0 0.0.0.0:8000 0.0.0.0:* LISTEN 1397687/docker-prox
tcp6 0 0 :::5000 :::* LISTEN 2380/dockerd
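Under the hood this is implemented by a layer-4 routing mesh (Service Mesh). The principle is similar to Docker's own port mapping and can be examined with iptables -nvL. The figure below illustrates how the load balancing works.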
[Figure: Swarm ingress load balancing]
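The idea that any node accepts the published port and hands the request to one of the service's replicas can be modeled in a few lines. This is a toy sketch that assumes simple round-robin selection; the real routing mesh works in the kernel, and the class, method, and task names here are hypothetical.

```python
from itertools import cycle


class RoutingMesh:
    """Toy model: every node accepts the published port and picks a task in turn."""

    def __init__(self, nodes, tasks, published_port=5000):
        self.nodes = set(nodes)
        self.published_port = published_port
        self._next_task = cycle(tasks)   # assumed round-robin over the replicas

    def handle(self, node: str, port: int) -> str:
        """Return the task that serves a request arriving at node:port."""
        if node not in self.nodes:
            raise ValueError(f"{node} is not part of the swarm")
        if port != self.published_port:
            raise ValueError(f"port {port} is not published")
        return next(self._next_task)


mesh = RoutingMesh(["VM-16-16-centos", "VM-0-11-centos"], ["test.1", "test.2"])
# The entry node does not determine which replica answers.
print(mesh.handle("VM-16-16-centos", 5000))
print(mesh.handle("VM-16-16-centos", 5000))
```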