VXLAN+EVPN on Cumulus VX (original: 2017/03/22)
This article is a copy of a post I wrote elsewhere on 2017/03/22, so as of 2017/05/11 some of the information in it is slightly out of date. (GNS3 v2.0.0 stable and Cumulus Linux v3.3 were released in 2017/05.)
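- Introduction
- Setup
- Verification
- Wrap-up
Introduction
What this article covers
I will do the following.
- The Early Access release of Cumulus Linux (as of 2017/03/21) allows a limited trial of the VXLAN+EVPN feature, so check whether it also works on the virtual edition, Cumulus VX.
  - Once the feature is officially implemented, the configuration method and behavior will probably change.
- The EA release currently available only covers the Quagga daemon, so EVPN-related configuration and show commands are done in Quagga.
- EVPN Multihoming is not implemented; instead there appears to be a mechanism for making VTEPs redundant with MLAG, so look at its configuration and behavior.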
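Environment
Cumulus VX
The KVM build of the latest version available for download from Cumulus official site / Download Cumulus VX as of 2017/03/13 (Cumulus VX 3.2.1).
After creating an account, I was able to download it as an individual without any trouble.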
kotetsu@kvm01:~/vm_images/qemu$ ls -al cumulus-linux-3.2.1-vx-amd64-1486153138.ac46c24zd00d13e.qcow2
-rw-r--r-- 1 kotetsu kotetsu 1232601088 Mar 7 22:11 cumulus-linux-3.2.1-vx-amd64-1486153138.ac46c24zd00d13e.qcow2
kotetsu@kvm01:~/vm_images/qemu$ sha1sum cumulus-linux-3.2.1-vx-amd64-1486153138.ac46c24zd00d13e.qcow2
3d782f2c450683b4da5ea2324c88f3dccb89b6c2 cumulus-linux-3.2.1-vx-amd64-1486153138.ac46c24zd00d13e.qcow2

kotetsu@bb03:~$ cat /etc/lsb-release
DISTRIB_ID="Cumulus Linux"
DISTRIB_RELEASE=3.2.1
DISTRIB_DESCRIPTION="Cumulus Linux 3.2.1"
kotetsu@bb03:~$ uname -a
Linux bb03 4.1.0-cl-4-amd64 #1 SMP Debian 4.1.33-1+cl3u7 (2017-01-26) x86_64 GNU/Linux
Other
$ uname -a
Linux kvm01 4.4.0-57-generic #78-Ubuntu SMP Fri Dec 9 23:50:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.1 LTS"
$ virsh -v
1.3.1
$ qemu-system-x86_64 --version
QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.6), Copyright (c) 2003-2008 Fabrice Bellard
$ gns3 --version
1.5.2
References
- Cumulus official site / Using GNS3 with QEMU and KVM Virtual Machines
  - Following this should get you as far as running Cumulus VX on KVM + GNS3 without trouble
- Cumulus official site / Download Cumulus VX
  - Image download link
- Cumulus official site / Cumulus Linux 3.2.1 Release Notes
  - EVPN is listed under Early Access Features
- Cumulus official site / Ethernet Virtual Private Network - EVPN
  - EVPN requires Cumulus Linux version 3.2.1 or newer.
  - The current manual for everything EVPN-related
  - The example environment there uses BGP unnumbered
- Cumulus official site / EVPN for controller-less VXLAN
  - White paper
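Setup
I will build the following environment.
Deploying with GNS3
Deploy the nodes roughly as follows. (The shaded part is an existing environment used for interconnection experiments, so ignore it.)
For Cumulus VX, clicking through the steps in the official doc below is all it takes.
In my environment, the following resources were sufficient.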
kotetsu@kvm01:~$ ps aux | grep [C]umulus
root 28241 2.5 1.3 1417576 445408 pts/12 Sl+ 20:32 0:20 /usr/bin/qemu-system-x86_64 -name CumulusVX_bb03 -m 512M -smp cpus=1 -enable-kvm -boot order=c -drive file=/home/kotetsu/GNS3/projects/vqfx/project-files/qemu/25f56fdc-48e7-4622-be73-bf98d5686e4e/hda_disk.qcow2,if=ide,index=0,media=disk -serial telnet:127.0.0.1:5018,server,nowait -monitor tcp:127.0.0.1:37529,server,nowait -net none -device virtio-net-pci,mac=00:37:c4:6e:4e:00,netdev=gns3-0 -netdev socket,id=gns3-0,udp=127.0.0.1:10102,localaddr=127.0.0.1:10103 -device virtio-net-pci,mac=00:37:c4:6e:4e:01,netdev=gns3-1 -netdev socket,id=gns3-1,udp=127.0.0.1:10125,localaddr=127.0.0.1:10124 -device virtio-net-pci,mac=00:37:c4:6e:4e:02,netdev=gns3-2 -netdev socket,id=gns3-2,udp=127.0.0.1:10129,localaddr=127.0.0.1:10128 -device virtio-net-pci,mac=00:37:c4:6e:4e:03,netdev=gns3-3 -netdev socket,id=gns3-3,udp=127.0.0.1:10133,localaddr=127.0.0.1:10132 -device virtio-net-pci,mac=00:37:c4:6e:4e:04,netdev=gns3-4 -netdev socket,id=gns3-4,udp=127.0.0.1:10137,localaddr=127.0.0.1:10136 -device virtio-net-pci,mac=00:37:c4:6e:4e:05
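Peripheral device configuration
torSW[34]01a (Open vSwitch) configuration
As for installing Open vSwitch, please follow the official documentation as appropriate. (Sloppy, I know.)
Settings like the following are enough. I use Open vSwitch here, but anything that can run LACP and VLANs will do in this position, so pick whatever you find easiest to work with. (Cumulus VX is fine too, of course.)
torSW[34]01a
Common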
# ovs-vsctl --no-wait init
# ovs-vsctl add-br br0
# ovs-vsctl set bridge br0 datapath_type=netdev
# ovs-vsctl add-bond br0 bond0 ens4 ens5 lacp=active bond_mode=balance-slb other_config:lacp-time=fast
# ovs-vsctl add-port br0 ens6 tag=100
# ovs-vsctl add-port br0 ens7 tag=200
# ip link set dev br0 up
# ip link set dev ens4 up
# ip link set dev ens5 up
# ip link set dev ens6 up
# ip link set dev ens7 up
node[34]1 configuration (for connectivity checks)
Anything that can communicate will do. (Sloppy again.)
kotetsu@node31:~$ ip a show dev ens4
3: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:37:c4:55:09:01 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.3/24 brd 192.168.1.255 scope global ens4
       valid_lft forever preferred_lft forever
    inet6 fe80::237:c4ff:fe55:901/64 scope link
       valid_lft forever preferred_lft forever

kotetsu@node41:~$ ip a show dev ens4
3: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:37:c4:56:b4:01 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.4/24 brd 192.168.1.255 scope global ens4
       valid_lft forever preferred_lft forever
    inet6 fe80::237:c4ff:fe56:b401/64 scope link
       valid_lft forever preferred_lft forever
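Cumulus VX initial configuration
As described in Cumulus official site / Using Cumulus VX with KVM, log in with user cumulus and password CumulusLinux!.
After that, set up the hostname, an operations user with a registered SSH key, syslog, timezone, NTP and so on to suit your environment, consulting the relevant documentation as you go.
If you want the added user to run the various net commands, edit the permitted user and group settings in /etc/netd.conf and apply the change (Cumulus official site / Network Command Line Utility / Adding More NCLU Users or Groups).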
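Cumulus VX physical interface / BGP configuration
I will build something like the following.
Physical interfaces
Referring to Cumulus official site / Interface Configuration and Management, first configure the physical interfaces needed for the BGP topology.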
- bb03
net add interface swp1 alias DEV=spine31 IF=swp1
net add interface swp1 mtu 9216
net add interface swp1 ip address 192.0.2.8/31
net add interface swp2 alias DEV=spine32 IF=swp1
net add interface swp2 mtu 9216
net add interface swp2 ip address 192.0.2.10/31
net add interface swp3 alias DEV=spine41 IF=swp1
net add interface swp3 mtu 9216
net add interface swp3 ip address 192.0.2.12/31
net add interface swp4 alias DEV=spine42 IF=swp1
net add interface swp4 mtu 9216
net add interface swp4 ip address 192.0.2.14/31
net commit

kotetsu@bb03:~$ net show interface all
       Name                        Speed    MTU    Mode           Summary
-----  --------------------------  -------  -----  -------------  ------------------------
UP     lo                          N/A      65536  Loopback       IP: 127.0.0.1/8, ::1/128
UP     eth0                        1G       1500   Mgmt           IP: 10.0.0.193/24
UP     swp1 (DEV=spine31 IF=swp1)  1G       9216   Interface/L3   IP: 192.0.2.8/31
UP     swp2 (DEV=spine32 IF=swp1)  1G       9216   Interface/L3   IP: 192.0.2.10/31
UP     swp3 (DEV=spine41 IF=swp1)  1G       9216   Interface/L3   IP: 192.0.2.12/31
UP     swp4 (DEV=spine42 IF=swp1)  1G       9216   Interface/L3   IP: 192.0.2.14/31
ADMDN  swp5                        0M       1500   NotConfigured

kotetsu@bb03:~$ cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*.intf

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0
    address 10.0.0.193/24
    gateway 10.0.0.254

auto swp1
iface swp1
    address 192.0.2.8/31
    alias DEV=spine31 IF=swp1
    mtu 9216

auto swp2
iface swp2
    address 192.0.2.10/31
    alias DEV=spine32 IF=swp1
    mtu 9216

auto swp3
iface swp3
    address 192.0.2.12/31
    alias DEV=spine41 IF=swp1
    mtu 9216

auto swp4
iface swp4
    address 192.0.2.14/31
    alias DEV=spine42 IF=swp1
    mtu 9216
- bb04
net add interface swp1 alias DEV=spine31 IF=swp2
net add interface swp1 mtu 9216
net add interface swp1 ip address 192.0.2.136/31
net add interface swp2 alias DEV=spine32 IF=swp2
net add interface swp2 mtu 9216
net add interface swp2 ip address 192.0.2.138/31
net add interface swp3 alias DEV=spine41 IF=swp2
net add interface swp3 mtu 9216
net add interface swp3 ip address 192.0.2.140/31
net add interface swp4 alias DEV=spine42 IF=swp2
net add interface swp4 mtu 9216
net add interface swp4 ip address 192.0.2.142/31
net commit
- spine31
net add interface swp1 alias DEV=bb03 IF=swp1
net add interface swp1 mtu 9216
net add interface swp1 ip address 192.0.2.9/31
net add interface swp2 alias DEV=bb04 IF=swp1
net add interface swp2 mtu 9216
net add interface swp2 ip address 192.0.2.137/31
net commit
- spine32
net add interface swp1 alias DEV=bb03 IF=swp2
net add interface swp1 mtu 9216
net add interface swp1 ip address 192.0.2.11/31
net add interface swp2 alias DEV=bb04 IF=swp2
net add interface swp2 mtu 9216
net add interface swp2 ip address 192.0.2.139/31
net commit
- spine41
net add interface swp1 alias DEV=bb03 IF=swp3
net add interface swp1 mtu 9216
net add interface swp1 ip address 192.0.2.13/31
net add interface swp2 alias DEV=bb04 IF=swp3
net add interface swp2 mtu 9216
net add interface swp2 ip address 192.0.2.141/31
net commit
- spine42
net add interface swp1 alias DEV=bb03 IF=swp4
net add interface swp1 mtu 9216
net add interface swp1 ip address 192.0.2.15/31
net add interface swp2 alias DEV=bb04 IF=swp4
net add interface swp2 mtu 9216
net add interface swp2 ip address 192.0.2.143/31
net commit
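Every routed link in this section uses the same three commands (alias, MTU, /31 address), so the listings can be generated instead of typed. The following is only a sketch in POSIX shell: the function name gen_link_cmds is made up for illustration, and it prints the commands without running net itself.

```shell
#!/bin/sh
# Print the three NCLU commands used in this section for one routed link.
# Arguments: local port, peer device, peer port, local address with /31.
gen_link_cmds() {
    port=$1; peer=$2; peer_if=$3; addr=$4
    printf 'net add interface %s alias DEV=%s IF=%s\n' "$port" "$peer" "$peer_if"
    printf 'net add interface %s mtu 9216\n' "$port"
    printf 'net add interface %s ip address %s\n' "$port" "$addr"
}

# spine31: swp1 faces bb03, swp2 faces bb04 (values taken from the listing above)
gen_link_cmds swp1 bb03 swp1 192.0.2.9/31
gen_link_cmds swp2 bb04 swp1 192.0.2.137/31
echo 'net commit'
```

Review the printed commands before pasting them into the switch; nothing is applied until net commit.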
Installing the Early Access Quagga
The default state is as shown below, so install the Early Access Quagga following Cumulus official site / Ethernet Virtual Private Network - EVPN / Installing the EVPN Package.
kotetsu@bb03:~$ dpkg -l quagga
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                    Version          Architecture     Description
+++-=======================-================-================-====================================================
ii  quagga                  1.0.0+cl3u7      amd64            BGP/OSPF/RIP routing daemon

kotetsu@bb03:~$ grep -E "CumulusLinux-3-early-access" /etc/apt/sources.list
#deb http://repo3.cumulusnetworks.com/repo CumulusLinux-3-early-access cumulus
#deb-src http://repo3.cumulusnetworks.com/repo CumulusLinux-3-early-access cumulus

kotetsu@bb03:~$ sudo sed -i -e '/CumulusLinux-3-early-access/ s/^#//g' /etc/apt/sources.list

kotetsu@bb03:~$ sudo apt update
kotetsu@bb03:~$ sudo apt install -y cumulus-evpn
kotetsu@bb03:~$ sudo apt upgrade

kotetsu@bb03:~$ dpkg -l quagga
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                    Version          Architecture     Description
+++-=======================-================-================-====================================================
ii  quagga                  1.0.0+cl3eau8    amd64            BGP/OSPF/RIP routing daemon
Quagga startup configuration
The default state is as shown below, so configure startup on all nodes, referring to Cumulus official site / Configuring Cumulus Quagga.
kotetsu@bb03:~$ grep -Ev "^#" /etc/quagga/daemons
zebra=no
bgpd=no
ospfd=no
ospf6d=no
ripd=no
ripngd=no
isisd=n
In the daemon startup settings, change zebra and bgpd to yes:
kotetsu@bb03:~$ sudo sed -r -i -e 's/(zebra|bgpd)=no/\1=yes/g' /etc/quagga/daemons
Then enable automatic startup and start the service:
kotetsu@bb03:~$ sudo systemctl enable quagga.service
kotetsu@bb03:~$ sudo systemctl start quagga.service

kotetsu@bb03:~$ sudo systemctl status quagga.service
...
   Active: active (running) since Mon 2017-03-20 10:52:25 JST; 4s ago
Mar 20 10:52:24 spine41 quagga[30608]: Starting Quagga daemons (prio:10):. zebra. bgpd.
Mar 20 10:52:24 spine41 bgpd[30631]: BGPd 1.0.0+cl3eau8 starting: vty@2605, bgp@<all>:179
Mar 20 10:52:24 spine41 zebra[30624]: client 12 says hello and bids fair to announce only bgp routes
Mar 20 10:52:24 spine41 watchquagga[30638]: watchquagga 1.0.0+cl3eau8 watching [zebra bgpd], mode [phased zebra restart]
Mar 20 10:52:24 spine41 watchquagga[30638]: bgpd state -> up : connect succeeded
Mar 20 10:52:25 spine41 watchquagga[30638]: zebra state -> up : connect succeeded
Mar 20 10:52:25 spine41 watchquagga[30638]: Watchquagga: Notifying Systemd we are up and running
Mar 20 10:52:25 spine41 quagga[30608]: Starting Quagga monitor daemon: watchquagga.
Mar 20 10:52:25 spine41 quagga[30608]: Exiting from the script
Mar 20 10:52:25 spine41 systemd[1]: Started Cumulus Linux Quagga.
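The sed substitution used in this section can be tried on a scratch copy before touching /etc/quagga/daemons. A small sketch: the helper name enable_daemons and the scratch file are my own additions, and -E is used as the portable spelling of the -r flag shown above.

```shell
#!/bin/sh
# Apply the zebra/bgpd substitution to a file and print the result to
# stdout (no -i, so the input file itself is left unchanged).
enable_daemons() {
    sed -E -e 's/(zebra|bgpd)=no/\1=yes/g' "$1"
}

# Scratch copy that mimics the default daemons file shown above.
tmp=$(mktemp)
printf '%s\n' zebra=no bgpd=no ospfd=no ospf6d=no ripd=no ripngd=no > "$tmp"

enable_daemons "$tmp"
rm -f "$tmp"
```

Only the zebra and bgpd lines change; every other daemon stays at no.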
eBGP configuration
bb03
net add loopback lo ip address 172.31.0.3/32
net add bgp autonomous-system 65000
net add bgp router-id 172.31.0.3
net add routing prefix-list ipv4 PL_LO_CLOS seq 10 permit 172.16.0.0/12 ge 32 le 32
net add routing prefix-list ipv4 PL_LO_CLOS seq 20 permit 192.0.2.0/24 ge 31 le 31
net add bgp redistribute connected
net add bgp neighbor PEER_SPINE peer-group
net add bgp neighbor PEER_SPINE prefix-list PL_LO_CLOS out
net add bgp neighbor PEER_SPINE next-hop-self
net add bgp neighbor 192.0.2.9 remote-as 65003
net add bgp neighbor 192.0.2.9 description spine31
net add bgp neighbor 192.0.2.9 peer-group PEER_SPINE
net add bgp neighbor 192.0.2.11 remote-as 65003
net add bgp neighbor 192.0.2.11 description spine32
net add bgp neighbor 192.0.2.11 peer-group PEER_SPINE
net add bgp neighbor 192.0.2.13 remote-as 65004
net add bgp neighbor 192.0.2.13 description spine41
net add bgp neighbor 192.0.2.13 peer-group PEER_SPINE
net add bgp neighbor 192.0.2.15 remote-as 65004
net add bgp neighbor 192.0.2.15 description spine42
net add bgp neighbor 192.0.2.15 peer-group PEER_SPINE
bb04
net add loopback lo ip address 172.31.0.4/32
net add bgp autonomous-system 65000
net add bgp router-id 172.31.0.4
net add routing prefix-list ipv4 PL_LO_CLOS seq 10 permit 172.16.0.0/12 ge 32 le 32
net add routing prefix-list ipv4 PL_LO_CLOS seq 20 permit 192.0.2.0/24 ge 31 le 31
net add bgp redistribute connected
net add bgp neighbor PEER_SPINE peer-group
net add bgp neighbor PEER_SPINE prefix-list PL_LO_CLOS out
net add bgp neighbor 192.0.2.137 remote-as 65003
net add bgp neighbor 192.0.2.137 description spine31
net add bgp neighbor 192.0.2.137 peer-group PEER_SPINE
net add bgp neighbor 192.0.2.139 remote-as 65003
net add bgp neighbor 192.0.2.139 description spine32
net add bgp neighbor 192.0.2.139 peer-group PEER_SPINE
net add bgp neighbor 192.0.2.141 remote-as 65004
net add bgp neighbor 192.0.2.141 description spine41
net add bgp neighbor 192.0.2.141 peer-group PEER_SPINE
net add bgp neighbor 192.0.2.143 remote-as 65004
net add bgp neighbor 192.0.2.143 description spine42
net add bgp neighbor 192.0.2.143 peer-group PEER_SPINE
spine31
net add loopback lo ip address 172.16.3.1/32
net add bgp autonomous-system 65003
net add bgp router-id 172.16.3.1
net add routing prefix-list ipv4 PL_LO_CLOS seq 10 permit 172.16.0.0/12 ge 32 le 32
net add routing prefix-list ipv4 PL_LO_CLOS seq 20 permit 192.0.2.0/24 ge 31 le 31
net add bgp redistribute connected
net add bgp neighbor PEER_BB peer-group
net add bgp neighbor PEER_BB prefix-list PL_LO_CLOS out
net add bgp neighbor 192.0.2.8 remote-as 65000
net add bgp neighbor 192.0.2.8 description bb03
net add bgp neighbor 192.0.2.8 peer-group PEER_BB
net add bgp neighbor 192.0.2.136 remote-as 65000
net add bgp neighbor 192.0.2.136 description bb04
net add bgp neighbor 192.0.2.136 peer-group PEER_BB
spine32
net add loopback lo ip address 172.16.3.2/32
net add bgp autonomous-system 65003
net add bgp router-id 172.16.3.2
net add routing prefix-list ipv4 PL_LO_CLOS seq 10 permit 172.16.0.0/12 ge 32 le 32
net add routing prefix-list ipv4 PL_LO_CLOS seq 20 permit 192.0.2.0/24 ge 31 le 31
net add bgp redistribute connected
net add bgp neighbor PEER_BB peer-group
net add bgp neighbor PEER_BB prefix-list PL_LO_CLOS out
net add bgp neighbor 192.0.2.10 remote-as 65000
net add bgp neighbor 192.0.2.10 description bb03
net add bgp neighbor 192.0.2.10 peer-group PEER_BB
net add bgp neighbor 192.0.2.138 remote-as 65000
net add bgp neighbor 192.0.2.138 description bb04
net add bgp neighbor 192.0.2.138 peer-group PEER_BB
spine41
net add loopback lo ip address 172.16.4.1/32
net add bgp autonomous-system 65004
net add bgp router-id 172.16.4.1
net add routing prefix-list ipv4 PL_LO_CLOS seq 10 permit 172.16.0.0/12 ge 32 le 32
net add routing prefix-list ipv4 PL_LO_CLOS seq 20 permit 192.0.2.0/24 ge 31 le 31
net add bgp redistribute connected
net add bgp neighbor PEER_BB peer-group
net add bgp neighbor PEER_BB prefix-list PL_LO_CLOS out
net add bgp neighbor 192.0.2.12 remote-as 65000
net add bgp neighbor 192.0.2.12 description bb03
net add bgp neighbor 192.0.2.12 peer-group PEER_BB
net add bgp neighbor 192.0.2.140 remote-as 65000
net add bgp neighbor 192.0.2.140 description bb04
net add bgp neighbor 192.0.2.140 peer-group PEER_BB
spine42
net add loopback lo ip address 172.16.4.2/32
net add bgp autonomous-system 65004
net add bgp router-id 172.16.4.2
net add routing prefix-list ipv4 PL_LO_CLOS seq 10 permit 172.16.0.0/12 ge 32 le 32
net add routing prefix-list ipv4 PL_LO_CLOS seq 20 permit 192.0.2.0/24 ge 31 le 31
net add bgp redistribute connected
net add bgp neighbor PEER_BB peer-group
net add bgp neighbor PEER_BB prefix-list PL_LO_CLOS out
net add bgp neighbor 192.0.2.14 remote-as 65000
net add bgp neighbor 192.0.2.14 description bb03
net add bgp neighbor 192.0.2.14 peer-group PEER_BB
net add bgp neighbor 192.0.2.142 remote-as 65000
net add bgp neighbor 192.0.2.142 description bb04
net add bgp neighbor 192.0.2.142 peer-group PEER_BB
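On a /31 link the neighbor address is simply the local address with its lowest bit flipped (192.0.2.8 and 192.0.2.9 form one pair), so the neighbor statements can be derived from the interface addresses configured earlier. A sketch in POSIX shell follows; peer_of_31 and gen_neighbor_cmds are hypothetical helper names, and the script only prints commands.

```shell
#!/bin/sh
# Return the other end of a /31 by flipping the lowest bit of the last octet.
peer_of_31() {
    prefix=${1%.*}
    last=${1##*.}
    printf '%s.%s\n' "$prefix" "$(( last ^ 1 ))"
}

# Print the three per-neighbor commands used in the listings above.
gen_neighbor_cmds() {
    nbr=$1; remote_as=$2; descr=$3; group=$4
    printf 'net add bgp neighbor %s remote-as %s\n' "$nbr" "$remote_as"
    printf 'net add bgp neighbor %s description %s\n' "$nbr" "$descr"
    printf 'net add bgp neighbor %s peer-group %s\n' "$nbr" "$group"
}

# bb03: local /31 address, remote AS and peer name for each uplink
while read -r local_ip remote_as descr; do
    gen_neighbor_cmds "$(peer_of_31 "$local_ip")" "$remote_as" "$descr" PEER_SPINE
done <<'EOF'
192.0.2.8 65003 spine31
192.0.2.10 65003 spine32
192.0.2.12 65004 spine41
192.0.2.14 65004 spine42
EOF
```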
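By the way... when I casually pressed tab while about to enter a neighbor setting, it displayed the mapping between the physical interfaces and the neighboring devices learned via LLDP... impressive...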
kotetsu@bb03:~$ net add bgp neighbor
    <bgppeer>          :  BGP neighbor or peer-group
    <interface>        :  An interface name "swp1" or glob "swp1-4,6,10-12"
    <ip>               :  An IPv4 or IPv6 Address
    <text-peer-group>  :  A BGP peer-group name
    eth0               :  LLDP peer spine41
    lo                 :  interface
    swp1               :  LLDP peer spine31
    swp2               :  LLDP peer spine32
    swp3               :  LLDP peer spine41
    swp4               :  LLDP peer spine42
    swp5               :  interface
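Cumulus VX MLAG configuration
This follows Cumulus official site / Multi-Chassis Link Aggregation - MLAG.
MLAG interlink configuration
First, configure the LAG used for MLAG, referring to Cumulus official site / Bonding - Link Aggregation. Only the bare minimum needed to form the pair...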
spine31
net add bond bond0 bond slaves swp3-4
net add bond bond0 alias DEV=spine32 IF=bond0
net add interface bond0.4094 alias MLAG DEDICATED
net add interface bond0.4094 ip address 198.51.100.1/30
net add interface bond0.4094 clag peer-ip 198.51.100.2
net add interface bond0.4094 clag sys-mac 44:38:39:FF:40:94
spine32
net add bond bond0 bond slaves swp3-4
net add bond bond0 alias DEV=spine31 IF=bond0
net add interface bond0.4094 alias MLAG DEDICATED
net add interface bond0.4094 ip address 198.51.100.2/30
net add interface bond0.4094 clag peer-ip 198.51.100.1
net add interface bond0.4094 clag sys-mac 44:38:39:FF:40:94
spine41
net add bond bond0 bond slaves swp3-4
net add bond bond0 alias DEV=spine42 IF=bond0
net add interface bond0.4094 alias MLAG DEDICATED
net add interface bond0.4094 ip address 198.51.100.1/30
net add interface bond0.4094 clag peer-ip 198.51.100.2
net add interface bond0.4094 clag sys-mac 44:38:39:FF:40:94
spine42
net add bond bond0 bond slaves swp3-4
net add bond bond0 alias DEV=spine41 IF=bond0
net add interface bond0.4094 alias MLAG DEDICATED
net add interface bond0.4094 ip address 198.51.100.2/30
net add interface bond0.4094 clag peer-ip 198.51.100.1
net add interface bond0.4094 clag sys-mac 44:38:39:FF:40:94
MLAG should now be formed, like this.
kotetsu@spine31:~$ net show clag status
The peer is alive
    Peer Priority, ID, and Role: 32768 00:37:c4:a9:0f:03 primary
     Our Priority, ID, and Role: 32768 00:37:c4:f8:17:03 secondary
          Peer Interface and IP: bond0.4094 198.51.100.2
                      Backup IP: (inactive)
                     System MAC: 44:38:39:ff:40:94

kotetsu@spine32:~$ net show clag status
The peer is alive
     Our Priority, ID, and Role: 32768 00:37:c4:a9:0f:03 primary
    Peer Priority, ID, and Role: 32768 00:37:c4:f8:17:03 secondary
          Peer Interface and IP: bond0.4094 198.51.100.1
                      Backup IP: (inactive)
                     System MAC: 44:38:39:ff:40:94
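The two members of an MLAG pair differ only in which interlink address is local and which is the peer; the sys-mac is identical on both. The sketch below makes that symmetry explicit. The helper name gen_clag_peering is hypothetical, and the script only prints the commands from the listings in this section.

```shell
#!/bin/sh
# Print the interlink commands for one MLAG member.
# Arguments: local interlink IP, peer interlink IP, peer hostname, shared sys-mac.
gen_clag_peering() {
    local_ip=$1; peer_ip=$2; peer_name=$3; sys_mac=$4
    printf 'net add bond bond0 bond slaves swp3-4\n'
    printf 'net add bond bond0 alias DEV=%s IF=bond0\n' "$peer_name"
    printf 'net add interface bond0.4094 alias MLAG DEDICATED\n'
    printf 'net add interface bond0.4094 ip address %s/30\n' "$local_ip"
    printf 'net add interface bond0.4094 clag peer-ip %s\n' "$peer_ip"
    printf 'net add interface bond0.4094 clag sys-mac %s\n' "$sys_mac"
}

echo '# spine31'
gen_clag_peering 198.51.100.1 198.51.100.2 spine32 44:38:39:FF:40:94
echo '# spine32'
gen_clag_peering 198.51.100.2 198.51.100.1 spine31 44:38:39:FF:40:94
```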
MLAG downlink configuration
spine3[12]
net add bond bond1 bond slaves swp5
net add bond bond1 alias DEV=torSW301a IF=bond0
net add bond bond1 mtu 9000
net add bond bond1 clag id 1
spine4[12]
net add bond bond1 bond slaves swp5
net add bond bond1 alias DEV=torSW401a IF=bond0
net add bond bond1 mtu 9000
net add bond bond1 clag id 1
bridge configuration
spine[34][12]
On all of them, once again following Cumulus official site / VLAN-aware Bridge Mode for Large-scale Layer 2 Environments:
net add bridge bridge ports bond0
net add bridge bridge ports bond1
net add bridge bridge vids 2-4093
Checking LACP status on the torSW
root@torSW301a:~# ovs-appctl lacp/show bond0
---- bond0 ----
    status: active negotiated
    sys_id: 00:37:c4:7e:e0:01
    sys_priority: 65534
    aggregation key: 1
    lacp_time: fast

slave: ens4: current attached
    port_id: 2
    port_priority: 65535
    may_enable: true

    actor sys_id: 00:37:c4:7e:e0:01
    actor sys_priority: 65534
    actor port_id: 2
    actor port_priority: 65535
    actor key: 1
    actor state: activity timeout aggregation synchronized collecting distributing

    partner sys_id: 44:38:39:ff:40:94
    partner sys_priority: 65535
    partner port_id: 1
    partner port_priority: 255
    partner key: 9
    partner state: activity timeout aggregation synchronized collecting distributing

slave: ens5: current attached
    port_id: 1
    port_priority: 65535
    may_enable: true

    actor sys_id: 00:37:c4:7e:e0:01
    actor sys_priority: 65534
    actor port_id: 1
    actor port_priority: 65535
    actor key: 1
    actor state: activity timeout aggregation synchronized collecting distributing

    partner sys_id: 44:38:39:ff:40:94
    partner sys_priority: 65535
    partner port_id: 1
    partner port_priority: 255
    partner key: 9
    partner state: activity timeout aggregation synchronized collecting distributing

root@torSW401a:~# ovs-appctl lacp/show bond0
---- bond0 ----
    status: active negotiated
    sys_id: 00:37:c4:2c:e5:01
    sys_priority: 65534
    aggregation key: 1
    lacp_time: fast

slave: ens4: current attached
    port_id: 1
    port_priority: 65535
    may_enable: true

    actor sys_id: 00:37:c4:2c:e5:01
    actor sys_priority: 65534
    actor port_id: 1
    actor port_priority: 65535
    actor key: 1
    actor state: activity timeout aggregation synchronized collecting distributing

    partner sys_id: 44:38:39:ff:40:94
    partner sys_priority: 65535
    partner port_id: 1
    partner port_priority: 255
    partner key: 9
    partner state: activity timeout aggregation synchronized collecting distributing

slave: ens5: current attached
    port_id: 2
    port_priority: 65535
    may_enable: true

    actor sys_id: 00:37:c4:2c:e5:01
    actor sys_priority: 65534
    actor port_id: 2
    actor port_priority: 65535
    actor key: 1
    actor state: activity timeout aggregation synchronized collecting distributing

    partner sys_id: 44:38:39:ff:40:94
    partner sys_priority: 65535
    partner port_id: 1
    partner port_priority: 255
    partner key: 9
    partner state: activity timeout aggregation synchronized collecting distributing
Cumulus VX VXLAN+EVPN configuration
Virtual IP address for each virtual VTEP
spine3[12]
net add loopback lo clag vxlan-anycast-ip 172.16.3.100
spine4[12]
net add loopback lo clag vxlan-anycast-ip 172.16.4.100
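In this environment, connected routes are fed into BGP ipv4 by redistribute connected, and the prefix-list applied in the out direction was written so that it also matches these addresses, so the virtual IP addresses should be advertised as well.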
kotetsu@bb03:~$ net show route
show ip route
=============
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, P - PIM, T - Table, v - VNC,
       V - VPN,
       > - selected route, * - FIB route
K>* 0.0.0.0/0 via 10.0.0.254, eth0
C>* 10.0.0.0/24 is directly connected, eth0
B>* 172.16.3.1/32 [20/0] via 192.0.2.9, swp1, 08:58:27
B>* 172.16.3.2/32 [20/0] via 192.0.2.11, swp2, 08:57:25
B>* 172.16.3.100/32 [20/0] via 192.0.2.9, swp1, 00:28:29
  *                        via 192.0.2.11, swp2, 00:28:29
B>* 172.16.4.1/32 [20/0] via 192.0.2.13, swp3, 08:56:25
B>* 172.16.4.2/32 [20/0] via 192.0.2.15, swp4, 08:56:10
B>* 172.16.4.100/32 [20/0] via 192.0.2.15, swp4, 00:27:45
  *                        via 192.0.2.13, swp3, 00:27:45
C>* 172.31.0.3/32 is directly connected, lo
C>* 192.0.2.8/31 is directly connected, swp1
C>* 192.0.2.10/31 is directly connected, swp2
C>* 192.0.2.12/31 is directly connected, swp3
C>* 192.0.2.14/31 is directly connected, swp4
B>* 192.0.2.136/31 [20/0] via 192.0.2.9, swp1, 08:58:27
B>* 192.0.2.138/31 [20/0] via 192.0.2.11, swp2, 08:57:25
B>* 192.0.2.140/31 [20/0] via 192.0.2.13, swp3, 08:56:25
B>* 192.0.2.142/31 [20/0] via 192.0.2.15, swp4, 08:56:10
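Why the anycast addresses pass the outbound filter can be checked by hand: seq 10 of PL_LO_CLOS permits prefixes inside 172.16.0.0/12 whose length is exactly 32. The sketch below emulates only that one entry in POSIX shell arithmetic; the function names are my own, and this is a simplified model for illustration, not how Quagga evaluates prefix-lists internally.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside CIDR block $2.
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    len=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Emulate "permit 172.16.0.0/12 ge 32 le 32": host routes within 172.16.0.0/12.
matches_seq10() {
    [ "${1#*/}" = 32 ] && in_cidr "${1%/*}" 172.16.0.0/12
}

for p in 172.16.3.100/32 172.16.4.100/32 10.0.0.0/24; do
    if matches_seq10 "$p"; then
        echo "$p permit"
    else
        echo "$p no match"
    fi
done
```

Both anycast /32s fall inside 172.16.0.0/12, which is consistent with bb03 learning them over BGP in the route table above.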
VXLAN VNI configuration on all spines
spine[34][12]
net add vxlan vxlan010100 vxlan id 10100
net add vxlan vxlan010100 bridge access 100
net add vxlan vxlan010200 vxlan id 10200
net add vxlan vxlan010200 bridge access 200
After net commit, these vxlan interfaces are attached to bridge automatically.
kotetsu@spine42:~$ net commit
--- /etc/network/interfaces 2017-03-20 22:07:22.297455993 +0900
+++ /var/run/nclu/iface/interfaces.tmp 2017-03-20 22:17:54.341064283 +0900
...
 iface bridge
-    bridge-ports bond0 bond1
+    bridge-ports bond0 bond1 vxlan010100 vxlan010200
...
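The two VNIs above follow a visible pattern: VNI = 10000 + VLAN ID, and the interface name is "vxlan" followed by the VNI zero-padded to six digits. That mapping is inferred from these two examples and is just this lab's naming convention, not something Cumulus requires. A sketch that generates the commands for any VLAN under that assumption:

```shell
#!/bin/sh
# Lab convention inferred from the examples above (an assumption):
#   VNI = 10000 + VLAN ID, interface name = "vxlan" + 6-digit VNI.
vni_for_vlan() {
    echo $(( 10000 + $1 ))
}

vxlan_ifname() {
    printf 'vxlan%06d\n' "$(vni_for_vlan "$1")"
}

# Print the two NCLU commands for one VLAN.
gen_vni_cmds() {
    name=$(vxlan_ifname "$1")
    printf 'net add vxlan %s vxlan id %s\n' "$name" "$(vni_for_vlan "$1")"
    printf 'net add vxlan %s bridge access %s\n' "$name" "$1"
}

for vlan in 100 200; do
    gen_vni_cmds "$vlan"
done
```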
Assigning VXLAN tunnel IP addresses on all nodes
spine31
net add vxlan vxlan010100 vxlan local-tunnelip 172.16.3.1
net add vxlan vxlan010200 vxlan local-tunnelip 172.16.3.1
spine32
net add vxlan vxlan010100 vxlan local-tunnelip 172.16.3.2
net add vxlan vxlan010200 vxlan local-tunnelip 172.16.3.2
spine41
net add vxlan vxlan010100 vxlan local-tunnelip 172.16.4.1
net add vxlan vxlan010200 vxlan local-tunnelip 172.16.4.1
spine42
net add vxlan vxlan010100 vxlan local-tunnelip 172.16.4.2
net add vxlan vxlan010200 vxlan local-tunnelip 172.16.4.2
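Enabling and configuring EVPN
Configure it following Cumulus official site / Ethernet Virtual Private Network - EVPN / Configuring EVPN.
This is an Early Access feature (limited to Quagga, and only as far as its CLI), and as far as I could find there are no net commands for it yet, so it is configured the traditional Quagga way.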
kotetsu@spine41:~$ sudo vtysh
Hello, this is Quagga (version 1.0.0+cl3eau8).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
spine41#
spine41# configure terminal
spine41(config)# router bgp 65004
spine41(config-router)# address-family evpn
spine41(config-router-af)# neighbor PEER_BB activate
spine41(config-router-af)# advertise-all-vni
spine41(config-router-af)# end
spine41# write memory
Note: this version of vtysh never writes vtysh.conf
Building Configuration...
Integrated configuration saved to /etc/quagga/Quagga.conf
[OK]
spine41#
spine41# exit
kotetsu@spine41:~$
Apply settings like the following.
bb0[34]
router bgp 65000
 address-family evpn
  neighbor PEER_SPINE activate
spine3[12]
router bgp 65003
 address-family evpn
  neighbor PEER_BB activate
  advertise-all-vni
spine4[12]
router bgp 65004
 address-family evpn
  neighbor PEER_BB activate
  advertise-all-vni
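The three stanzas above differ only in the AS number, the peer-group name, and whether the node is a VTEP: the bb nodes just relay EVPN routes, so only the spines carry advertise-all-vni. A sketch that prints the stanza per role; gen_evpn_stanza and the role labels "vtep" and "transit" are names I made up for this illustration.

```shell
#!/bin/sh
# Print the Quagga EVPN stanza for one node.
# Arguments: local AS, peer-group name, role ("vtep" or "transit").
gen_evpn_stanza() {
    asn=$1; group=$2; role=$3
    printf 'router bgp %s\n' "$asn"
    printf ' address-family evpn\n'
    printf '  neighbor %s activate\n' "$group"
    if [ "$role" = vtep ]; then
        printf '  advertise-all-vni\n'
    fi
}

echo '! bb0[34]'
gen_evpn_stanza 65000 PEER_SPINE transit
echo '! spine3[12]'
gen_evpn_stanza 65003 PEER_BB vtep
echo '! spine4[12]'
gen_evpn_stanza 65004 PEER_BB vtep
```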
Disabling Data Plane MAC Learning over VXLAN Tunnels
On spine[34][12], edit /etc/network/interfaces and add bridge-learning off to every vxlan interface.
kotetsu@spine31:~$ diff -u /var/tmp/etc_network_interfaces /etc/network/interfaces
--- /var/tmp/etc_network_interfaces 2017-03-20 23:19:45.046311072 +0900
+++ /etc/network/interfaces 2017-03-20 23:20:33.701311345 +0900
@@ -64,6 +64,7 @@
 auto vxlan010100
 iface vxlan010100
     bridge-access 100
+    bridge-learning off
     mstpctl-bpduguard yes
     mstpctl-portbpdufilter yes
     vxlan-id 10100
@@ -72,6 +73,7 @@
 auto vxlan010200
 iface vxlan010200
     bridge-access 200
+    bridge-learning off
     mstpctl-bpduguard yes
     mstpctl-portbpdufilter yes
     vxlan-id 10200
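With more than a couple of VNIs, the manual edit shown in the diff gets tedious. The awk filter below is a sketch of one way to produce the same change: it adds bridge-learning off right after the bridge-access line of every "iface vxlan..." stanza. The function name add_learning_off is hypothetical, the filter reads stdin and writes stdout, and it is not idempotent, so run it once against a backup copy and diff the result before replacing the real file.

```shell
#!/bin/sh
# Insert "bridge-learning off" after the bridge-access line of each
# "iface vxlan*" stanza, reusing that line's indentation.
add_learning_off() {
    awk '
        /^iface / { in_vxlan = ($2 ~ /^vxlan/) }
        { print }
        in_vxlan && /^[ \t]*bridge-access / {
            match($0, /^[ \t]*/)
            printf "%sbridge-learning off\n", substr($0, 1, RLENGTH)
        }
    '
}

# Example input modelled on the stanzas in the diff above.
add_learning_off <<'EOF'
auto bond1
iface bond1
    bridge-access 100

auto vxlan010100
iface vxlan010100
    bridge-access 100
    vxlan-id 10100
EOF
```

The bond1 stanza is left alone because its iface name does not start with vxlan.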
Verification
Connectivity check
End-to-end connectivity check (L2 over L3)
kotetsu@node31:~$ ping 192.168.1.4
PING 192.168.1.4 (192.168.1.4) 56(84) bytes of data.
64 bytes from 192.168.1.4: icmp_seq=1 ttl=64 time=4.67 ms
64 bytes from 192.168.1.4: icmp_seq=2 ttl=64 time=1.86 ms
64 bytes from 192.168.1.4: icmp_seq=3 ttl=64 time=1.81 ms
64 bytes from 192.168.1.4: icmp_seq=4 ttl=64 time=1.94 ms
64 bytes from 192.168.1.4: icmp_seq=5 ttl=64 time=2.07 ms
64 bytes from 192.168.1.4: icmp_seq=6 ttl=64 time=1.24 ms
64 bytes from 192.168.1.4: icmp_seq=7 ttl=64 time=1.78 ms
^C
--- 192.168.1.4 ping statistics ---
7 packets transmitted, 7 received, 0% packet loss, time 6009ms
rtt min/avg/max/mdev = 1.241/2.199/4.677/1.041 ms

kotetsu@node31:~$ ip n show
192.168.1.4 dev ens4 lladdr 00:37:c4:56:b4:01 STALE

kotetsu@node41:~$ ip n show
192.168.1.3 dev ens4 lladdr 00:37:c4:55:09:01 STALE
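Checking the various tables on Cumulus VX
Cumulus official site / Ethernet Virtual Private Network - EVPN / Output Commands lists a number of show commands, so I follow along with those.
spine MAC address table
First, the MAC address tables of the spines, which act as VTEPs and EVPN PEs.
The TunnelDest column shows that the shared loopback IP address of the remote VTEP pair is being used.
Entries shown as 00:00:00:00:00:00 in the MAC column are apparently for BUM traffic replication (according to the official documentation).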
kotetsu@spine31:~$ net show bridge macs
VLAN      Master    Interface    MAC                TunnelDest    State      Flags    LastSeen
--------  --------  -----------  -----------------  ------------  ---------  -------  ----------
100       bridge    bond1        00:37:c4:55:09:01                                    00:01:43
100       bridge    vxlan010100  00:37:c4:56:b4:01                                    00:03:52
untagged            vxlan010100  00:00:00:00:00:00  172.16.4.100  permanent  self     01:08:33
untagged            vxlan010100  00:37:c4:56:b4:01  172.16.4.100             self     00:03:58
untagged            vxlan010200  00:00:00:00:00:00  172.16.4.100  permanent  self     01:08:33
untagged  bridge    bond0        00:37:c4:f8:17:03                permanent           05:38:27
untagged  bridge    bond1        00:37:c4:f8:17:05                permanent           03:42:10
untagged  bridge    vxlan010100  a6:21:d1:0c:20:a8                permanent           02:20:01
untagged  bridge    vxlan010200  de:8e:ed:62:05:12                permanent           02:20:01

kotetsu@spine32:~$ net show bridge macs
VLAN      Master    Interface    MAC                TunnelDest    State      Flags    LastSeen
--------  --------  -----------  -----------------  ------------  ---------  -------  ----------
100       bridge    bond1        00:37:c4:55:09:01                                    00:04:45
100       bridge    vxlan010100  00:37:c4:56:b4:01                                    00:04:51
untagged            vxlan010100  00:00:00:00:00:00  172.16.4.100  permanent  self     01:09:25
untagged            vxlan010100  00:37:c4:56:b4:01  172.16.4.100             self     00:04:51
untagged            vxlan010200  00:00:00:00:00:00  172.16.4.100  permanent  self     01:09:25
untagged  bridge    bond0        00:37:c4:a9:0f:03                permanent           05:38:42
untagged  bridge    bond1        00:37:c4:a9:0f:05                permanent           03:42:00
untagged  bridge    vxlan010100  ea:60:31:c9:77:63                permanent           02:18:00
untagged  bridge    vxlan010200  06:f9:9e:92:a4:c0                permanent           02:18:00

kotetsu@spine41:~$ net show bridge macs
VLAN      Master    Interface    MAC                TunnelDest    State      Flags    LastSeen
--------  --------  -----------  -----------------  ------------  ---------  -------  ----------
100       bridge    bond1        00:37:c4:56:b4:01                                    00:01:43
100       bridge    vxlan010100  00:37:c4:55:09:01                                    00:01:49
untagged            vxlan010100  00:00:00:00:00:00  172.16.3.100  permanent  self     01:06:24
untagged            vxlan010100  00:37:c4:55:09:01  172.16.3.100             self     00:01:49
untagged            vxlan010200  00:00:00:00:00:00  172.16.3.100  permanent  self     01:06:24
untagged  bridge    bond0        00:37:c4:fe:34:03                permanent           05:34:25
untagged  bridge    bond1        00:37:c4:fe:34:05                permanent           03:35:31
untagged  bridge    vxlan010100  46:bf:75:c3:83:e3                permanent           02:14:34
untagged  bridge    vxlan010200  ca:4e:29:fd:d9:8e                permanent           02:14:34

kotetsu@spine42:~$ net show bridge macs
VLAN      Master    Interface    MAC                TunnelDest    State      Flags    LastSeen
--------  --------  -----------  -----------------  ------------  ---------  -------  ----------
100       bridge    bond1        00:37:c4:56:b4:01                                    00:00:46
100       bridge    vxlan010100  00:37:c4:55:09:01                                    00:02:49
untagged            vxlan010100  00:00:00:00:00:00  172.16.3.100  permanent  self     01:07:29
untagged            vxlan010100  00:37:c4:55:09:01  172.16.3.100             self     00:02:55
untagged            vxlan010200  00:00:00:00:00:00  172.16.3.100  permanent  self     01:07:29
untagged  bridge    bond0        00:37:c4:32:db:03                permanent           05:35:24
untagged  bridge    bond1        00:37:c4:32:db:05                permanent           03:36:18
untagged  bridge    vxlan010100  9e:e4:df:d2:a9:3a                permanent           02:15:20
untagged  bridge    vxlan010200  6a:3a:0d:08:fb:9e                permanent           02:15:20
Advertised VNI and VTEP information
From sudo vtysh:
spine31# show bgp evpn vni
Advertise All VNI flag: Enabled
Number of VNIs: 2
Flags: * - Kernel
  VNI    Orig IP       RD                Import RT    Export RT
* 10200  172.16.3.100  172.16.3.1:10200  65003:10200  65003:10200
* 10100  172.16.3.100  172.16.3.1:10100  65003:10100  65003:10100
spine31# show evpn vni
Number of VNIs: 2
VNI    VxLAN IF     VTEP IP       # MACs  Remote VTEPs
10200  vxlan010200  172.16.3.100  0       172.16.4.100
10100  vxlan010100  172.16.3.100  2       172.16.4.100

spine32# show bgp evpn vni
Advertise All VNI flag: Enabled
Number of VNIs: 2
Flags: * - Kernel
  VNI    Orig IP       RD                Import RT    Export RT
* 10200  172.16.3.100  172.16.3.2:10200  65003:10200  65003:10200
* 10100  172.16.3.100  172.16.3.2:10100  65003:10100  65003:10100
spine32# show evpn vni
Number of VNIs: 2
VNI    VxLAN IF     VTEP IP       # MACs  Remote VTEPs
10200  vxlan010200  172.16.3.100  0       172.16.4.100
10100  vxlan010100  172.16.3.100  2       172.16.4.100

spine41# show bgp evpn vni
Advertise All VNI flag: Enabled
Number of VNIs: 2
Flags: * - Kernel
  VNI    Orig IP       RD                Import RT    Export RT
* 10200  172.16.4.100  172.16.4.1:10200  65004:10200  65004:10200
* 10100  172.16.4.100  172.16.4.1:10100  65004:10100  65004:10100
spine41# show evpn vni
Number of VNIs: 2
VNI    VxLAN IF     VTEP IP       # MACs  Remote VTEPs
10200  vxlan010200  172.16.4.100  0       172.16.3.100
10100  vxlan010100  172.16.4.100  2       172.16.3.100

spine42# show bgp evpn vni
Advertise All VNI flag: Enabled
Number of VNIs: 2
Flags: * - Kernel
  VNI    Orig IP       RD                Import RT    Export RT
* 10200  172.16.4.100  172.16.4.2:10200  65004:10200  65004:10200
* 10100  172.16.4.100  172.16.4.2:10100  65004:10100  65004:10100
spine42# show evpn vni
Number of VNIs: 2
VNI    VxLAN IF     VTEP IP       # MACs  Remote VTEPs
10200  vxlan010200  172.16.4.100  0       172.16.3.100
10100  vxlan010100  172.16.4.100  2       172.16.3.100
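Reading the output above, the automatically derived values look like this: the RD is the BGP router-id followed by the VNI, the import/export RT is the local AS followed by the VNI, and Orig IP is the MLAG anycast address rather than the individual loopback. The sketch below just reproduces that pattern as observed in this lab with the Early Access build; it is not taken from a specification and may differ in other releases.

```shell
#!/bin/sh
# Pattern observed in "show bgp evpn vni" above (an observation, not a spec):
#   RD = <router-id>:<VNI>,  RT = <local AS>:<VNI>
evpn_rd() {
    printf '%s:%s\n' "$1" "$2"    # router-id, VNI
}

evpn_rt() {
    printf '%s:%s\n' "$1" "$2"    # local AS, VNI
}

# spine31: router-id 172.16.3.1, AS 65003
for vni in 10200 10100; do
    printf 'VNI %s  RD %s  RT %s\n' "$vni" \
        "$(evpn_rd 172.16.3.1 "$vni")" "$(evpn_rt 65003 "$vni")"
done
```

Because the RD contains the per-node router-id while the next hop is the shared anycast address, both members of an MLAG pair advertise the same MAC under different RDs.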
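Learned EVPN routes
Nothing is configured to accept routes from the other spine in the same AS via bb, so those routes do not show up, not even as an RD.
MAC learning within the local AS should be kept in sync by MLAG, so I think that is fine.
Also, Type 1 and Type 4 routes, which would be needed when using EVPN Multihoming, do not appear at all.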
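The absence of the sibling spine's RD is consistent with ordinary eBGP AS_PATH loop prevention: a route that left AS 65003, passed through bb (AS 65000), and came back would carry 65003 in its AS_PATH. The sketch below illustrates that check only. It assumes this is the mechanism at work here (the article only says nothing was configured to accept such routes), and the function name is hypothetical.

```python
from typing import Sequence


def accepts_ebgp_route(local_as: int, as_path: Sequence[int]) -> bool:
    """Standard eBGP loop prevention: reject a route whose AS_PATH
    already contains the receiver's own AS number."""
    return local_as not in as_path


# spine31 sits in AS 65003; bb is AS 65000 (per the AS paths in the outputs).
# A route originated by spine32 (same AS) and reflected back by bb:
print(accepts_ebgp_route(65003, [65000, 65003]))
# A route originated by spine41/spine42 (AS 65004) arriving via bb:
print(accepts_ebgp_route(65003, [65000, 65004]))
```

Accepting the first kind of route would require explicitly relaxing this check on the spines, which this setup does not do.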
From sudo vtysh:
spine31# show bgp evpn route
BGP table version is 0, local router ID is 172.16.3.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
   Network  Next Hop  Metric LocPrf Weight Path
Route Distinguisher: 172.16.3.1:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  32768 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  32768 i
Route Distinguisher: 172.16.3.1:10200
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  32768 i
Route Distinguisher: 172.16.4.1:10100
*  [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  0 65000 65004 i
*> [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  0 65000 65004 i
*  [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
Route Distinguisher: 172.16.4.1:10200
*  [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
Route Distinguisher: 172.16.4.2:10100
*  [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  0 65000 65004 i
*> [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  0 65000 65004 i
*  [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
Route Distinguisher: 172.16.4.2:10200
*  [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
Displayed 9 prefixes (15 paths)

spine32# show bgp evpn route
BGP table version is 0, local router ID is 172.16.3.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
   Network  Next Hop  Metric LocPrf Weight Path
Route Distinguisher: 172.16.3.2:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  32768 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  32768 i
Route Distinguisher: 172.16.3.2:10200
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  32768 i
Route Distinguisher: 172.16.4.1:10100
*  [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  0 65000 65004 i
*> [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  0 65000 65004 i
*  [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
Route Distinguisher: 172.16.4.1:10200
*  [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
Route Distinguisher: 172.16.4.2:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  0 65000 65004 i
*  [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  0 65000 65004 i
*  [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
Route Distinguisher: 172.16.4.2:10200
*  [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65000 65004 i
Displayed 9 prefixes (15 paths)

spine41# show bgp evpn route
BGP table version is 0, local router ID is 172.16.4.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
   Network  Next Hop  Metric LocPrf Weight Path
Route Distinguisher: 172.16.3.1:10100
*  [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  0 65000 65003 i
*> [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  0 65000 65003 i
*  [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
Route Distinguisher: 172.16.3.1:10200
*  [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
Route Distinguisher: 172.16.3.2:10100
*  [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  0 65000 65003 i
*> [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  0 65000 65003 i
*  [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
Route Distinguisher: 172.16.3.2:10200
*  [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
Route Distinguisher: 172.16.4.1:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  32768 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  32768 i
Route Distinguisher: 172.16.4.1:10200
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  32768 i
Displayed 9 prefixes (15 paths)

spine42# show bgp evpn route
BGP table version is 0, local router ID is 172.16.4.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
   Network  Next Hop  Metric LocPrf Weight Path
Route Distinguisher: 172.16.3.1:10100
*  [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  0 65000 65003 i
*> [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  0 65000 65003 i
*  [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
Route Distinguisher: 172.16.3.1:10200
*  [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
Route Distinguisher: 172.16.3.2:10100
*  [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  0 65000 65003 i
*> [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  0 65000 65003 i
*  [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
Route Distinguisher: 172.16.3.2:10200
*  [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65000 65003 i
Route Distinguisher: 172.16.4.2:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  32768 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  32768 i
Route Distinguisher: 172.16.4.2:10200
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  32768 i
Displayed 9 prefixes (15 paths)
bb, which does not touch anything VXLAN-related and acts purely as a forwarding pipe, also participates in the MP-BGP used for EVPN signaling.
bb03# show bgp evpn route
BGP table version is 0, local router ID is 172.31.0.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
   Network  Next Hop  Metric LocPrf Weight Path
Route Distinguisher: 172.16.3.1:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  0 65003 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65003 i
Route Distinguisher: 172.16.3.1:10200
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65003 i
Route Distinguisher: 172.16.3.2:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  0 65003 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65003 i
Route Distinguisher: 172.16.3.2:10200
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65003 i
Route Distinguisher: 172.16.4.1:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  0 65004 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65004 i
Route Distinguisher: 172.16.4.1:10200
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65004 i
Route Distinguisher: 172.16.4.2:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  0 65004 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65004 i
Route Distinguisher: 172.16.4.2:10200
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65004 i
Displayed 12 prefixes (12 paths)

bb04# show bgp evpn route
BGP table version is 0, local router ID is 172.31.0.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
   Network  Next Hop  Metric LocPrf Weight Path
Route Distinguisher: 172.16.3.1:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  0 65003 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65003 i
Route Distinguisher: 172.16.3.1:10200
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65003 i
Route Distinguisher: 172.16.3.2:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:55:09:01]  172.16.3.100  0 65003 i
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65003 i
Route Distinguisher: 172.16.3.2:10200
*> [3]:[0]:[32]:[172.16.3.100]           172.16.3.100  0 65003 i
Route Distinguisher: 172.16.4.1:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  0 65004 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65004 i
Route Distinguisher: 172.16.4.1:10200
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65004 i
Route Distinguisher: 172.16.4.2:10100
*> [2]:[0]:[0]:[48]:[00:37:c4:56:b4:01]  172.16.4.100  0 65004 i
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65004 i
Route Distinguisher: 172.16.4.2:10200
*> [3]:[0]:[32]:[172.16.4.100]           172.16.4.100  0 65004 i
Displayed 12 prefixes (12 paths)
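The prefix strings in these tables follow the two layouts printed in the header of each output: [2]:[ESI]:[EthTag]:[MAClen]:[MAC] for Type 2 and [3]:[EthTag]:[IPlen]:[OrigIP] for Type 3. When comparing many of these dumps, a small parser can help. The following is a rough sketch that handles only those two layouts as displayed by this EA build; the function name and the returned dictionary keys are my own choices.

```python
import re


def parse_evpn_prefix(text: str) -> dict:
    """Split a bracketed EVPN prefix as printed by 'show bgp evpn route'."""
    parts = re.findall(r"\[([^\]]*)\]", text)
    if not parts:
        raise ValueError(f"no bracketed fields in {text!r}")
    route_type = int(parts[0])
    if route_type == 2 and len(parts) == 5:
        # [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
        return {"type": 2, "esi": parts[1], "eth_tag": int(parts[2]),
                "mac_len": int(parts[3]), "mac": parts[4]}
    if route_type == 3 and len(parts) == 4:
        # [3]:[EthTag]:[IPlen]:[OrigIP]
        return {"type": 3, "eth_tag": int(parts[1]),
                "ip_len": int(parts[2]), "orig_ip": parts[3]}
    raise ValueError(f"unsupported prefix layout: {text!r}")


print(parse_evpn_prefix("[2]:[0]:[0]:[48]:[00:37:c4:55:09:01]"))
print(parse_evpn_prefix("[3]:[0]:[32]:[172.16.4.100]"))
```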
Learned EVPN routes (drilling down into a specific RD)
From sudo vtysh:
bb03# show bgp evpn route rd 172.16.3.2:10100
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
BGP routing table entry for 172.16.3.2:10100:[2]:[0]:[0]:[48]:[00:37:c4:55:09:01]
Paths: (1 available, best #1)
  Advertised to non peer-group peers:
  spine31(192.0.2.9) spine32(192.0.2.11) spine41(192.0.2.13) spine42(192.0.2.15)
  Route [2]:[0]:[0]:[48]:[00:37:c4:55:09:01] VNI 10100
  65003
    172.16.3.100 from spine32(192.0.2.11) (172.16.3.2)
      Origin IGP, localpref 100, valid, external, bestpath-from-AS 65003, best
      Extended Community: RT:65003:10100 ET:8
      AddPath ID: RX 0, TX 138
      Last update: Tue Mar 21 21:51:06 2017
BGP routing table entry for 172.16.3.2:10100:[3]:[0]:[32]:[172.16.3.100]
Paths: (1 available, best #1)
  Advertised to non peer-group peers:
  spine31(192.0.2.9) spine32(192.0.2.11) spine41(192.0.2.13) spine42(192.0.2.15)
  Route [3]:[0]:[32]:[172.16.3.100]
  65003
    172.16.3.100 from spine32(192.0.2.11) (172.16.3.2)
      Origin IGP, localpref 100, valid, external, bestpath-from-AS 65003, best
      Extended Community: RT:65003:10100 ET:8
      AddPath ID: RX 0, TX 110
      Last update: Tue Mar 21 20:57:31 2017
Looking at the packets
ControlPlane
EVPN NLRI Type3(Inclusive Multicast Ethernet Tag route)
This is the UPDATE that spine41 sends to bb03 right after the eBGP OPEN exchange.
What deserves attention (in comparison with EVPN Multihoming) is that the Originating Router's IP Address field carries the shared (?) loopback IP address (172.16.4.100) configured on spine4[12].
Also, as described in Cumulus official / Ethernet Virtual Private Network - EVPN / Enabling EVPN with Route Distinguishers (RDs) and Route Targets (RTs), the RD and RT carry automatically assigned values even when they are not configured explicitly.
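Judging from the show bgp evpn vni output earlier, the automatically assigned values in this setup look like "router ID:VNI" for the RD and "local AS:VNI" for the RT. The sketch below just mirrors that observed pattern. It is not taken from the official derivation rule, which may differ in other releases, and the function names are hypothetical.

```python
def auto_rd(router_id: str, vni: int) -> str:
    """RD as observed in this lab: <BGP router ID>:<VNI>."""
    return f"{router_id}:{vni}"


def auto_rt(local_as: int, vni: int) -> str:
    """Import/Export RT as observed in this lab: <local AS>:<VNI>."""
    return f"{local_as}:{vni}"


# spine41: router ID 172.16.4.1, AS 65004, VNIs 10100 and 10200
for vni in (10100, 10200):
    print(vni, auto_rd("172.16.4.1", vni), auto_rt(65004, vni))
```

Note that the RD differs per spine (it uses the individual router ID), while the Orig IP and next hop are the loopback address shared by the MLAG pair.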
EVPN NLRI Type2(MAC/IP Advertisement route)
This capture shows spine41 advertising the MAC address of node41 (at VLAN100:VNI10100) to bb03.
DataPlane
BUM
The ARP Request from node31 is sent from spine31 to bb03.
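In the outer IP header of the VXLAN encapsulation, Src is 172.16.3.100 (the shared loopback IP address of spine3[12]) and Dst is 172.16.4.100 (the shared loopback IP address of spine4[12]), which shows that each pair of switches forms a single logical (?) VTEP using a shared (?) loopback IP address.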
In VXLAN terms, this is HER (Head End Replication) behavior.
As of 2017/03/21, the note on HER in the official documentation says that up to 128 VTEPs can be configured with HER.
Cumulus Linux verified support for up to 128 VTEPs with head end replication.
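Conceptually, head end replication means the ingress VTEP sends one unicast VXLAN copy of each BUM frame to every remote VTEP known for that VNI (here, learned from the Type 3 routes). The following is a minimal model of that idea, not Cumulus's implementation; the class and function names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class EncapsulatedFrame:
    vni: int
    outer_src: str      # local VTEP IP
    outer_dst: str      # one remote VTEP IP
    inner_frame: bytes  # original L2 frame (e.g. an ARP Request)


def head_end_replicate(vni: int, local_vtep: str,
                       remote_vteps: Iterable[str],
                       frame: bytes) -> List[EncapsulatedFrame]:
    """Make one unicast copy of a BUM frame per remote VTEP of the VNI."""
    return [EncapsulatedFrame(vni, local_vtep, remote, frame)
            for remote in remote_vteps if remote != local_vtep]


# spine3[12] (shared VTEP 172.16.3.100) flooding into VNI 10100
copies = head_end_replicate(10100, "172.16.3.100",
                            ["172.16.4.100"], b"arp-request")
for c in copies:
    print(c.outer_src, "->", c.outer_dst, "VNI", c.vni)
```

Because the number of copies grows with the number of remote VTEPs, a documented upper bound on VTEPs for HER is unsurprising.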
Unicast
This is the ICMP Echo Reply from node41 to node31 as it is sent from spine41 toward bb04.
It is just an ordinary VXLAN-encapsulated packet, but the outer IP header shows that the communication takes place between the shared loopbacks.
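For reference when reading such a capture, the VXLAN header itself (RFC 7348) is only 8 bytes: a flags byte with the I bit (0x08) set, 24 reserved bits, the 24-bit VNI, and a final reserved byte. The sketch below builds and parses that header for the VNI used in this lab; the helper names are my own.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag from RFC 7348


def build_vxlan_header(vni: int) -> bytes:
    """Return the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits)
    # Word 2: VNI (24 bits) + reserved (8 bits)
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("I flag not set; VNI is not valid")
    return vni_word >> 8


header = build_vxlan_header(10100)
print(header.hex(), parse_vni(header))
```

In the captured packet this header sits inside a UDP datagram whose outer IP addresses are the two shared loopbacks, followed by the original Ethernet frame.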
MLAG behavior
This is nothing more than an MLAG failover, and it is a failure test in a virtual environment, so I kept it very simple...
While traffic was flowing along the path bb03 -> spine41 -> torSW401a -> node41, I brought spine41's downlink down with sudo ifdown swp5, and the path immediately switched to bb03 -> spine41 -> spine42 -> torSW401a -> node41.
The LAG that spine4[12] form toward torSW401a and the virtual loopback both stay up, so nothing like an EVPN withdrawal occurs.
Wrap-up
Below are my impressions.
- Cumulus Linux
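- VX is pleasantly lightweight
- The wrapper called Network Command Line Utility (NCLU) is comfortable to use
- The documentation is properly in place, which is nice (even though what I covered here is an EA feature)
- So it can't be helped that my own explanations are sloppy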