#!/usr/bin/env bash
#
# Use BGP+EVPN for VXLAN with CloudStack instead of Multicast
#
# Place this file on all KVM hypervisors at /usr/share/modifyvxlan.sh
#
# More information about BGP and EVPN with FRR: https://vincent.bernat.ch/en/blog/2017-vxlan-bgp-evpn
#
DSTPORT=4789
# We bind our VXLAN tunnel IP(v4) on Loopback device 'lo'
DEV="lo"
usage() {
    echo "Usage: $0: -o <op>(add | delete) -v <vxlan id> -p <pif> -b <bridge name> (-6)"
}
localAddr() {
    local FAMILY=$1
    if [[ -z "$FAMILY" || $FAMILY == "inet" ]]; then
        ip -4 -o addr show scope global dev ${DEV} | awk 'NR==1 {gsub("/[0-9]+", "") ; print $4}'
    fi
    if [[ "$FAMILY" == "inet6" ]]; then
        ip -6 -o addr show scope global dev ${DEV} | awk 'NR==1 {gsub("/[0-9]+", "") ; print $4}'
    fi
}
addVxlan() {
    local VNI=$1
    local PIF=$2
    local VXLAN_BR=$3
    local FAMILY=$4
    local VXLAN_DEV=vxlan${VNI}
    local ADDR=$(localAddr ${FAMILY})
    echo "local addr for VNI ${VNI} is ${ADDR}"
    if [[ ! -d /sys/class/net/${VXLAN_DEV} ]]; then
        ip -f ${FAMILY} link add ${VXLAN_DEV} type vxlan id ${VNI} local ${ADDR} dstport ${DSTPORT} nolearning
        ip link set ${VXLAN_DEV} up
        sysctl -qw net.ipv6.conf.${VXLAN_DEV}.disable_ipv6=1
    fi
    if [[ ! -d /sys/class/net/$VXLAN_BR ]]; then
        ip link add name ${VXLAN_BR} type bridge
        ip link set ${VXLAN_BR} up
        sysctl -qw net.ipv6.conf.${VXLAN_BR}.disable_ipv6=1
    fi
    bridge link show|grep ${VXLAN_BR}|awk '{print $2}'|grep "^${VXLAN_DEV}\$" > /dev/null
    if [[ $? -gt 0 ]]; then
        ip link set ${VXLAN_DEV} master ${VXLAN_BR}
    fi
}
deleteVxlan() {
    local VNI=$1
    local PIF=$2
    local VXLAN_BR=$3
    local FAMILY=$4
    local VXLAN_DEV=vxlan${VNI}
    ip link set ${VXLAN_DEV} nomaster
    ip link delete ${VXLAN_DEV}
    ip link set ${VXLAN_BR} down
    ip link delete ${VXLAN_BR} type bridge
}
OP=
VNI=
FAMILY=inet
option=$@
while getopts 'o:v:p:b:6' OPTION
do
    case $OPTION in
    o)  oflag=1
        OP="$OPTARG"
        ;;
    v)  vflag=1
        VNI="$OPTARG"
        ;;
    p)  pflag=1
        PIF="$OPTARG"
        ;;
    b)  bflag=1
        BRNAME="$OPTARG"
        ;;
    6)
        FAMILY=inet6
        ;;
    ?)  usage
        exit 2
        ;;
    esac
done
if [[ "$oflag$vflag$pflag$bflag" != "1111" ]]; then
    usage
    exit 2
fi
lsmod|grep ^vxlan >& /dev/null
if [[ $? -gt 0 ]]; then
    modprobe=`modprobe vxlan 2>&1`
    if [[ $? -gt 0 ]]; then
        echo "Failed to load vxlan kernel module: $modprobe"
        exit 1
    fi
fi
#
# Add a lockfile to prevent this script from running twice on the same host
# this can cause a race condition
#
LOCKFILE=/var/run/cloud/vxlan.lock
(
    flock -x -w 10 200 || exit 1
    if [[ "$OP" == "add" ]]; then
        addVxlan ${VNI} ${PIF} ${BRNAME} ${FAMILY}
        if [[ $? -gt 0 ]]; then
            exit 1
        fi
    elif [[ "$OP" == "delete" ]]; then
        deleteVxlan ${VNI} ${PIF} ${BRNAME} ${FAMILY}
    fi
) 200>${LOCKFILE}
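
The lockfile pattern at the end of the script can be tried in isolation. The following is a minimal sketch, not part of the gist: `with_lock` is a hypothetical wrapper, and a temporary file stands in for /var/run/cloud/vxlan.lock so that it runs without root. It assumes `flock` from util-linux is installed.

```shell
#!/usr/bin/env bash
# Assumption: a throwaway lock file replaces /var/run/cloud/vxlan.lock
# so that this sketch can run as an unprivileged user.
LOCKFILE=$(mktemp)

# Hypothetical wrapper: run a command while holding an exclusive lock.
with_lock() {
    (
        # Wait up to 10 seconds for an exclusive lock on file descriptor 200
        flock -x -w 10 200 || exit 1
        "$@"
    ) 200>"${LOCKFILE}"
}

with_lock echo "critical section ran"
```

A second invocation that starts while the first still holds the lock waits for up to 10 seconds and then gives up with exit status 1, which is the behaviour the script relies on to avoid two concurrent runs on one host.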
Thanks Wido for sharing this. Can I ask you two specific things about it?
1. My hypervisors are based on Ubuntu and the file modifyvxlan.sh already exists, but in a different path. Should I replace it, or just copy it to the path you suggest?
Just put the file in /usr/share and restart the CloudStack agent. It will detect the file there. (Also see the comments in the header of the file.)
2. Can I use one of the hypervisors as both the route reflector and a VTEP? Do you have an example BGP EVPN configuration file I can use?
Many thanks!
Not sure what exactly you mean. This is the relevant BGP configuration with FRR that I use:
router bgp 4200100145
 ..
 address-family ipv4 unicast
  network 10.255.255.32/32
  neighbor uplinks activate
  neighbor uplinks next-hop-self
  neighbor uplinks soft-reconfiguration inbound
  neighbor uplinks route-map upstream-v4-out out
  neighbor uplinks route-map upstream-v4-in in
 exit-address-family
 !
 address-family ipv6 unicast
  network 2a05:xxxx:xxxx:2::32/128
  neighbor uplinks activate
  neighbor uplinks soft-reconfiguration inbound
  neighbor uplinks route-map upstream-v6-in in
  neighbor uplinks route-map upstream-v6-out out
 exit-address-family
 address-family l2vpn evpn
  neighbor uplinks activate
  advertise-all-vni
 exit-address-family
Thanks Wido, it's working well for me now. Just one thing: if I delete a machine and recreate it with the same IP, the BGP entry doesn't seem to update until I create another VXLAN.
Hypervisor you mean? Compute node. That is logical. See the script: it takes the IP address that is on the loopback interface and hardcodes it on the VXLAN device.
Use unique IPs per host and do not try to change them.
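
To see how the script picks that address, the awk filter from `localAddr()` can be run against a captured line instead of live `ip` output. This is a minimal sketch: `first_addr` is a hypothetical name, and the sample line is an illustrative example of `ip -4 -o addr show scope global dev lo` output that reuses the loopback address from the BGP configuration above.

```shell
#!/usr/bin/env bash
# Same awk filter as localAddr() in the script, fed from stdin instead of `ip`.
first_addr() {
    # Take only the first line, strip the "/prefixlen" suffix, print field 4
    awk 'NR==1 {gsub("/[0-9]+", "") ; print $4}'
}

# Illustrative sample line (assumed, not captured from a real host).
SAMPLE='1: lo    inet 10.255.255.32/32 scope global lo\       valid_lft forever preferred_lft forever'

printf '%s\n' "$SAMPLE" | first_addr   # prints 10.255.255.32
```

Because of `NR==1`, only the first global address on the device is used, so a loopback carrying several global addresses would still yield a single tunnel endpoint.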
@wido may I ask whether the script still works with ACS 4.19? To be honest, I don't know in detail how VXLAN and EVPN work, and I'm now trying to implement them in my POC environment.
This is my FRR configuration on the host so far:
ip forwarding
ipv6 forwarding
interface ens3f0np0
 no ipv6 nd suppress-ra
exit
interface ens3f1np1
 no ipv6 nd suppress-ra
exit
router bgp 4200100005
 bgp router-id 10.0.118.1
 no bgp ebgp-requires-policy
 neighbor uplink peer-group
 neighbor uplink remote-as external
 neighbor ens3f0np0 interface peer-group uplink
 neighbor ens3f1np1 interface peer-group uplink
 address-family ipv4 unicast
  network 10.0.118.1/32
 exit-address-family
 address-family ipv6 unicast
  network 2407:e6c0:0:1::1/128
  neighbor uplink activate
  neighbor uplink soft-reconfiguration inbound
 exit-address-family
 address-family l2vpn evpn
  neighbor uplink activate
  neighbor uplink attribute-unchanged next-hop
  advertise-all-vni
 exit-address-family
I can see that I'm getting the EVPN routes from my leaf switches (although I still don't understand what RD, RT, etc. are).
I tried to use the script to create a VXLAN interface on cloudbr0 for the management network. Am I missing something?
[root@cmpt1 ~]# ./modifyvxlan.sh -o add -v 10027 -b cloudbr0
Usage: ./modifyvxlan.sh: -o <op>(add | delete) -v <vxlan id> -p <pif> -b <bridge name> (-6)
[root@cmpt1 ~]#
Thanks for the help :)
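
One thing visible from the script itself: its argument check requires -o, -v, -p and -b together, and the invocation above has no -p, which is why only the usage line is printed. The following is a minimal sketch of that check; `missing_flags` is a hypothetical helper and not part of modifyvxlan.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the script's getopts loop: it reports which
# of the four mandatory flags are absent from the given arguments.
missing_flags() {
    local OPTIND=1 OPTION seen="" flag missing=""
    # Leading ':' puts getopts in silent mode so unknown options are ignored
    while getopts ':o:v:p:b:6' OPTION; do
        case $OPTION in
            o|v|p|b) seen+="$OPTION" ;;
        esac
    done
    for flag in o v p b; do
        [[ $seen == *"$flag"* ]] || missing+="-$flag "
    done
    echo "${missing% }"
}

missing_flags -o add -v 10027 -b cloudbr0   # prints -p
```

Passing a physical interface name with -p satisfies the check, although this version of the script does not otherwise use the value.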
Yes, this script should still work with 4.19, no problem at all.
Could you maybe ask this question on the CloudStack mailing list? I can get back to it there. You can cc me in the e-mail :-)
@wido I've posted a question on the mailing list. Thanks for checking it out later when you're free :)
I cc'ed you in there also.
Has any attempt been made to integrate this into CloudStack proper? If so, what were the issues?
I can see this obviously needs to do things like disable multicast for EVPN where CloudStack enables it in the provided script, but I'd think that could be fairly easily resolved. Ideas for resolution would be some sort of additional setting, or duplicating the vxlan protocol in CloudStack into a new protocol like "vxlan-evpn" which would simply pass a flag into the modifyvxlan.sh script to alter its behavior.
I'd be interested in taking this on, but I'd like to know what kind of prior feedback there may have been.
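
As a rough illustration of the flag idea, a single function could choose the `ip link add` arguments by mode. This is only a sketch under assumptions: the mode names `evpn` and `multicast`, the function `vxlan_link_args` and the example addresses are all hypothetical and do not exist in CloudStack. The `evpn` branch reuses the arguments from the script above; the `multicast` branch uses the standard `group` and `dev` options of the vxlan link type.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: neither the mode names nor this function exist in
# CloudStack; they only illustrate switching behaviour on a flag.
vxlan_link_args() {
    local MODE=$1 VNI=$2 ENDPOINT=$3 PIF=$4
    case $MODE in
        evpn)
            # As in the script above: bind to a local address, no learning
            echo "type vxlan id ${VNI} local ${ENDPOINT} dstport 4789 nolearning"
            ;;
        multicast)
            # Flood-and-learn over a multicast group on the physical interface
            echo "type vxlan id ${VNI} group ${ENDPOINT} dev ${PIF} dstport 4789"
            ;;
        *)
            echo "unknown mode: ${MODE}" >&2
            return 1
            ;;
    esac
}

vxlan_link_args evpn 100 10.255.255.32
# prints: type vxlan id 100 local 10.255.255.32 dstport 4789 nolearning
```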
Good suggestion! I have opened a Pull Request to at least add this script to the main repository: apache/cloudstack#9778
I have been using this script for five years now without any issues; it just works as expected.
Hi @wido, thank you for the great work on the script; it works flawlessly. VMs in Isolated or L2 guest networks on different hypervisors can reach each other without issues. But I have run into a challenge and need guidance. When I create a guest network of type Shared with public IP addresses (with VLAN/VNI), I can't get access to the internet or to other physical machines in the same public IP range. Before the change from the VLAN to the VXLAN isolation method, it was working fine.
What routers are you using? And they are the gateway for the shared network, correct? You would need to check whether they learn the proper EVPN information through BGP. It seems that you might have an issue there.
This does not seem to be a script or CloudStack issue; it is probably a local networking (config) issue.
Could be many, many, many things.
I use FRR as a router and BGP route reflector, and VMs hosted on different KVM servers can communicate with each other regardless of their underlying network. Internet access only works with the Isolated network type, since those VMs are behind a VR. With a Shared network with public IP addresses, the VMs can still communicate with each other, but they can't break out to the internet.
The VMs in this shared network with a public address, can they reach their gateway? Or from the gateway router, can you reach (ping) these VMs?
No, they cannot ping the gateway, but they can ping each other. I am trying to figure out how the VMs attached to br-vxlan2 break out of that bridge to the internet.
This is then an issue on your local network with the EVPN config on those gateways. Could be many things:
- BGP to gateway routers not working properly
- Incorrect route targets for the VNIs
- Policy config issue
- etc
Again, this has nothing to do with this particular script.
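
For anyone working through the list above on an FRR-based gateway or hypervisor, a few `vtysh` show commands cover the BGP session and the per-VNI state. This is a minimal sketch, not advice from the thread: `evpn_check_commands` is a hypothetical helper that only prints the commands, and VNI 2 is an assumption based on the bridge name br-vxlan2.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print FRR show commands for inspecting one VNI.
# It does not run anything itself, so it works on hosts without FRR.
evpn_check_commands() {
    local VNI=$1
    printf '%s\n' \
        "show bgp l2vpn evpn summary" \
        "show evpn vni ${VNI}" \
        "show evpn mac vni ${VNI}"
}

# Assumption: VNI 2, guessed from the bridge name br-vxlan2.
evpn_check_commands 2
```

On a host running FRR the output can be fed to vtysh, for example `evpn_check_commands 2 | while read -r cmd; do vtysh -c "$cmd"; done`. The first command shows whether the EVPN sessions are established, the second shows the VNI's route targets, and the third shows which MAC addresses are known locally and remotely.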