Cumulus Linux Ethernet link-state monitoring using ifplugd

This blog post is about link-state monitoring under Cumulus Linux. Cumulus Linux has no built-in tool of its own for this and recommends using ifplugd. The tool has some similarities to Cisco’s IP SLA, which can track the state of interfaces.

The main reason to use ifplugd is to handle split-brain scenarios, when you lose the peerlink between a Cumulus Linux CLAG pair. If the peerlink goes down, the CLAG primary switch stays the active member and the secondary automatically disables all of its CLAG bonds. This forces the connected servers to fail over to the CLAG primary switch and keeps the network operational.

It is very important to configure clagd-backup-ip, because Cumulus Linux needs it to still be able to communicate with its neighbour when the peerlink is lost.
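As a rough way to check that the backup link is actually in use, the snippet below pulls the backup IP line out of clagctl-style output. The sample line is an assumption about how clagctl prints it (192.0.2.50 is a documentation placeholder address), and backup_ip_state is a helper name made up for this sketch.

```shell
#!/bin/sh
# Sketch: print the backup IP and its state from clagctl-style text on stdin.
# backup_ip_state is a hypothetical helper; the "Backup IP:" line format is assumed.
backup_ip_state() {
    awk -F': *' '/Backup IP/ {print $2}'
}

# Assumed sample line. On a switch you would pipe in the real output instead:
#   clagctl | backup_ip_state
sample="                      Backup IP: 192.0.2.50 (active)"
echo "$sample" | backup_ip_state
```

If the state is not reported as active, the secondary may not be able to tell a dead peerlink from a dead peer.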

ifplugd matters for all connected servers that do not use CLAG bonds, that is, servers using normal active/standby teaming, which does not require a CLAG bond configuration. Their ports are configured as normal access ports, so without ifplugd a peerlink failure would leave these ports up.

ifplugd needs to be installed and configured on both switches running CLAG. Follow the steps below.

Install ifplugd service:

sudo apt-get update
sudo apt-get install ifplugd

Edit the file /etc/default/ifplugd and add the lines below.

The down delay (-d10) is set to a moderate 10 seconds because ifplugd runs in combination with CLAG. Observe how it behaves and lower the value over time if possible.

INTERFACES="peerlink"
HOTPLUG_INTERFACES=""
ARGS="-q -f -u0 -d10 -w -I"
SUSPEND_ACTION="stop"
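For reference, here is the same file again with each option annotated. The flag descriptions are my reading of the ifplugd man page, so treat them as a reading aid and check man ifplugd on your release.

```shell
# /etc/default/ifplugd, annotated (same values as above)

# Interfaces that ifplugd watches permanently: the peerlink bond
INTERFACES="peerlink"
# No hotplugged interfaces are monitored
HOTPLUG_INTERFACES=""
# -q   do not run the action script when the daemon quits
# -f   ignore link detection failures and retry (treated as "no link")
# -u0  run the "up" action immediately after link beat is detected
# -d10 wait 10 seconds after link beat is lost before running the "down" action
# -w   when daemonizing, wait until the initial link detection has finished
# -I   keep running even if the action script returns a non-zero exit code
ARGS="-q -f -u0 -d10 -w -I"
# Behaviour on system suspend: stop the daemon
SUSPEND_ACTION="stop"
```

The 10 second down delay is visible in the syslog output further down: "Link beat lost" is logged about 10 seconds before the action script is executed.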

Edit the file /etc/ifplugd/action.d/ifupdown:

The variable $SWITCHPORTS defines which ports ifplugd should shut down if the peerlink goes down. We chose a custom variable instead of shutting down all ports because CLAG already takes care of the configured bonds.

#!/bin/sh

# The peerlink bond interface
PEERLINK=peerlink

# The switchports to bring down on peerlink failure
#
# enclosures 01/02: swp5..swp8
SWITCHPORTS=$(seq -f swp%g 5 8)
# storage system 01/02 : swp19..swp22
SWITCHPORTS="$SWITCHPORTS $(seq -f swp%g 19 22)"
# server1/server2: swp27..swp28
SWITCHPORTS="$SWITCHPORTS $(seq -f swp%g 27 28)"
# VMware cluster: swp35..swp38
SWITCHPORTS="$SWITCHPORTS $(seq -f swp%g 35 38)"

case "$1" in
    "$PEERLINK")
        # The role is the 8th field of the "Our Priority" line printed by clagctl
        clagrole=$(clagctl | grep "Our Priority" | awk '{print $8}')
        case "$2" in
            up | down)
                action=$2
                # Only the CLAG secondary toggles its non-CLAG ports
                if [ "$clagrole" = "secondary" ]; then
                    for interface in $SWITCHPORTS; do
                        echo "bringing $action : $interface"
                        ip link set "$interface" "$action"
                    done
                fi
                ;;
        esac
        ;;
esac
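Before trusting the action script on a production pair, it can help to see what it would do without touching any interface. The following is a minimal dry-run sketch of the same logic. The helper names clag_role and plan_actions are made up for this example, the port list is shortened to two ranges, and the sample clagctl line is an assumption about the output format that the grep/awk in the script above implies.

```shell
#!/bin/sh
# Dry-run sketch of the action script's logic; nothing here changes real interfaces.

# Build the port list the same way the action script does (shortened to two ranges).
SWITCHPORTS=$(seq -f swp%g 5 8)
SWITCHPORTS="$SWITCHPORTS $(seq -f swp%g 27 28)"

# Extract the role from clagctl-style text on stdin.
# Using the last field ($NF) instead of $8 is a choice made for this sketch.
clag_role() {
    awk '/Our Priority/ {print $NF}'
}

# Print the commands that would be run for a given role and action.
plan_actions() {
    role=$1
    action=$2
    # Like the real script, only the secondary acts.
    [ "$role" = "secondary" ] || return 0
    for interface in $SWITCHPORTS; do
        echo "ip link set $interface $action"
    done
}

# Assumed sample line; on a switch this would come from running clagctl.
sample="Our Priority, ID, and Role: 32768 44:38:39:00:00:12 secondary"
role=$(echo "$sample" | clag_role)
plan_actions "$role" down
```

With the sample line above, this prints one "ip link set ... down" line for each of swp5 to swp8, swp27 and swp28. With a primary role it prints nothing, which matches the behaviour seen on leaf-01-c in the logs below.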

Restart the ifplugd service:

sudo systemctl restart ifplugd.service

Impact of a simulated peerlink failure from the server perspective:

2017-09-19T11:43:15.665057+00:00 leaf-01-c ifplugd(peerlink)[5292]: Link beat lost.
2017-09-19T11:43:25.775585+00:00 leaf-01-c ifplugd(peerlink)[5292]: Executing '/etc/ifplugd/ifplugd.action peerlink down'.
2017-09-19T11:43:25.902637+00:00 leaf-01-c ifplugd(peerlink)[5292]: Program executed successfully.

leaf-02-c:/home/cumulus# grep ifplugd /var/log/syslog
[...]
2017-09-19T11:43:15.780727+00:00 leaf-02-c ifplugd(peerlink)[12600]: Link beat lost.
2017-09-19T11:43:25.891584+00:00 leaf-02-c ifplugd(peerlink)[12600]: Executing '/etc/ifplugd/ifplugd.action peerlink down'.
2017-09-19T11:43:26.107140+00:00 leaf-02-c ifplugd(peerlink)[12600]: client: bringing down : swp5
2017-09-19T11:43:26.146421+00:00 leaf-02-c ifplugd(peerlink)[12600]: client: bringing down : swp6
2017-09-19T11:43:26.171454+00:00 leaf-02-c ifplugd(peerlink)[12600]: client: bringing down : swp7
2017-09-19T11:43:26.193387+00:00 leaf-02-c ifplugd(peerlink)[12600]: client: bringing down : swp8

Ping from the server during the peerlink failure:

64 bytes from 8.8.8.8: icmp_seq=1623 ttl=59 time=0.524 ms
64 bytes from 8.8.8.8: icmp_seq=1624 ttl=59 time=0.782 ms
64 bytes from 8.8.8.8: icmp_seq=1625 ttl=59 time=0.847 ms
Request timeout for icmp_seq 1626
Request timeout for icmp_seq 1627
Request timeout for icmp_seq 1628
Request timeout for icmp_seq 1629
Request timeout for icmp_seq 1630
Request timeout for icmp_seq 1631
Request timeout for icmp_seq 1632
Request timeout for icmp_seq 1633
Request timeout for icmp_seq 1634
Request timeout for icmp_seq 1635
Request timeout for icmp_seq 1636
Request timeout for icmp_seq 1637
Request timeout for icmp_seq 1638
64 bytes from 8.8.8.8: icmp_seq=1639 ttl=59 time=0.701 ms
64 bytes from 8.8.8.8: icmp_seq=1640 ttl=59 time=0.708 ms
64 bytes from 8.8.8.8: icmp_seq=1641 ttl=59 time=0.780 ms
64 bytes from 8.8.8.8: icmp_seq=1642 ttl=59 time=0.781 ms

Impact of reconnecting the peerlink from the server perspective:

leaf-01-c:/home/cumulus# grep ifplugd /var/log/syslog
[...]
2017-09-19T11:48:22.190187+00:00 leaf-01-c ifplugd(peerlink)[5292]: Link beat detected.
2017-09-19T11:48:22.290481+00:00 leaf-01-c ifplugd(peerlink)[5292]: Executing '/etc/ifplugd/ifplugd.action peerlink up'.
2017-09-19T11:48:22.524673+00:00 leaf-01-c ifplugd(peerlink)[5292]: Program executed successfully.

leaf-02-c:/home/cumulus# grep ifplugd /var/log/syslog
[...]
2017-09-19T11:48:22.084477+00:00 leaf-02-c ifplugd(peerlink)[12600]: Link beat detected.
2017-09-19T11:48:22.232192+00:00 leaf-02-c ifplugd(peerlink)[12600]: Executing '/etc/ifplugd/ifplugd.action peerlink up'.
2017-09-19T11:48:22.812771+00:00 leaf-02-c ifplugd(peerlink)[12600]: client: bringing up : swp5
2017-09-19T11:48:22.816175+00:00 leaf-02-c ifplugd(peerlink)[12600]: client: bringing up : swp6
2017-09-19T11:48:22.831487+00:00 leaf-02-c ifplugd(peerlink)[12600]: client: bringing up : swp7
2017-09-19T11:48:22.836617+00:00 leaf-02-c ifplugd(peerlink)[12600]: client: bringing up : swp8

Ping from the server while the peerlink is reconnected:

64 bytes from 8.8.8.8: icmp_seq=24 ttl=59 time=0.614 ms
64 bytes from 8.8.8.8: icmp_seq=25 ttl=59 time=0.680 ms
64 bytes from 8.8.8.8: icmp_seq=26 ttl=59 time=8.932 ms
64 bytes from 8.8.8.8: icmp_seq=27 ttl=59 time=1.126 ms
64 bytes from 8.8.8.8: icmp_seq=28 ttl=59 time=2.424 ms
Request timeout for icmp_seq 29
Request timeout for icmp_seq 30
Request timeout for icmp_seq 31
Request timeout for icmp_seq 32
Request timeout for icmp_seq 33
Request timeout for icmp_seq 34
Request timeout for icmp_seq 35
64 bytes from 8.8.8.8: icmp_seq=36 ttl=59 time=6.491 ms
64 bytes from 8.8.8.8: icmp_seq=37 ttl=59 time=1.045 ms
64 bytes from 8.8.8.8: icmp_seq=38 ttl=59 time=1.244 ms

Yes, it takes a few seconds for your server to reconnect if you have a peerlink failure (13 lost pings on failure and 7 on reconnect in the test above), but it is very important to keep the datacenter network operational.
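To put a number on the outage, you can look at the gap in the icmp_seq values of the replies. The snippet below is a small sketch that does this for ping output read from stdin. max_gap is a name made up for this example, and the gap only translates to seconds if ping sends one request per second (the usual default).

```shell
#!/bin/sh
# Sketch: report the largest run of missing replies in ping output on stdin.
# Only reply lines ("icmp_seq=N") are parsed, so timeout lines are ignored.
max_gap() {
    awk -F'icmp_seq=' '
        /icmp_seq=/ {
            split($2, a, " ")            # a[1] is the sequence number
            seq = a[1] + 0
            if (seen && seq - prev - 1 > gap) gap = seq - prev - 1
            prev = seq
            seen = 1
        }
        END { print gap + 0 }'
}

# Last reply before and first reply after the failure, taken from the output above.
printf '%s\n' \
    '64 bytes from 8.8.8.8: icmp_seq=1625 ttl=59 time=0.847 ms' \
    'Request timeout for icmp_seq 1626' \
    '64 bytes from 8.8.8.8: icmp_seq=1639 ttl=59 time=0.701 ms' | max_gap
# prints 13
```

Roughly 10 of those 13 seconds come from the -d10 down delay, which is why lowering that value is worth testing.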

For more information have a look at the Cumulus Linux documentation: https://docs.cumulusnetworks.com/display/DOCS/ifplugd

Cumulus Linux non-disruptive upgrade procedure on MLAG pairs

I thought it would be useful to document the exact procedure for a non-disruptive upgrade of Cumulus Linux MLAG (CLAG) pairs. I find the online documentation "Upgrading Cumulus Linux" a bit short on the order in which you have to upgrade the switches in a CLAG pair to keep traffic disruption minimal.

The following procedure worked for me on Dell S4048-ON and Dell S3048-ON switches:

  • On both switches, run the following command to refresh the package index of the apt repository:
sudo apt-get update
  • Run the following command to determine which switch is the CLAG primary and which is the CLAG secondary:
sudo net show clag

Start upgrading the secondary CLAG member:

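  • Shut down all interfaces except the peerlink using the command below (adjust the port range to your platform so that it does not include the peerlink member ports). This forces all traffic through the other switch:
echo swp{1..52} | tr ' ' '\n' | sudo xargs -I {} ip link set {} down

Ping from the server while the ports are shut down:

64 bytes from 8.8.8.8: icmp_seq=8903 ttl=59 time=1.106 ms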
64 bytes from 8.8.8.8: icmp_seq=8904 ttl=59 time=0.974 ms
64 bytes from 8.8.8.8: icmp_seq=8905 ttl=59 time=1.643 ms
64 bytes from 8.8.8.8: icmp_seq=8906 ttl=59 time=0.869 ms
Request timeout for icmp_seq 8907
64 bytes from 8.8.8.8: icmp_seq=8908 ttl=59 time=1.256 ms
64 bytes from 8.8.8.8: icmp_seq=8909 ttl=59 time=0.769 ms

(Rollback) If you see problems, revert the change with the command below:

echo swp{1..52} | tr ' ' '\n' | sudo xargs -I {} ip link set {} up

Wait one minute for CLAG to stabilise and verify network communication with the remaining switch.
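The one-liner above hits every port in the range, so the range has to be chosen so that the peerlink members stay up. As an alternative, here is a small sketch that builds the list with explicit exclusions and only prints the commands. shutdown_plan is a hypothetical helper, and the port count and excluded ports in the example call are placeholders, not values from my setup.

```shell
#!/bin/sh
# Sketch: print (not run) "ip link set swpN down" for swp1..swp<last>,
# skipping every port named in the space-separated exclude list.
shutdown_plan() {
    last=$1
    exclude=" $2 "                # pad with spaces so whole names can be matched
    for n in $(seq 1 "$last"); do
        case "$exclude" in
            *" swp$n "*) continue ;;   # excluded, e.g. a peerlink member
        esac
        echo "ip link set swp$n down"
    done
}

# Placeholder example: six ports, with swp5 and swp6 as peerlink members.
shutdown_plan 6 "swp5 swp6"
```

Once the printed list looks right, it can be piped to sudo sh to actually run it.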

  • Perform a clean shutdown of clagd on this switch:
sudo systemctl stop clagd

(Rollback) If you see problems, start clagd again:

sudo systemctl start clagd

Wait one minute for CLAG to shut down cleanly.

  • Shut down the peerlink bond:
sudo ip link set peerlink down

(Rollback) If you see problems, enable the peerlink again:

sudo ip link set peerlink up

  • Perform the upgrade using the command:
sudo apt-get upgrade

A clean shutdown of all ports is important because the bridge and the peerlink bounce during the package upgrade, which could affect network communication if it happens in an uncontrolled way.

  • Reboot the switch using the command:
sudo reboot

Wait for the upgraded switch to come up. This will cause a short outage in traffic.

64 bytes from 8.8.8.8: icmp_seq=9443 ttl=59 time=1.069 ms
64 bytes from 8.8.8.8: icmp_seq=9444 ttl=59 time=1.150 ms
64 bytes from 8.8.8.8: icmp_seq=9445 ttl=59 time=0.993 ms
64 bytes from 8.8.8.8: icmp_seq=9446 ttl=59 time=1.331 ms
Request timeout for icmp_seq 9447
Request timeout for icmp_seq 9448
Request timeout for icmp_seq 9449
64 bytes from 8.8.8.8: icmp_seq=9450 ttl=59 time=1.539 ms
64 bytes from 8.8.8.8: icmp_seq=9451 ttl=59 time=0.908 ms
64 bytes from 8.8.8.8: icmp_seq=9452 ttl=59 time=1.166 ms
64 bytes from 8.8.8.8: icmp_seq=9453 ttl=59 time=1.261 ms

Wait until the network is functioning normally again.
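Instead of watching the ping by hand, the waiting can be scripted. The following is a minimal sketch of a retry loop; wait_until and INTERVAL are names made up for this example, and the probe command is whatever you consider proof that the switch is back.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or the attempts run out.
INTERVAL=${INTERVAL:-1}   # seconds between attempts

wait_until() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0          # probe succeeded
        tries=$((tries - 1))
        sleep "$INTERVAL"
    done
    return 1                      # gave up
}

# Trivial probe so the example runs anywhere.
wait_until 3 true && echo "probe succeeded"
```

A realistic call could look like wait_until 300 ping -c 1 -W 1 192.0.2.12, where 192.0.2.12 is a placeholder for the management address of the rebooted switch.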

On the secondary, run the following command to take over the primary role:

sudo clagctl priority 0

Wait one minute for CLAG to fail over.

Verify that the CLAG handover has occurred:

sudo net show clag

Repeat the steps above on the new secondary (the old primary), starting with shutting down all interfaces.

Once finished, you need to reset the CLAG priority on the primary to its configured default value.
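As a sketch of that last step, the helper below reads ifupdown2-style configuration text and prints the clagctl command that would restore the configured priority. restore_priority_cmd is a name made up for this example. It assumes the priority is set with a clagd-priority line and falls back to 32768, which to my knowledge is the clagd default, when no such line exists.

```shell
#!/bin/sh
# Sketch: print the clagctl command that restores the configured CLAG priority.
# Reads interface configuration text on stdin; does not change anything itself.
restore_priority_cmd() {
    prio=$(awk '$1 == "clagd-priority" {print $2; exit}')
    echo "clagctl priority ${prio:-32768}"
}

# Sample configuration with a hypothetical priority of 1000.
printf 'iface peerlink.4094\n    clagd-priority 1000\n' | restore_priority_cmd
```

On a switch, the input would be the real configuration, for example restore_priority_cmd < /etc/network/interfaces, and the printed command would then be run with sudo.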

Read my next post, how to rollback if an upgrade failed: Cumulus Linux Snapshot Rollback