CARP, ifstated, and firewalls

The Problem

When using CARP to support failover on multiple firewalls, the protocol itself can be insufficient. Consider the following configuration:

Normal CARP Cluster

Traffic flows through the upper firewall. The lower firewall simply listens for CARP announcements on both interfaces and, seeing the announcements from the upper firewall, stays silent. If the upper firewall fails, within seconds the lower one will notice the absence of advertisements and take over the CARP'd IP addresses on both sides. Service continues as normal. But a problem arises if there is a network fault that affects only the master firewall. For example, consider a failure of the switch port on one side of the upper firewall:

Unsync'd CARP Cluster

If the network link state does not change, neither firewall has any idea of where the problem is. The lower firewall no longer hears CARP announcements on this side, so it becomes master and takes over the shared address. Incoming packets are now routed in through the lower firewall. However, the upper firewall still thinks it is master on both interfaces, as it is not receiving any announcements from the lower one. Outgoing traffic continues to be routed to the upper firewall, and is effectively blackholed.
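The split can be reproduced with a small model. The following is only a sketch under simplifying assumptions: a node stays backup on a segment only while it hears a peer with a lower advskew, preemption is assumed, and timers are ignored. The function name, the segment names "external" and "internal", and the choice of the external side as the failed one are illustrative; the advskew values are the ones used in the configuration described later.

```python
def carp_masters(nodes, heard):
    """Return, per segment, the set of nodes that consider themselves master.

    nodes: dict of node name -> advskew (lower is preferred).
    heard: dict of segment -> set of (listener, speaker) pairs, meaning
           `listener` receives `speaker`'s announcements on that segment.
    """
    result = {}
    for segment, links in heard.items():
        masters = set()
        for name, skew in nodes.items():
            # A node yields only if it actually hears a better-placed peer.
            hears_better = any(
                peer != name and peer_skew < skew and (name, peer) in links
                for peer, peer_skew in nodes.items()
            )
            if not hears_better:
                masters.add(name)
        result[segment] = masters
    return result


NODES = {"upper": 10, "lower": 100}

# The upper firewall's switch port on the external side has failed,
# so no announcements cross that segment in either direction.
heard = {
    "external": set(),
    "internal": {("upper", "lower"), ("lower", "upper")},
}
split = carp_masters(NODES, heard)
```

In this model both firewalls claim the external address while only the upper one holds the internal address, which is the unsynchronised situation described above.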

The Solution

To solve the problem of unsynchronised CARP interfaces, we need some alternate test of connectivity, and some method of ensuring that the working system is master on all network segments. ifstated(8) can be used to do this.

The configuration in the sample ifstated.conf defines four main states; each firewall actually switches between two of them. During normal operation, the upper firewall is in the master state, with an advskew of 10, and the bottom firewall is in the backup state, with an advskew of 100.
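To see why the advskew values decide who is master: CARP adds advskew, measured in 1/256ths of a second, to the base advertisement interval (advbase), and the host that advertises more frequently is preferred. The sketch below only does that arithmetic. It assumes an advbase of 1 second, which the text does not state, and the function name is illustrative.

```python
def advert_interval(advbase, advskew):
    """Seconds between CARP advertisements: advbase plus advskew/256."""
    return advbase + advskew / 256


# advbase=1 is an assumption; the advskew values come from the text.
upper_interval = advert_interval(1, 10)    # upper firewall, master state
lower_interval = advert_interval(1, 100)   # lower firewall, backup state

# The host with the shorter interval is the preferred master.
preferred = "upper" if upper_interval < lower_interval else "lower"
```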

When the bottom firewall detects that the link states of the carp(4) interfaces are no longer synchronised, it uses ping to see if it still has connectivity on both sides. If it does, then the failure must be in the upper firewall's connectivity. It changes to the promoted state, and changes the advskew on all its carp interfaces to 0 in order to become master of all the carp interfaces.

The upper firewall has a similar state change: it conducts connectivity tests on a regular basis; if it detects a loss of connectivity, or a loss of carp(4) link state synchronisation, it moves to a demoted state, setting the advskew on all its interfaces to 254. It continues to monitor connectivity, and returns to master if connectivity is restored.
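The transitions described above can be summarised as a small state machine. This is a sketch of the logic only, not the ifstated.conf itself: the function name and the two boolean inputs are illustrative, and because the text does not say when a promoted firewall returns to backup, that state is left unchanged here.

```python
# advskew applied on entering each state, as given in the text.
ADVSKEW = {"master": 10, "demoted": 254, "backup": 100, "promoted": 0}


def next_state(state, carp_synced, connectivity_ok):
    """Return the next state of one firewall after one round of checks.

    carp_synced:     all local carp interfaces report the same link state.
    connectivity_ok: the ping tests succeed on both sides.
    """
    if state == "master":
        # Upper firewall: step aside on lost connectivity or lost sync.
        if not connectivity_ok or not carp_synced:
            return "demoted"
    elif state == "demoted":
        # Upper firewall: come back once connectivity is restored.
        if connectivity_ok:
            return "master"
    elif state == "backup":
        # Lower firewall: take over only if its own connectivity is intact.
        if not carp_synced and connectivity_ok:
            return "promoted"
    # "promoted": return condition not described in the text; stay put.
    return state


# The switch-port failure from the earlier example, seen from each side.
upper = next_state("master", carp_synced=True, connectivity_ok=False)
lower = next_state("backup", carp_synced=False, connectivity_ok=True)
```

With these inputs the upper firewall ends up demoted (advskew 254) and the lower one promoted (advskew 0), so the lower firewall wins the election on every segment.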

NOTE: It is possible to operate such a failover pair without the master-to-demoted state transitions, and in fact without running ifstated(8) on the upper firewall at all. However, including these transitions is required for real redundancy: it ensures that correct failover will still take place if there is a problem with ifstated(8) on the lower firewall.

Future Improvement

There are alternatives for both the connectivity test and forcing the failed system to relinquish the carp addresses on all interfaces, which may be less error prone in some situations:

If pfsync(4) is being used to synchronise states between the two firewalls, a pfsync watchdog could watch for pfsync messages which indicate that traffic is passing through the other firewall correctly. If it detects that no more traffic is flowing end-to-end through the other system, it considers that system down, and ifstated can take action.

Taking over the carp addresses from the failed firewall by modifying the advskew has problems as well. Firstly, it assumes that the carp protocol itself is functioning, with no software failure and no filtering of carp packets on the network. Secondly, if there is a second failure while a system is in the promoted state, such as a failure in ifstated on that system, other systems will not be able to force it to give up the CARP'd addresses. Therefore, it would be desirable to have STONITH (Shoot The Other Node In The Head) capability as a last resort for breaking deadlocks. A number of different mechanisms have been proposed, including using WoL (Wake on LAN), or sending a BREAK to the serial console on the other box, dropping it to the debugger.