Cluster Failover
(17 January 2012) TL;DR: different parts of the cluster can be on different nodes.
It isn't easy to figure out which node is actually active in the cluster if you are just looking at the front panel. Here's an example. We're configured like so:
chassis {
    cluster {
        reth-count 3;
        redundancy-group 0 {
            node 0 priority 100;
            node 1 priority 1;
        }
        redundancy-group 1 {
            node 0 priority 100;
            node 1 priority 1;
            interface-monitor {
                ge-0/0/5 weight 255;
                ge-5/0/5 weight 255;
                ge-0/0/6 weight 255;
                ge-5/0/6 weight 255;
                ge-0/0/7 weight 255;
                ge-5/0/7 weight 255;
            }
        }
    }
}
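If you prefer the set-command view, the same stanza should come out roughly like this (it's just the hierarchy above re-expressed, so the names and numbers are the ones you've already seen):

set chassis cluster reth-count 3
set chassis cluster redundancy-group 0 node 0 priority 100
set chassis cluster redundancy-group 0 node 1 priority 1
set chassis cluster redundancy-group 1 node 0 priority 100
set chassis cluster redundancy-group 1 node 1 priority 1
set chassis cluster redundancy-group 1 interface-monitor ge-0/0/5 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-5/0/5 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-0/0/6 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-5/0/6 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-0/0/7 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-5/0/7 weight 255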
And the redundant ethernet (reth) interfaces like so:
interfaces {
    (other interfaces excluded)
    ge-0/0/5 {
        gigether-options {
            redundant-parent reth0;
        }
    }
    ge-0/0/6 {
        gigether-options {
            redundant-parent reth1;
        }
    }
    ge-0/0/7 {
        gigether-options {
            redundant-parent reth2;
        }
    }
    (other interfaces excluded)
    ge-5/0/5 {
        gigether-options {
            redundant-parent reth0;
        }
    }
    ge-5/0/6 {
        gigether-options {
            redundant-parent reth1;
        }
    }
    ge-5/0/7 {
        gigether-options {
            redundant-parent reth2;
        }
    }
    reth0 {
        redundant-ether-options {
            redundancy-group 1;
        }
        unit 0 {
            family inet {
                address 172.16.250.250/24;
            }
        }
    }
    reth1 {
        redundant-ether-options {
            redundancy-group 1;
        }
        unit 0 {
            family inet {
                address 192.168.99.1/24;
            }
        }
    }
    reth2 {
        redundant-ether-options {
            redundancy-group 1;
        }
        unit 0 {
            family inet {
                address 192.168.98.1/24;
            }
        }
    }
}
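The set-command equivalent for each reth pair should look roughly like this (reth0 shown; reth1 and reth2 follow the same pattern with their own child interfaces and addresses):

set interfaces ge-0/0/5 gigether-options redundant-parent reth0
set interfaces ge-5/0/5 gigether-options redundant-parent reth0
set interfaces reth0 redundant-ether-options redundancy-group 1
set interfaces reth0 unit 0 family inet address 172.16.250.250/24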
The routing engine (RE) uses redundancy-group 0 (this is the default in Junos). We have followed the recommendation to put the redundant ethernet interfaces (reths) in a different redundancy group. The reason is that if the RE fails over, all of the RE processes have to be started on the new primary, which increases the risk of losing RTOs (real-time objects) such as connection state. Splitting the groups means we can fail the interfaces over (because our peer switch has died, or whatever) with less risk of losing state while we transition to a degraded state.
(Note: if you have both sides of a reth plugged into the same switch, this is something of an academic distinction, as cable failure is unlikely and interface failure will probably be a symptom of catastrophic node failure. Catastrophic failure of node0 will result in the RE being initialized on node1, with the attendant risk of RTO loss, but in that circumstance having node1 provide service of any kind still leaves you better off than you'd be in a single-firewall configuration, and that's really why you have a cluster in the first place.)
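As an aside: if you want to convince yourself that session state really is being synchronised between the nodes, there is a command for that. I'm not reproducing the output here since the exact counters vary by release, but it shows RTOs sent and received per service, which is the synchronisation in action:

{primary:node0}
root@srx240-cluster> show chassis cluster statistics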
So if you check the cluster status from operational mode, you get this:
{primary:node0}
root@srx240-cluster> show chassis cluster status
Cluster ID: 1
Node                  Priority     Status     Preempt  Manual failover

Redundancy group: 0 , Failover count: 1
    node0             100          primary    no       no
    node1             1            secondary  no       no

Redundancy group: 1 , Failover count: 1
    node0             100          primary    no       no
    node1             1            secondary  no       no
In this state, the HA LED (HA for High Availability) is green.
Now, to test failover, I've pulled the cable from ge-0/0/5 on node0. Each monitored interface has weight 255 and the interface-monitor failover threshold is 255, so losing any single monitored interface is enough to fail over redundancy group 1.
Once I do this, both HA LEDs go red. The LED doesn't tell you anything more than that; you have to look at the cluster status again to see what's wrong:
{primary:node0}
root@srx240-cluster> show chassis cluster status
Cluster ID: 1
Node                  Priority     Status     Preempt  Manual failover

Redundancy group: 0 , Failover count: 1
    node0             100          primary    no       no
    node1             1            secondary  no       no

Redundancy group: 1 , Failover count: 2
    node0             0            secondary  no       no
    node1             1            primary    no       no
This tells us that for redundancy group 1, node1 is now primary. However, note that for redundancy group 0, node0 is still primary. In practical terms, traffic is now flowing through the interfaces on node1, but the routing engine (i.e. the control plane) is still on node0; you can even see a hint of this in the CLI prompt, which still says {primary:node0} because that banner reflects redundancy group 0. This is different from the all-or-nothing failover of the NetScreen SSGs we are used to.
This means that different parts of the cluster can be on different nodes.
Also note that the priority for redundancy group 1 on node0 is now 0: the monitoring threshold has been hit, so node0 is ineligible to be primary for that group until the failed interface recovers.
Looking at the interfaces tells us what's actually wrong:
{primary:node0}
root@srx240-cluster> show chassis cluster interfaces
Control link 0 name: fxp1
Control link status: Up

Fabric interfaces:
    Name   Child-interface   Status
    fab0   ge-0/0/2          up
    fab0
    fab1   ge-5/0/2          up
    fab1
Fabric link status: Up

Redundant-ethernet Information:
    Name        Status     Redundancy-group
    reth0       Up         1
    reth1       Up         1
    reth2       Up         1

Interface Monitoring:
    Interface        Weight    Status    Redundancy-group
    ge-5/0/7         255       Up        1
    ge-0/0/7         255       Up        1
    ge-5/0/6         255       Up        1
    ge-0/0/6         255       Up        1
    ge-5/0/5         255       Up        1
    ge-0/0/5         255       Down      1
...reflecting the fact that I've pulled the cable from ge-0/0/5 on node0.
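Depending on your Junos release you may also have show chassis cluster information, which keeps a short history of redundancy-group events and the reason for each failover; handy when nobody was watching the LEDs at the time. I'm going from memory on when it appeared, so check whether your release has it:

{primary:node0}
root@srx240-cluster> show chassis cluster information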
When we put the cable back in, the HA LEDs go green again, but look at the status:
{primary:node0}
root@srx240-cluster> show chassis cluster status
Cluster ID: 1
Node                  Priority     Status     Preempt  Manual failover

Redundancy group: 0 , Failover count: 1
    node0             100          primary    no       no
    node1             1            secondary  no       no

Redundancy group: 1 , Failover count: 2
    node0             100          secondary  no       no
    node1             1            primary    no       no
...note that node1 is still primary for redundancy group 1. Node0 is available for use again, as its priority is back to 100, but nothing fails back on its own. We either have to configure preemption, or manually fail redundancy group 1 back to node0 (both are sketched below). I prefer manual fail-back, because preemption can get you into a failover/fail-back loop if node0 has a soft, intermittent failure. There's usually a reason for a failover (even if it's a stupid one), and diagnosing it will be easier if you're not fighting a preemption loop at the same time.
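For completeness, the two options look roughly like this (the request commands and the preempt knob are standard Junos, but check the syntax against your release). Manual fail-back of redundancy group 1 to node0:

{primary:node0}
root@srx240-cluster> request chassis cluster failover redundancy-group 1 node 0

From memory, that leaves the manual-failover flag set against the group (and the new primary pinned at a priority of 255), so once you're happy with the result, clear it:

{primary:node0}
root@srx240-cluster> request chassis cluster failover reset redundancy-group 1

Alternatively, if you really do want automatic fail-back, enable preemption on the group in configuration mode:

set chassis cluster redundancy-group 1 preempt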