Time to continue our deep dive into the NSX Edge Node VMs. In the previous posts we talked about the relationship between the physical NICs and the Edge Node VMs, and then about the relationship between the Tier-0 and the uplinks it should use, called deterministic peering.
Today we’re going to look at one very specific detail: the teaming and failover policy for the Edge Node vNIC trunk portgroups in vCenter.
Note: the situation I'm discussing here only applies when the Edge Node VM TEP VLAN is separate from the ESXi TEP VLAN, as is the current best practice. If you are using the same VLAN for both ESXi and Edge Node VM TEPs, this post doesn't apply, because in that design NSX Segments are used instead of vCenter Portgroups.
Portgroups for VMs
The Edge Node VMs are VMs like any other. They have vNICs which must be attached to a portgroup in vCenter to provide connectivity.
To provide redundancy, each portgroup has a teaming and failover configuration applied, configured through vCenter, to determine which physical uplinks should be used and in what role (active, standby, or unused).
But what has always confused me is: how does this apply when we take into consideration the named teaming policy and failover we configured in NSX itself? You know, what we defined in our uplink policy, our transport zones, and our uplink segments?
So, why should we configure it like below?
And why don’t we put the unused NICs in the ‘unused’ category since we defined them to not be used in our named teaming policy? If we look at the Edge Uplink Profile configuration in NSX, it seems redundant:
But wait, there’s more
The named teaming policy we defined only applies to the upstream Tier-0 Gateway connections. But that is only half of the work the Edge Node VM does; it also receives GENEVE traffic on the TEPs on those same interfaces.
So while the named teaming policies apply to the upstream VLAN connections, the portgroup active/standby settings apply to the TEP downstream Overlay connections.
The TEPs are pinned to the vNICs of the Edge Node VM and should always be active. If, for example, pNIC0 fails and takes the active uplink with it, the TEP isn’t notified and stays active! So if it is attached to a portgroup that doesn’t have a standby uplink configured, the traffic is blackholed.
This also follows the unsupported failure scenario as described by Shashank Mohan in this blog post. A vNIC to the Edge Node VM should never be disconnected. This is a state that could never happen, as this is a connection between two virtual components. Thus it makes sense that there aren’t any mechanisms in place to recover from this either.
You’d think that the pNIC failure would be pushed down through the N-VDS and mark the TEP as unavailable, but that doesn’t happen. There is no communication to the N-VDS when there is a physical NIC failure.
That’s why you always want to have a standby uplink defined in the vCenter uplink trunk portgroup, to allow the TEP to have a redundant connection in the case of a physical NIC failure.
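To make the failure scenario a bit more concrete, here is a minimal sketch in Python. It is a toy model, not NSX or vCenter code: the class `TrunkPortgroup`, the function `tep_traffic_path`, and the names `edge-trunk-a`, `uplink1` and `uplink2` are all made up for illustration. It only assumes what is described above: the TEP keeps sending regardless of pNIC state, and the portgroup failover order decides whether that traffic still has somewhere to go.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TrunkPortgroup:
    """Hypothetical model of a trunk portgroup's teaming and failover order."""
    name: str
    active: list[str]
    standby: list[str] = field(default_factory=list)

    def select_uplink(self, failed: set[str]) -> Optional[str]:
        """Return the first healthy uplink in failover order, or None if none is left."""
        for uplink in self.active + self.standby:
            if uplink not in failed:
                return uplink
        return None


def tep_traffic_path(portgroup: TrunkPortgroup, failed: set[str]) -> str:
    # The TEP is not told about the pNIC failure, so it keeps sending.
    # Whether the traffic gets anywhere depends only on the portgroup.
    uplink = portgroup.select_uplink(failed)
    if uplink is None:
        return "blackholed"
    return f"forwarded via {uplink}"


without_standby = TrunkPortgroup("edge-trunk-a", active=["uplink1"])
with_standby = TrunkPortgroup("edge-trunk-a", active=["uplink1"], standby=["uplink2"])

failed = {"uplink1"}  # simulates pNIC0 failing and taking uplink1 with it
print(tep_traffic_path(without_standby, failed))  # blackholed
print(tep_traffic_path(with_standby, failed))     # forwarded via uplink2
```

With only an active uplink the TEP traffic has no path left after the failure, while the portgroup with a standby uplink simply moves it to the second uplink.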
Big shout out to my colleague Mitchel for prompting me to delve into this, and Marco for helping me figure it all out!
The original article was posted on: significant-bit.com