Get Your HAT On: High Availability Thinking
High Availability (HA) in today’s campus networks has quickly changed from a want-to-have to a must-have. As more and more real-time applications converge onto our data infrastructure, our dependence on that environment is increasing dramatically.
High availability can be built in at many different levels, but for today let’s start with the network core. What follows applies both to a three-tier hierarchical design with separate access, distribution, and core switches and to a collapsed-core design, in which the core switches also provide the distribution functionality in the same tier. We see the collapsed-core approach fairly often, depending on the size of the environment.
The vendor you have chosen for your switching/routing needs will dictate which options you have for an HA core. Broadly speaking, we have two overall strategies to work with in the core. The first is a traditional HA pair running a gateway redundancy protocol such as VRRP. This approach is still widely used today. In a nutshell, one of the two core switches acts as the primary default gateway for host devices. If that primary core switch fails, the second core switch becomes active. Be aware, though, that there will be some minor failover time while this occurs. We can minimize it further with secondary supervisors in each chassis, at a higher cost.
On top of this, if we are running L2 access switches, spanning-tree convergence times can also affect real-time application performance. Because each access switch should be dual-homed to both core switches, we are faced with loop avoidance, which means running some form of Spanning Tree algorithm. In this traditional model one of the two uplinks is active and the second is blocked by Spanning Tree. If the primary link fails we have redundancy, but depending on which mode of Spanning Tree you are running, there will be failover time associated with the convergence.
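To put rough numbers on the two failover delays described above, here is a minimal Python sketch. It assumes the VRRPv3 timer formula from RFC 5798 and the classic 802.1D Spanning Tree default timers (Max Age of 20 seconds, Forward Delay of 15 seconds). The function names are made up for illustration, and real failover times depend on your vendor, your timer settings and the Spanning Tree mode in use. Rapid Spanning Tree, for example, typically converges far faster than the classic timers modeled here.

```python
def vrrp_master_down_interval(priority: int, advert_interval_s: float = 1.0) -> float:
    """Seconds a backup gateway waits without hearing the master before taking over.

    Uses the VRRPv3 (RFC 5798) formula:
        Skew_Time            = ((256 - priority) * advert_interval) / 256
        Master_Down_Interval = 3 * advert_interval + Skew_Time
    """
    if not 1 <= priority <= 254:
        raise ValueError("backup priority must be between 1 and 254")
    skew_time = ((256 - priority) * advert_interval_s) / 256
    return 3 * advert_interval_s + skew_time


def classic_stp_reconvergence_s(
    max_age_s: float = 20.0,
    forward_delay_s: float = 15.0,
    direct_failure: bool = True,
) -> float:
    """Rough time for a blocked uplink to start forwarding under classic 802.1D.

    A port spends one Forward Delay in listening and one in learning.
    For an indirect failure the switch first waits for Max Age to expire.
    """
    listening_plus_learning = 2 * forward_delay_s
    if direct_failure:
        return listening_plus_learning
    return max_age_s + listening_plus_learning


if __name__ == "__main__":
    # Assumed example values: backup priority 100, 1-second advertisements.
    print(f"VRRP takeover after ~{vrrp_master_down_interval(100):.2f} s")
    print(f"Classic STP, direct link failure: ~{classic_stp_reconvergence_s():.0f} s")
    print(f"Classic STP, indirect failure: ~{classic_stp_reconvergence_s(direct_failure=False):.0f} s")
```

Under these assumed defaults, the gateway takeover takes a few seconds while classic Spanning Tree reconvergence takes tens of seconds. That gap is why the Spanning Tree mode matters so much for real-time applications.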
The second means of providing an HA core, which we are seeing more and more of, involves two core switches that logically appear as one core switch. The various vendors that offer this type of technology each call it something different, but in the end it solves several problems. The first and most important is that it lets us avoid the Spanning Tree issues mentioned above. From a downstream/access switch perspective, we still dual-home these switches, but now the core appears as one logical switch. We can therefore bundle the two uplinks with LACP so that both links are active, rather than having a one-gig (or ten-gig) connection sitting there in a blocked state. Fiber cabling is expensive, and so is ten-gig Ethernet, so an idle uplink is money wasted. This approach lets us maximize our performance and avoids the loop-convergence issue discussed above.
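To illustrate the difference between a blocked uplink and a dual-active bundle, here is a small Python sketch. It is a toy model only. The hash function, the choice of MAC addresses as hash input and all of the names are assumptions made for illustration, and actual switches use vendor-specific hashing options. LACP itself only negotiates the bundle, while the hash-based spreading of traffic is done by the link aggregation logic on the switch.

```python
import zlib
from typing import Sequence


def pick_member_link(src_mac: str, dst_mac: str, active_links: Sequence[str]) -> str:
    """Choose one active uplink for a flow using a deterministic hash.

    Every frame of a given flow lands on the same link, so frames stay in order.
    """
    if not active_links:
        raise ValueError("at least one active link is required")
    key = f"{src_mac}->{dst_mac}".encode()
    return active_links[zlib.crc32(key) % len(active_links)]


def usable_uplink_gbps(link_speed_gbps: float, links: int, all_active: bool) -> float:
    """Aggregate uplink capacity: every link when bundled, only one when the rest are blocked."""
    return link_speed_gbps * (links if all_active else 1)


if __name__ == "__main__":
    uplinks = ["uplink-to-core-1", "uplink-to-core-2"]  # hypothetical port names
    flows = [
        ("00:11:22:33:44:01", "00:aa:bb:cc:dd:01"),
        ("00:11:22:33:44:02", "00:aa:bb:cc:dd:01"),
        ("00:11:22:33:44:03", "00:aa:bb:cc:dd:02"),
    ]
    for src, dst in flows:
        print(src, "->", dst, "uses", pick_member_link(src, dst, uplinks))

    print("Spanning Tree, one uplink blocked:", usable_uplink_gbps(10, 2, all_active=False), "Gbps")
    print("Link aggregation, both active:", usable_uplink_gbps(10, 2, all_active=True), "Gbps")
```

Note that in this model a single flow still rides on one member link, so the gain shows up as aggregate capacity across many flows rather than as a faster individual conversation.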
There are other ways to optimize your environment to reduce or eliminate convergence times while still providing HA, such as routing at the access layer, but that is a conversation for another day.
If you’re looking for more information on Network Optimization, register for our September 19th, 2012 webinar!