Cisco ACI Best Practices: Upgrade your Fabric with Confidence

Cisco first launched the Application Centric Infrastructure (ACI) in November of 2014. Since that launch, the solution has proven to be a tremendous success in the Data Center. I don’t say this to blow our own horn, but rather to make the point that in the past 8 years, Cisco ACI has been widely deployed by customers large and small (and every size in between) across any vertical or industry you can think of. Internally, our engineering team has done a tremendous amount of work to bring new features, capabilities, and topologies at a very rapid pace, all while fixing bugs and addressing security concerns as they are discovered.

The result of such a large install base and choice of software releases is that over time we find every possible mix of hardware and software version, feature, and deployment type. The question I ask myself is this: “Are customers realizing the fullest potential and best outcomes with their investment in ACI?” In many cases, I can say yes. But there is still room for improvement. We see many customers on what I would consider older code. This is not such a bad thing, but it makes me wonder why. I have a few assumptions. Maybe upgrading ACI is seen as complex, or maybe it takes too long, or perhaps the confidence and knowledge in the process isn’t there yet (after all, we don’t upgrade every day). I can sympathize. ACI fabrics are the foundation of all the important and business-critical workloads that run our customers’ businesses. Upgrades should be approached with planning and care and should be designed for zero to near-zero disruption. Furthermore, there is a constant balance between feature velocity and code maturity, so no single approach fits all customers.

If you are with me so far, I have some good news to share on a few fronts.

New Software Lifecycle and Cadence

One of the most asked questions we get is “What version of code do you recommend I should be running?” 

That question can sometimes make me sweat a little bit because every customer’s datacenter is unique and built to solve specific requirements and needs. Everyone’s configuration is different enough that there may not be a one-size-fits-all answer. As with anything in IT, it depends.

Imagine a range of customers where on one end you have a profile that cares more about features and capabilities. We have many of these types of customers, some of them quite large and sophisticated. They move fast and prefer to push the boundaries of what is possible because it tends to give them an edge in what they are trying to achieve. On the other end we have a customer profile that is mostly concerned with uptime and stability. This type is careful and risk-averse, with very good reason: mission-critical workloads cannot afford any chance of interruption or inconsistency.

Internally, we’ve come up with a new approach that offers a choice to satisfy both types.  With ACI version 6.0, we will introduce a new release cadence (see figure 1).

Figure 1: New ACI and NX-OS Release Cadence

The general idea is to provide clear version lifecycle visibility with consistent timing for when we add or enhance features versus when we are strictly identifying and fixing bugs.

Each major release (6.0, 6.1, 7.0 and beyond) will have a pre-defined lifetime of 4 years.  This way everyone knows upfront where they may be in the cycle with a lot of time to plan for future upgrades when it makes sense to do so.  Furthermore, within each major release, the first 12 months will be all about introducing or enhancing features.  Our engineering teams publish point releases every 3-4 months on average.  The result is that 6.0.1, 6.0.2 and 6.0.3 will all be feature releases.  This is great for those customers who desire features most.  Once we pass that year mark, we will move into a maintenance cycle where we no longer introduce features but focus solely on fixing bugs, enhancing stability and hardening security.

In parallel, we are working on the next major release, which follows the same pattern but is staggered to release a year later. If you are a customer who wants features first, you can choose to move up to the next major release (from 6.0.x to 6.1.x), but if you prioritize code stability first and foremost, you can continue with the current release for the remainder of its lifetime. Customers can then upgrade years later, when those newer major releases have moved into their respective maintenance cycles (and thus get features and stability as they do so).
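
The cadence described above comes down to simple date arithmetic. The following is only a minimal sketch in Python, assuming the two numbers stated in this article (a 12-month feature window and a 4-year lifetime). The function names and the dates in the example are hypothetical placeholders, not actual release dates; check the official release notes for the real ones.

```python
from datetime import date

FEATURE_MONTHS = 12   # from the article: the first 12 months bring feature releases
LIFETIME_MONTHS = 48  # from the article: each major release has a 4-year lifetime


def months_between(start, today):
    """Whole months elapsed from start to today (negative if today is earlier)."""
    months = (today.year - start.year) * 12 + (today.month - start.month)
    if today.day < start.day:
        months -= 1  # the current month is not complete yet
    return months


def release_phase(first_shipped, today):
    """Classify where a major release sits in its lifecycle on a given day."""
    age = months_between(first_shipped, today)
    if age < 0:
        return "not yet released"
    if age < FEATURE_MONTHS:
        return "feature"
    if age < LIFETIME_MONTHS:
        return "maintenance"
    return "end of lifetime"


if __name__ == "__main__":
    # Hypothetical ship date for a major release, for illustration only.
    shipped = date(2023, 1, 1)
    print(release_phase(shipped, date(2024, 3, 1)))
```

A stability-first customer would wait until a release reports "maintenance" before moving to it, while a features-first customer would move while it still reports "feature".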

Upgrade Best Practices

When the time comes to actually do an upgrade, it is best to plan accordingly and go into it with eyes open for the best results. Over the years, Cisco has published many documents and technotes detailing the process. One of the things we’ve realized is that these documents were not all gathered in the same place online, which made it hard for customers to have all the info they might need at their fingertips. In the last year, we’ve reorganized, updated and collected everything related to upgrades and made it available from one landing page.

Even better, we’ve created an online checklist that details each step in the process with links to more information about that step (see figure 2).  This makes it a lot easier to plan, prepare and do the upgrade with minimal or even no downtime.  Following this checklist is the upgrade best practice and we strongly encourage its use.

Figure 2: Cisco ACI Upgrade Checklist

Finally, to help add more color and share experiences, we’ve been delivering webinars to customers and partners about ACI upgrade best practices.   We’ve posted the video recordings of such events in multiple places.

Check out the On-demand webinars for Customers.

Partners can view the video, PIW – Cisco ACI Upgrade Best Practices (8th June).

Useful Tools To Help You Upgrade

The last bit of good news on this topic is that we’ve released a few useful tools that can add more visibility, pre-checks and guidance.  I’ll share details about three items here.

  1. On our DC App Center Portal, we’ve included an app called the Pre-Upgrade Validator. This is a free app that you can install and run right on APIC. It offers an easy and visual way to run a pre-check of various aspects of your fabric against the version of code you are planning to upgrade to. While not exhaustive, it includes checks for faults and for deviations from commonly recommended configurations (like nodes that are not in a vPC pair).
  2. On Cisco’s GitHub repository for Datacenter we’ve published the ACI-Pre-Upgrade-Validation-Script. This is a free Python script that you can copy to your APIC and run from the CLI. Don’t worry if you are not familiar with Python; the process is extremely easy and well documented at the link above. This script is in the same spirit as the visual application from the DC App Center. However, the script runs a number of added checks and is more frequently updated. If you have your own GitHub account, you can even open feature requests for added checks that you want and our developers will consider them. Both the app and script are fully supported by Cisco. I prefer the script given it can do a bit more.
  3. Nexus Dashboard Insights (see figure 3) – The Firmware Update Analysis feature of Nexus Dashboard Insights is designed specifically to address the many operational details in your environment and where they intersect with an upgrade. I’d say this is the most comprehensive tool, and I recommend it if you have Nexus Dashboard Insights deployed in your environment. It goes a fair bit deeper than the other tools I mentioned because it leverages more of the correlation and machine learning that is at the core of the platform. It performs detailed checks before and after an upgrade, including a review of available versions with an eye on relevant bugs, with links to bug details and release notes. It records the health, policy, and operational states of your fabric before the upgrade, and then runs an additional delta analysis after the upgrade to see if anything has changed or is not as expected. If something is amiss, Nexus Dashboard Insights will let you dig in and quickly learn where, what, and when, and even get recommendations on how to correct things.

Figure 3: Firmware Update Analysis in Nexus Dashboard Insights
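
To make the idea of a before-and-after delta analysis concrete, here is a minimal sketch in Python. It is not how Nexus Dashboard Insights works internally (there is no correlation or machine learning here); it only compares two plain dictionaries. The snapshot keys and values are hypothetical, and gathering real state from a fabric is left out entirely.

```python
def snapshot_delta(before, after):
    """Compare two state snapshots and report what was added, removed, or changed."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    # Hypothetical snapshots; the keys and values are illustrative only.
    pre_upgrade = {"leaf-101/health": 100, "leaf-102/health": 98, "critical-faults": 0}
    post_upgrade = {"leaf-101/health": 100, "leaf-102/health": 91, "critical-faults": 2}

    delta = snapshot_delta(pre_upgrade, post_upgrade)
    for key, (old, new) in sorted(delta["changed"].items()):
        print(f"{key}: {old} -> {new}")
```

Anything that shows up under "changed" or "removed" after the upgrade is a starting point for investigation; an empty delta suggests the fabric came back in the state it was in before.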

If you want to know more about applications like Nexus Dashboard Insights, this is a good place to start:  https://www.cisco.com/go/nexusinsights

Final Thoughts

Upgrading your ACI Fabric has never been easier.  You can approach an upgrade with intelligence, insight, and a clear plan.  There is no reason not to upgrade to the most recent version you are comfortable with.   You gain features, stability, security and ultimately realize the best return on your investment in Cisco ACI.  Happy upgrading!

Author: Joseph Ezerski
