Published: April 2020
It’s tricky managing infrastructure in remote sites where we don’t have IT staff. Essential ingredients are ultra-reliable devices and fast deployment so engineers can get in and out quickly.
Until recently, in remote sites we deployed core services (DNS, DHCP, enterprise management) on standalone Cisco rack servers with VMware ESXi and local storage. Capacity was hard to manage, which slowed performance. Single points of failure made reliability lower than we wanted. And each deployment took about eight hours, which was too slow.
We found our solution for remote site infrastructure and certain production use cases in Cisco HyperFlex™ Edge and Cisco Intersight™, a cloud management service. As Customer Zero for Cisco HyperFlex Edge and Intersight, we’re the first to try out new features. That allows us to report bugs to the business unit, suggest product enhancements, and share use cases and lessons learned with customers, as we’re doing here.
When Cisco HyperFlex Edge hit the scene in 2018, we realized it would meet an immediate need for highly available backup DNS servers in Amsterdam. HyperFlex combines compute, storage, and hypervisor (VMware ESXi) in one package. “Deploying a 3-node HyperFlex system is 400% faster with Intersight—two hours instead of eight,” says Srikanth Makineni, senior infrastructure engineer. Read about our initial deployment here.
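The deployment figures in the quote can be read two ways: as a speedup ratio or as a reduction in elapsed time. A minimal sketch of that arithmetic, using only the eight-hour and two-hour figures from the quote (the function names are illustrative):

```python
def speedup_factor(old_hours: float, new_hours: float) -> float:
    """Return how many times as fast the new process is."""
    return old_hours / new_hours


def time_saved_percent(old_hours: float, new_hours: float) -> float:
    """Return the share of the original elapsed time that was eliminated."""
    return (old_hours - new_hours) / old_hours * 100


# Figures from the quote: eight hours before Intersight, two hours with it.
factor = speedup_factor(8, 2)
saved = time_saved_percent(8, 2)
print(f"{factor:.0f}x as fast, {saved:.0f}% less elapsed time")
```

Read as a ratio, two hours instead of eight is four times as fast, which is the same as 75% less elapsed time.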
By 2019, we’d deployed HyperFlex clusters around the world for the following use cases:
In 2019 we started managing HyperFlex clusters with Cisco Intersight. “With Intersight we have one pane of glass to monitor and manage any kind of server, anywhere—whether it’s a HyperFlex in an Amsterdam sales office or a blade server in San Jose,” says Joe DeSanto, member of technical staff for our infrastructure strategy team.
As Customer Zero for HyperFlex and Intersight, we asked the business unit to increase the number of nodes that can be deployed at once from three to 20. That saved us two business days when we deployed the 13- and 7-node clusters in San Jose.
Instead of installing HyperFlex software from an Open Virtual Appliance (OVA) image, we use Intersight to build HyperFlex profiles with the required policies. When it’s time to deploy a new HyperFlex cluster, we can clone an existing profile to reuse its policies, editing the cloned profile if necessary. “Having the ability to pre-stage profiles in Intersight is huge,” DeSanto says. “Cloning profiles and reusing policies cuts installation time by 60%-75%.”
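The clone-and-edit workflow can be pictured with a small data model. The sketch below is illustrative only: `ClusterProfile`, `clone_profile`, and every name in it are hypothetical stand-ins, not the Intersight schema or API. It assumes a profile is a name, a node count, and a set of references to shared policies.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ClusterProfile:
    """Hypothetical stand-in for a cluster profile (not the Intersight schema)."""
    name: str
    node_count: int
    # Maps a policy type to the name of a shared, reusable policy.
    policies: Dict[str, str] = field(default_factory=dict)


def clone_profile(source: ClusterProfile, new_name: str, **overrides) -> ClusterProfile:
    """Copy a profile, reuse its policy references, then apply any edits."""
    clone = ClusterProfile(
        name=new_name,
        node_count=source.node_count,
        policies=dict(source.policies),  # copy so edits never touch the source
    )
    for attr, value in overrides.items():
        if attr not in ("node_count", "policies"):
            raise ValueError(f"cannot override {attr!r}")
        setattr(clone, attr, value)
    return clone


# Illustrative names only.
edge = ClusterProfile(
    "edge-site-a", 3, {"network": "edge-network", "storage": "edge-storage"}
)
larger = clone_profile(edge, "datacenter-b", node_count=7)
```

Because the clone carries over the policy references unchanged, only the values that differ between sites need to be entered, which is one plausible reason pre-staged profiles save installation time.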
Deploying our first 7-node cluster took 8-10 hours, much of it spent manually uploading bootstrap files. We asked the business unit for a faster way. Realizing that anything we request will also be of value to customers, the business unit made automated deployment a top priority.
Now we can upgrade HyperFlex software and VMware ESXi with one click on the Intersight dashboard. The time to upgrade software dropped from 8-10 hours to 2-3 hours—and the engineer can walk away after initiating the process. “Automated upgrades are a great example of how our role as Customer Zero helps out other Cisco customers,” DeSanto says. “We partner with the product engineers to recommend features and also to iron out issues—so our customers don’t have to.”
Another way Intersight speeds up upgrades is by making it simpler to check whether new drivers are compatible with different operating systems. Suppose we’re updating HyperFlex software and VMware ESXi. Before we had Intersight, we’d check driver compatibility by going to the UCS Hardware and Software Compatibility matrix and manually selecting the server type, model, processor, and operating system version. The tool would generate a hardware compatibility list (HCL) that we’d export to a spreadsheet or PDF. Then we’d either write a script or look at a dashboard to find which servers were compatible.
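A script of the kind described in that manual process might look like the sketch below. It is an illustration under stated assumptions, not the script we used: the column names, the server names, and all values are made up, and a real HCL export will be laid out differently.

```python
import csv
import io

# Made-up export; a real HCL export will have different columns and values.
HCL_EXPORT = """server_model,os_version,driver,driver_version
model-a,os-7.0,net-driver,2.1
model-a,os-7.0,storage-driver,1.4
model-b,os-7.0,net-driver,2.0
"""

# Made-up inventory: one row per installed driver on each server.
INVENTORY = [
    {"server": "ams-01", "server_model": "model-a", "os_version": "os-7.0",
     "driver": "net-driver", "driver_version": "2.1"},
    {"server": "ams-01", "server_model": "model-a", "os_version": "os-7.0",
     "driver": "storage-driver", "driver_version": "1.4"},
    {"server": "sjc-01", "server_model": "model-b", "os_version": "os-7.0",
     "driver": "net-driver", "driver_version": "2.1"},
]

KEY_FIELDS = ("server_model", "os_version", "driver", "driver_version")


def load_supported(csv_text):
    """Parse the exported list into a set of supported combinations."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[f] for f in KEY_FIELDS) for row in reader}


def incompatible_servers(inventory, supported):
    """Return servers with at least one driver combination not on the list."""
    flagged = {
        item["server"]
        for item in inventory
        if tuple(item[f] for f in KEY_FIELDS) not in supported
    }
    return sorted(flagged)


supported = load_supported(HCL_EXPORT)
print(incompatible_servers(INVENTORY, supported))
```

Even in this toy form, the script depends on someone exporting a fresh list and keeping the inventory current, which is the manual effort the paragraph above describes.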
Now we can see recommended drivers for each server right on the Intersight dashboard, and download them with a click. “The automatically generated HCL cuts the time to find out about compatibility issues by about 15%,” Makineni says.
Say a deployment or upgrade fails. Before we had Intersight, we’d first email the Cisco Technical Assistance Center (TAC) to find out whether the server had an active contract. If it did, we then had to generate log files to attach to the case, which took about 2-3 hours for a 7-node cluster.
Now we can see right on the Intersight dashboard whether each server is under contract, and open a case with a click, because Intersight connects to the TAC database. TAC automatically has access to the data they need, so we no longer have to generate or attach files to open a case. This feature is called Connected TAC.
We can now see our entire global inventory (HyperFlex clusters as well as UCS blade and rack servers) on a single pane of glass. The consolidated view is especially helpful when we discover a new security vulnerability because we can quickly see all affected servers of any type.
Our plans include:
To read additional Cisco IT business solution case studies, visit Cisco on Cisco: Inside Cisco IT.