Overview
This topic describes best practices for cluster management: maintaining controller health, keeping firmware consistent, and safeguarding data integrity during configuration, maintenance, and scaling operations. Following these practices helps prevent data loss and cluster instability.
- Cluster management best practices are guidelines to maintain controller health, firmware consistency, and data integrity.
- They include verifying controller health, ensuring firmware version uniformity, and following correct procedures for adding, moving, or decommissioning controllers.
- Proper adherence prevents data loss, cluster instability, and unsupported operations.
Key best practices for cluster management
- Always verify the health of all controllers before making any changes to the cluster. Confirm that every controller is fully fit and resolve any health issues before proceeding.
- Ensure that all controllers in the cluster run the same firmware version before adding, configuring, or clustering devices. Do not cluster controllers running different firmware versions.
- Maintain at least three active controllers in your cluster, and add standby controllers as needed. For scalability requirements, consult the Verified Scalability Guide to determine the required number of active controllers for your deployment.
- Ignore cluster information from controllers that are not currently active in the cluster, as their data may be inaccurate.
- Once you configure a cluster slot with a controller’s ChassisID, you must decommission that controller to make the slot available for reassignment.
- Wait for all ongoing firmware upgrades to complete and verify the cluster is fully fit before making additional changes.
- Before moving a controller, confirm that the cluster is healthy. Then shut down the controller you intend to move, physically move and reconnect it, and power it on. After the move, verify through the management interface that all controllers return to a fully fit state.
- Move only one controller at a time to maintain cluster stability.
- When transferring a controller to a different set of leaf switches, or to a different port on the same leaf switch, confirm that the cluster is healthy first. Decommission the controller before moving it, and recommission it after the move.
- Before configuring the cluster, confirm that all controllers run the same firmware version to prevent unsupported operations and cluster issues.
- Delete any unused out-of-band (OOB) EPGs associated with a controller. Assigning multiple EPGs to a controller is not supported and can cause the cluster workflow IP address to be overridden by policy.
- Log record objects are stored in only one shard, on a single controller. If you decommission or replace that controller, those logs are permanently lost.
- Decommissioning a controller deletes all fault, event, and audit log history stored on it. If you replace all controllers, all log history is lost. Before migrating a controller, manually back up its log history to prevent data loss.
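The pre-change checks in this list (every controller fully fit, a single firmware version, at least three active controllers, no firmware upgrade in progress) can be sketched as a small gate function. This is only an illustration under stated assumptions, not a vendor tool: the `Controller` record, its fields, and `pre_change_problems` are hypothetical names, and the inventory is assumed to have been gathered by hand or with whatever tooling you already use.

```python
from dataclasses import dataclass
from typing import List

MIN_ACTIVE = 3  # the guidance asks for at least three active controllers


@dataclass(frozen=True)
class Controller:
    """Hypothetical inventory record; not a real management API object."""
    name: str
    firmware: str
    fully_fit: bool
    active: bool = True       # False for a standby controller
    upgrading: bool = False   # True while a firmware upgrade is running


def pre_change_problems(controllers: List[Controller]) -> List[str]:
    """Return the reasons a cluster change should wait (empty list means go)."""
    problems = []

    # At least three controllers must be active.
    active_count = sum(1 for c in controllers if c.active)
    if active_count < MIN_ACTIVE:
        problems.append(f"only {active_count} active controllers, need {MIN_ACTIVE}")

    # Every controller must be fully fit, with no upgrade still running.
    for c in controllers:
        if not c.fully_fit:
            problems.append(f"{c.name} is not fully fit")
        if c.upgrading:
            problems.append(f"{c.name} has a firmware upgrade in progress")

    # All controllers must run the same firmware version.
    versions = sorted({c.firmware for c in controllers})
    if len(versions) > 1:
        problems.append("mixed firmware versions: " + ", ".join(versions))

    return problems


if __name__ == "__main__":
    # Placeholder names and version labels, for illustration only.
    cluster = [
        Controller("ctrl-1", "fw-A", fully_fit=True),
        Controller("ctrl-2", "fw-A", fully_fit=True),
        Controller("ctrl-3", "fw-B", fully_fit=False),
    ]
    for problem in pre_change_problems(cluster):
        print("blocked:", problem)
```

With the sample inventory, the function reports that `ctrl-3` is not fully fit and that firmware versions are mixed. Returning a list of reasons rather than a single pass/fail flag lets an operator resolve every issue before retrying the change.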