Prerequisites for RMA Process

For GR deployment, the node-monitor pods start automatically. During the RMA procedure, the node-monitor pod automatically shuts down the rack if a multi-compute failure is detected while the node is drained and deleted.

For more information on RMA (Return Merchandise Authorization), see the SMI Cluster RMA section in the Ultra Cloud Core Subscriber Microservices Infrastructure - Operations Guide.

Before starting the RMA process, perform the following steps:

  1. Switch the roles for both instances to the other rack using the geo switch-role command, and make sure the rack targeted for RMA is in the STANDBY_ERROR role for both instances.
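
     The exact CLI syntax is release-dependent; as an illustrative sketch (the role and instance-id values below are assumptions, not taken from this guide), switching both instances away from the rack to be serviced might look like:

     ```
     geo switch-role role standby_error instance-id 1
     geo switch-role role standby_error instance-id 2
     ```

     Confirm the resulting roles on both racks (for example, with a role display command such as show role, if available in your release) before continuing.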

  2. Disable the node-monitor pod.

    1. Take a backup of the daemonset.

      kubectl get daemonsets node-monitor -n cn -o yaml > node-monitor.yaml

    2. Delete the node-monitor daemonset. This also removes its pods.

      kubectl delete daemonsets node-monitor -n cn

  3. Continue with the RMA procedure. For more information, see the SMI Cluster RMA section in the Ultra Cloud Core Subscriber Microservices Infrastructure - Operations Guide.

  4. Once the RMA procedure is complete, check whether the node-monitor pods have already respawned.

    kubectl get pods -n cn -o wide | grep node-monitor

    If the node-monitor pods have not started, re-create the daemonset from the backup.

    kubectl create -f node-monitor.yaml
    Note

    The node-monitor.yaml file is the same backup file created in Step 2.a.
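
    The check-and-restore logic of this step can be sketched as a small shell function. This is a sketch, not part of the official procedure: the function reads the pod listing from stdin so the decision logic can be exercised without a live cluster, and it assumes node-monitor.yaml is the backup taken in Step 2.a.

    ```shell
    # Re-create the node-monitor daemonset only if no node-monitor pod
    # appears in the pod listing supplied on stdin.
    restore_if_missing() {
      if ! grep -q node-monitor; then
        # node-monitor.yaml is the backup from Step 2.a (assumed to be
        # in the current directory).
        kubectl create -f node-monitor.yaml
      fi
    }
    ```

    Against a live cluster, the function would be driven by the same listing command used above: `kubectl get pods -n cn -o wide | restore_if_missing`.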

  5. Correct the roles for the instances accordingly.
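
     As with step 1, the exact role-correction syntax is release-dependent; an assumed sketch (the role and instance-id values are illustrative, not taken from this guide) might look like:

     ```
     geo reset-role role standby instance-id 1
     geo reset-role role standby instance-id 2
     ```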

Note

For both earlier and current SMI versions:

  • If you are replacing hardware components that contain firmware, such as an mLOM card, during an RMA procedure, you must run the HUU (Host Upgrade Utility) before adding the repaired or replaced node back to the cluster, to ensure that the component is compatible with the system before the node is synced back into service.

  • As part of the RMA, if you remove a node from the cluster, you must purge all data on the device, per the instructions provided by the hardware vendor, before returning it to the manufacturer.