The documentation set for this product strives to use bias-free language. For the purposes of this documentation set, bias-free is defined as language that does not imply discrimination based on age, disability, gender, racial identity, ethnic identity, sexual orientation, socioeconomic status, and intersectionality. Exceptions may be present in the documentation due to language that is hardcoded in the user interfaces of the product software, language used based on RFP documentation, or language that is used by a referenced third-party product.
This chapter describes procedures used to troubleshoot the data migration feature in the Cisco MDS 9000 Family multilayer directors and fabric switches. This chapter contains the following sections:
Cisco MDS DMM is an intelligent software application that runs on the MSM-18/4 module or MDS 9222i switch. With Cisco MDS DMM, no rewiring or reconfiguration is required for the server, the existing storage, or the SAN fabric. The MSM-18/4 module or MDS 9222i switch can be located anywhere in the fabric, as Cisco MDS DMM operates across the SAN. Data migrations are enabled and disabled by software control from the Cisco Fabric Manager.
Cisco MDS DMM provides a graphical user interface (GUI) (integrated into Fabric Manager) for configuring and executing data migrations. Cisco MDS DMM also provides CLI commands for configuring data migrations and displaying information about data migration jobs.
You can avoid possible problems when using DMM if you follow these best practices:
The DMM feature includes the Array-Specific Library (ASL), which is a database of information about specific storage array products. DMM uses ASL to automatically correlate LUN maps between multipath port pairs.
Use the SLD CLI or GUI output to ensure that your storage devices are ASL classified.
For migration jobs involving active-passive arrays, use the SLD output to verify the mapping of active and passive LUNs to ports. Only ports with active LUNs should be included in migration jobs.
For more information about the SLD tool, refer to the “Checking the Storage ASL Status” section.
Cisco MDS DMM is designed to minimize the dependency on multiple organizations, and is designed to minimize service disruption. However, even with Cisco MDS DMM, data migration is a fairly complex activity. We recommend that you create a plan to ensure a smooth data migration.
Before creating a migration job with the DMM GUI, you need to ensure that server and storage ports are included in enclosures. You need to create enclosures for server ports. If the server has multiple single-port HBAs, all of these ports need to be included in one enclosure. Enclosures for existing and new storage ports are typically created automatically.
Restrictions and recommendations for DMM topology are described in the “DMM Topology Guidelines” section.
When creating a data migration job, you must include all possible server HBA ports that access the LUNs being migrated. This is because all writes to a migrated LUN need to be mirrored to the new storage until the cutover occurs, so that no data writes are lost.
For additional information about selecting ports for server-based jobs, see the “Ports in a Server-Based Job” section.
Each MSM-18/4 module or MDS 9222i switch with Cisco MDS DMM enabled requires a DMM license.
DMM licenses are described in the “Using DMM Software Licenses” section.
Begin troubleshooting DMM issues by checking the troubleshooting checklist in Table 5-1.
The following CLI commands on the MSM-18/4 module or MDS 9222i switch may be useful in troubleshooting DMM issues:
Note You need to connect to the MSM-18/4 module or MDS 9222i switch using the attach module command prior to using the show dmm commands.
This section covers the following topics:
Problems connecting the MSM-18/4 module or MDS 9222i switch can be caused by SSH, zoning, or routing configuration issues. Table 5-2 lists possible solutions.
- Enable SSH on the switch that hosts the MSM-18/4 module or MDS 9222i switch. See the "Configuring SSH on the Switch" section.
- If VSAN 1 default zoning is denied, ensure that the VSAN 1 interface (supervisor module) and the CPP IPFC interface have the same zoning. See the "Configuring IP Connectivity" section.
- Ensure that IPv4 routing is enabled. Use the ip routing command in configuration mode.
- Configure the default gateway for the CPP IPFC interface to be the VSAN 1 IP address. See the "Configuring IP Connectivity" section.
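As an illustrative sketch of the routing checks above (the interface numbers and addresses shown are placeholders, not values from this chapter), the commands are entered from the supervisor CLI:

```
switch# configure terminal
switch(config)# ip routing          ! enable IPv4 routing
switch(config)# exit
switch# show ip route               ! verify a route exists toward the CPP IPFC interface
```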
Table 5-3 lists possible solutions to problems connecting to the peer MSM-18/4 module or MDS 9222i switch.
- Configure a static route to the peer MSM-18/4 module or MDS 9222i switch. See the "Configuring IP Connectivity" section.
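A minimal sketch of adding such a static route (the subnet, mask, and next-hop address are placeholders; verify the exact syntax for your software release):

```
switch# configure terminal
switch(config)# ip route 10.1.2.0 255.255.255.0 10.1.1.1
```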
If the DMM SSH connection is generating too many timeout errors, you can change the SSL and SSH timeout values. These properties are stored in the Fabric Manager Server properties file (Cisco Systems/MDS 9000/conf/server.properties). You can edit this file with a text editor, or you can set the properties through the Fabric Manager Web Services GUI, under the Admin tab.
The following server properties are related to DMM:
If you need assistance with troubleshooting an issue, save the output from the relevant show commands.
You must connect to the MSM-18/4 module or MDS 9222i switch to execute DMM show commands. Use the attach module slot command to connect to the MSM-18/4 module or MDS 9222i switch.
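For example, a session that attaches to the module and runs the DMM show commands looks like the following sketch (the slot number and job ID are placeholders):

```
switch# attach module 4
module-4# show dmm job
module-4# show dmm job job-id 1 details
module-4# exit
```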
The show dmm job command provides useful information for troubleshooting DMM issues. For detailed information about using this command, see the “Cisco DMM CLI Commands” appendix.
Always save the output of the show dmm tech-support command into a file when reporting a DMM problem to the technical support organization.
Capture the output of the show tech-support fc-redirect command on all switches with FC-Redirect entries and save the output into a file.
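On the supervisor CLI, output redirection can save the capture directly to a file (the filename is illustrative; on the module console, capture your terminal session output instead if redirection is unavailable):

```
switch# show tech-support fc-redirect > bootflash:fcr-tech.txt
```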
This section describes the following scenarios:
A DMM storage-based zone can cause an active server-based job to fail.

A Method 2 job that is in progress goes to the Failed state if any changes are made to the zone entries that include the new storage (NS) port in the active zone set.

The workaround is to place the optional DMM zone for the particular host and new storage port into the active zone set before making the zone changes.
If a DMM job is configured and running in a dual fabric, a switch reboot will place the configured DMM job in reset mode and indicate one MSM-18/4 module or MDS 9222i switch as missing in Cisco Fabric Manager.
Even if the switch comes back up, the DMM job continues to indicate that one MSM-18/4 module or MDS 9222i switch is missing, because the rebooted switch no longer has the DMM job information. The DMM job cannot be deleted from Fabric Manager at this point.
You have to go to the CLI and explicitly enter the destroy command for that particular job ID to delete the job.
The exception to this rule is if the switch that was rebooted has the information on the DMM job. In such a scenario, Fabric Manager will function normally.
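A sketch of the CLI cleanup described above (the module slot and job ID are placeholders; verify the exact dmm command syntax for your release):

```
switch# attach module 4
module-4# show dmm job              ! note the ID of the stuck job
module-4# exit
switch# configure terminal
switch(config)# dmm module 4 job 2005 destroy
```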
The DMM feature cannot be disabled from the MSM-18/4 module or MDS 9222i switch once the grace period has expired.
Use the poweroff module command and purge the information.
The DMM GUI displays error messages to help you troubleshoot basic configuration mistakes when using the job creation wizards. A list of potential configuration errors is included after the last step in the task.
The following sections describe other issues that may occur during job creation:
If you make a configuration mistake while creating a job, the job creation wizard displays an error message to help you troubleshoot the problem. You need to correct your input before the wizard allows you to proceed.
Table 5-4 lists types of failures that may occur during job creation.
- Ensure that the fabric has an MSM-18/4 module or MDS 9222i switch with DMM enabled and a valid DMM license.
- Job infrastructure setup error. Possible causes are incorrect selection of server/storage port pairs, server and existing storage ports that are not zoned together, or IP connectivity between the MSM-18/4 modules or MDS 9222i switches that is not configured correctly. The exact error is displayed in the job activity log; see the "Opening the Job Error Log" section.
- Use the SLD command in the CLI to check that the LUNs are being discovered properly.
To open the job activity log, follow these steps:
Step 1 Drag the wizard window to expose the Data Migration Status command bar.
Step 2 Click the refresh button.
Step 3 Select the job that you are troubleshooting from the list of jobs.
Step 4 Click the Log command to retrieve the job error log.
Note You must retrieve the job activity log before deleting the job.
Step 5 The job information and error strings (if any) for each MSM-18/4 module or MDS 9222i switch are displayed.
Step 6 Click Cancel in the Wizard to delete the job.
If a time-bound license expires, note the following behavior:
If the MSM-18/4 module, the MDS 9222i switch, or the hosting switch restarts, all scheduled DMM jobs are placed in the Reset state. Use the Modify command to restore jobs to the Scheduled state.
To restore each job to the Scheduled state, follow these steps:
Step 1 Select the job to be verified from the job list in the Data Migration Status pane.
Step 2 Click the Modify button in the Data Migration Status tool bar.
You see the Reschedule Job pop-up window.
Step 3 The originally configured values for migration rate and schedule are displayed. Modify the values if required.
The job is automatically validated. If validation is successful, the job transitions into the scheduled state. If you selected the Now radio button, the job starts immediately.
Failures During Session Creation

This section helps you troubleshoot an error that occurs when the new storage is smaller than the existing storage. The DMM configuration wizard allows you to configure sessions for the data migration job and displays a default session configuration. If a session is marked in red, the session LUN in the new storage is smaller than the corresponding session LUN in the existing storage.

Although the LUN values displayed in the wizard appear identical, the displayed LUN size in gigabytes (GB) is rounded to three decimal places, so a small difference may not be visible.
The actual size of the LUNs can be verified using the show commands on the SSM CLI. To verify the size of the LUNs, follow these steps:
Step 1 Note the host pWWN, existing storage pWWN and the new storage pWWN as displayed on the wizard screen.
Step 2 Note the MSM-18/4 module or the MDS 9222i switch information displayed on the wizard screen.
Step 3 From the switch console, enter the attach module command to attach to the MSM-18/4 module or MDS 9222i switch console.
Step 4 Enter the show dmm job command from the SSM CLI to display the job information. The following example shows the job information:
Step 5 Enter the show dmm job job-id details command to display the job details.
Step 6 Look for server information in the output and note the VI pWWN corresponding to the host port. The following example shows server information:
Step 7 Using the storage pWWN and the VI pWWN, enter the show dmm job job-id storage tgt-pwwn vi-pwwn command to get the LUN information for the existing and new storage. The following example shows the output of the existing storage. Note the Max LBA and Size values.
The following example shows the output of the new storage. Note that the LBA and Size values are smaller than the comparable values in the existing storage.
Step 8 Correct the LUN size of the new storage so that it matches the LUN size of the existing storage, and then reconfigure the job.
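When comparing the Max LBA values from Step 7, the usable capacity follows from the standard SCSI relationship: capacity = (Max LBA + 1) × block size, since LBAs are zero-based. This sketch (assuming 512-byte blocks, the common case; the LBA values are hypothetical) shows why two LUNs can round to the same displayed GB value yet still fail the size check:

```python
def lun_size_bytes(max_lba, block_size=512):
    """Capacity implied by a LUN's Max LBA (LBAs are zero-based)."""
    return (max_lba + 1) * block_size

# Two hypothetical LUNs: the new one is 128 KiB smaller than the existing one,
# yet both display as the same size when rounded to three decimal places in GB.
existing = lun_size_bytes(20_971_519)   # exactly 10 GiB
new = lun_size_bytes(20_971_263)

print(existing - new)                          # difference in bytes
print(round(existing / 1e9, 3) == round(new / 1e9, 3))  # identical when displayed
```

Because the displayed values match, only the Max LBA comparison from the show dmm job output reveals that the new LUN is too small.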
This section helps you troubleshoot a failure of the job destroy command.
The following example shows the failure that may occur during job destruction:
If the job destroy command displays an error, there is a possibility that the job is still in progress and has not stopped. You can enter the job destroy command again to destroy the job completely.
If a failure occurs during the execution of a data migration job, DMM halts the migration job and the job is placed in the Failed or Reset state.
The data migration job must be validated before it is restarted. If the DMM job is in the Reset state, FC-Redirect entries are removed. In the DMM GUI, validation occurs automatically when you restart the job. In the CLI, the job must be in the Reset state before you can validate it; a job in the Failed state cannot be validated.
Note If a new port becomes active in the same zone where a migration job is in progress, DMM generates a warning message in the system logs.
Troubleshooting job execution failures is described in the following sections:
If DMM encounters an SSM I/O error to the storage, the job is placed in the Failed state. Table 5-5 lists possible solutions for jobs in the Failed state.
Table 5-6 lists possible causes and solutions for jobs in the Reset state.
If DMM encounters an error while running the job creation wizard, a popup window displays the error reason code. Error reason codes are also captured in the Job Activity Log. Table 5-7 provides a description of the error codes.