CiscoWorks Resource Manager Essentials

RME Advanced Troubleshooting Guide

Document ID: 111173

Updated: Jul 27, 2010

Introduction

This document provides information on how to troubleshoot the issues that you encounter in CiscoWorks Resource Manager Essentials (RME).

Prerequisites

Requirements

There are no specific requirements for this document.

Components Used

The information in this document is based on these software versions:

  • Resource Manager Essentials 4.1

  • Resource Manager Essentials 4.2

Conventions

Refer to Cisco Technical Tips Conventions for more information on document conventions.

RME Inventory Collection Failed

Inventory, or Inventory collection service (ICS), collects inventory data from the network devices and keeps the inventory updated. It is controlled by the ICServer daemon.

Solution

If you find any issues with inventory collection, any one of the solutions described in this section can help you resolve the problem.

RME Inventory collection fails on 4507 routers

If RME fails to collect inventory on 4507 routers, you see the SNMP timeout exception 'com.cisco.nm.lib.snmp.futureapi.SnmpReqTimeoutException' in the IC_Server.log file, which is available under NMSROOT\log in Windows and /var/adm/CSCOpx/logs in Solaris.

Note: NMSROOT is the CiscoWorks install directory. The default location on the Windows platform would be C:\Program Files\CSCOpx.

In order to resolve this issue, increase the SNMP timeout for this device.

RME Inventory collection fails on C6500 and 7600 routers

You see this exception in the IC_Server.log file: 'com.cisco.nm.rmeng.inventory.ics.server.InvDataProcessor,448,ASA Error -193: Primary key for table 'PhysicalElement' is not unique'

You see this exception when RME does not create PhysicalElement entries for ports, as it assumes that a port will not contain any other entities. The result is that the entities contained by this port will be added to the PhysicalElement table, but their parent port will not be added. The bug CSCso54489 (registered customers only) is available for this issue.

In order to fix this issue, either get the patch for the CSCso54489 bug from the Technical Support team or upgrade to LMS (LAN Management Solution) 3.2.

RME Inventory collection fails on VSS running 12.2(33)SXH

You see this exception in the IC_Server.log file: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

This occurs because the sysDescr format has changed in newer versions of Cisco IOS®. The bug CSCsv65621 (registered customers only) is available for this issue.

In order to fix this issue, either get the patch for the CSCsv65621 bug from the Technical Support team or upgrade to LMS 3.2.
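The three log signatures described in this section can be searched for in one pass. The following Python sketch is an illustration only: the function names and the dictionary layout are assumptions, not part of RME. Note that InvocationTargetException is a generic Java exception, so a match on it is only a hint that CSCsv65621 might apply.

```python
# Signatures quoted in this section, mapped to the suggested action.
# The names below are hypothetical helpers, not an RME interface.
KNOWN_SIGNATURES = {
    "SnmpReqTimeoutException":
        "Increase the SNMP timeout for the device.",
    "Primary key for table 'PhysicalElement' is not unique":
        "See CSCso54489: apply the patch or upgrade to LMS 3.2.",
    "java.lang.reflect.InvocationTargetException":
        "Possibly CSCsv65621: apply the patch or upgrade to LMS 3.2.",
}


def diagnose(log_lines):
    """Return (line_number, signature, action) for every matching log line."""
    findings = []
    for number, line in enumerate(log_lines, start=1):
        for signature, action in KNOWN_SIGNATURES.items():
            if signature in line:
                findings.append((number, signature, action))
    return findings


def diagnose_file(path):
    """Scan a log file such as IC_Server.log; undecodable bytes are replaced."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return diagnose(handle)
```

For example, diagnose_file can be pointed at the IC_Server.log file under the log directory for your platform, and each finding then indicates which subsection above to read first.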

RME Archive Management Failed

The Archive Management application maintains an active archive of the configuration of devices managed by RME. It provides the ability to fetch, archive, and deploy the device configurations. It is controlled by the ConfigMgmtServer daemon.

Solution

If you find any issues with archive management jobs, any one of the solutions described in this section can help you resolve the problem.

Time out and Retry Values

You can increase SNMP retry, SNMP timeout, Telnet timeout, or TFTP timeout values. In order to increase the values:

  1. Launch the CiscoWorks application.

  2. Invoke:

    RME Admin -> System Preferences -> RME Device Attributes

  3. Set the values for those protocols.

CiscoWorks server is in NAT

If the CiscoWorks server is inside the NAT environment, configure the NATted RME IP address.

Note: In order to confirm that the issue is with RME, connect manually from the CiscoWorks server to the devices for the failed jobs with the same protocol (Telnet/SSH). If you are able to connect, you can report the issue to the Technical Support team. If not, the issue resides with the device.

Archive job is partially successful because the VLAN fetch fails

You can see %Error opening flash:vlan.dat in dcmaservice.log, which is available under NMSROOT/log in Windows and /var/adm/CSCOpx/logs in Solaris.

This issue occurs when the vlan.dat file cannot be copied from the device to the server because the file is not available on the device. The result is that the archive job is partially successful.

In order to resolve this issue, create either an empty vlan.dat file or a dummy vlan.dat file with some data on the device, and recreate the job.

Sync Archive Job Hangs

SSHv2 operations might go into an infinite (or very long) loop while waiting for the result of an authentication. This causes SSHv2 operations to lock up, preventing other operations from completing, and the sync archive job hangs.

You can view the CSCsw88378 (registered customers only) bug for more details.

In order to fix this issue, either get the patch for the CSCsw88378 bug from the Technical Support team or upgrade to LMS 3.2.

Software Image Management (SWIM) Failed

The Software Management application automates the steps associated with upgrade planning, scheduling, downloading software images, and monitoring the network. It is not controlled by any daemon, and jobs run directly out of JRM.

Solution

If you find any issues with SWIM, any one of the solutions described in this section can help you resolve the problem.

TFTP (Trivial File Transfer Protocol) Issues

If the issue is with TFTP, you can follow these troubleshooting steps:

  1. Verify that the TFTP service is running properly. On the Solaris platform, run the /opt/CSCOpx/bin/mping -s tftp localhost command. On the Windows platform, ensure that the service is visible in the Control Panel.

  2. Verify that the 'tftp' directory has proper permissions. If not, provide full permissions.

  3. Check to see if there are any firewall blocks for the TFTP and UDP protocols or ports. If there are, remove them in order for SWIM to function properly.

  4. Verify that you can do a tftp copy manually. If it fails consistently, then check for network issues. In order to check for any network issues, start a manual tftp copy and issue the ping -t ip-add command in parallel using another console and check for any dropped packets. If any packets are dropped, you might have to resolve the network issues.

  5. If the issue persists, you can use a sniffer trace to analyze the issue further.
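As a complement to the manual copy in step 4, a single TFTP read request can show whether the server answers at all. The Python sketch below is an assumption-laden illustration, not a CiscoWorks tool: the function names are hypothetical, and the packet layout and opcodes follow RFC 1350. The probe never acknowledges the data, so the server simply times out the transfer on its side.

```python
import socket
import struct

TFTP_PORT = 69                          # well-known TFTP port (UDP)
OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5     # opcodes from RFC 1350


def build_rrq(filename, mode="octet"):
    """Build a read request: opcode 1, filename, NUL, mode, NUL."""
    return (struct.pack("!H", OP_RRQ)
            + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")


def describe_reply(packet):
    """Summarise the first reply packet from a TFTP server."""
    if len(packet) < 4:
        return "malformed reply"
    opcode, value = struct.unpack("!HH", packet[:4])
    if opcode == OP_DATA:
        return f"DATA block {value}, {len(packet) - 4} bytes"
    if opcode == OP_ERROR:
        message = packet[4:].split(b"\x00", 1)[0].decode("ascii", "replace")
        return f"ERROR {value}: {message}"
    return f"unexpected opcode {opcode}"


def probe_tftp(host, filename, timeout=5.0):
    """Send one read request and report the server's first reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_rrq(filename), (host, TFTP_PORT))
            packet, _ = sock.recvfrom(4 + 512)
        except socket.timeout:
            return "no reply (service down, or UDP blocked by a firewall)"
        except OSError as exc:
            return f"socket error: {exc}"
        return describe_reply(packet)
```

A DATA reply suggests that the service and the directory permissions are in order. An ERROR reply points to the file or its permissions, and no reply points to the service or to a firewall between the two hosts.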

RCP (Remote Copy Protocol) Issues

If you encounter any issues with RCP in SWIM, you can follow these troubleshooting steps:

  1. Verify that the RSH service is running.

  2. Verify that cwuser (or any other user) is configured correctly in Common Services > Server > Admin > System preferences.

  3. If you are using Solaris, verify that the user exists in the '.rhosts' file with cwuser:casusers permissions.

  4. Verify that RCP is configured on the device. For configuring RCP on CiscoWorks, refer to Enabling rcp.

  5. Verify that you can copy manually. If it fails consistently, then check for network issues. In order to check for any network issues, start a manual rcp copy and issue the ping -t ip-add command in parallel using another console and check for any dropped packets. If any packets are dropped, you might have to resolve the network issues.

  6. If you still encounter any issues with rcp, use a sniffer trace to analyze the issue further.

SCP (Secure Copy Protocol) Issues

Be aware of the prerequisites necessary for SCP to work well in RME:

  1. Ensure that you have enabled SSH on the device and have the credentials available in DCR.

  2. The SCP server should be enabled on the device.

  3. Note that SWIM acts as an SCP client (different from other protocols).

  4. For configuring SCP on CiscoWorks, refer to Enabling scp.

NAT (Network Address Translation) Issues

If you see connectivity issues in the swim_debug.log file, which is available under NMSROOT/log in Windows and /var/adm/CSCOpx/logs in Solaris, then the issue might be related to NAT. You can follow these steps in order to troubleshoot:

  1. SWIM uses the "RME ID" during image copy in the NAT environment, so make sure to configure the global RME ID.

  2. For a device already in RME before the NAT is set up or a group of devices which are outside the NAT boundary, set the RME ID for each device separately.

  3. In order to configure the RME ID, refer to Managing Devices When RME Server is Within a NAT Boundary.

  4. If the problem persists, use a sniffer trace to analyze the issue further.

Issues with Syslog

The Syslog Analyzer application, along with the syslog collector, allows you to centrally log and track syslog messages (error, exception, information, etc.) sent by devices in the network.

If you find any issues with syslog in RME, any one of the solutions described in this section can help you to resolve the problem.

Solution

Syslog service not running

You see this exception: java.lang.NoClassDefFoundError: org/apache/oro/text/regex/MalformedPatternException in syslog_debug.log, which is available under NMSROOT/log in Windows and /var/adm/CSCOpx/logs in Solaris.

This issue can be seen when the CNC agent installation removes jakarta-oro-2.0.6.jar.

In order to resolve this issue, copy the jakarta-oro-2.0.6.jar file to <NMSROOT>/lib/classpath and start the syslog service again.

SyslogCollector does not work

When the syslog collector stops collecting data, check these items:

  • Verify that the devices are configured to send the syslog with proper IP address.

  • Verify that the syslog collector service is running.

  • Verify that your message filters are configured properly.
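One way to exercise the checks above is to send a test message to the collector from another host. The Python sketch below assumes the collector listens on the standard syslog port, UDP 514, and builds a BSD-syslog style datagram in which the priority value is facility * 8 + severity. The function names are hypothetical. UDP gives no delivery confirmation, and the collector may discard messages from sources it does not manage, so always confirm on the collector side.

```python
import socket

SYSLOG_PORT = 514  # standard syslog port (UDP); an assumption for the collector


def build_syslog(facility, severity, text):
    """Build a '<PRI>text' datagram, where PRI = facility * 8 + severity."""
    if not (0 <= facility <= 23 and 0 <= severity <= 7):
        raise ValueError("facility must be 0-23 and severity 0-7")
    return f"<{facility * 8 + severity}>{text}".encode("ascii", "replace")


def send_test_syslog(collector_host, text, facility=23, severity=5):
    """Send one test message; facility 23 is local7, severity 5 is notice."""
    payload = build_syslog(facility, severity, text)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (collector_host, SYSLOG_PORT))
    return payload
```

If the test message does not appear, the collector service or the message filters are the next things to check. If it does appear, the configuration on the devices is the more likely cause.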

Syslog Analyzer cannot connect to Remote Syslog Collector

The connection issue might be seen if TCP ports 3333 and 4444, used by the syslog collector, are blocked.

In order to resolve the issue, you can release TCP ports 3333 and 4444 and restart the syslog service.
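Whether the two ports are reachable can be checked from the Syslog Analyzer host before and after the change. The following Python sketch is a generic TCP reachability test with hypothetical function names. It shows only that a connection can be opened, not that the collector itself is healthy.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_collector_ports(host, ports=(3333, 4444)):
    """Map each syslog collector port named above to its reachability."""
    return {port: tcp_port_open(host, port) for port in ports}
```

A result such as {3333: True, 4444: False} would suggest that only one of the two ports is still blocked or not listening.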

Syslog report shows as empty

If you see an empty report, try to configure the log rotation with the logrot tool provided by CiscoWorks.

Issues with Database

Before troubleshooting any RME database issues, familiarize yourself with the RME database details mentioned here:

  • RME Database name (dsn): rmeng

  • Daemon Management prefix (dmprefix): RME

  • Database login User Name: DBA

  • Database password: generated randomly at the time of RME installation. In order to change the database password, run the NMSROOT/bin/dbpasswd.pl utility.

Note: Usage of the NMSROOT/bin/dbpasswd.pl utility:

NMSROOT/bin/dbpasswd.pl {all | dsn=data source [opwd=old password]
[pfile=properties file] | listdsn}

Solution

RME DB not running

The RMEDbEngine process appears as down in the output of pdshow, or you see the java.io.IOException exception in dcmaservice.log.

This is because there is not enough space on the disk (the database has grown too large).

In order to resolve this issue, try to purge the old data or restore with known good LMS backup data. For more information, refer to Performing a Forced Purge.
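Free disk space on the volume that holds the database can be checked before and after a purge. The Python sketch below uses the standard library only. The function name and the 10 percent threshold are illustrative assumptions, not Cisco recommendations.

```python
import shutil


def free_space_report(path, min_free_fraction=0.10):
    """Return (free_bytes, free_fraction, ok) for the volume holding path.

    min_free_fraction is a hypothetical threshold; choose one that suits
    the size of your database.
    """
    usage = shutil.disk_usage(path)
    fraction = usage.free / usage.total if usage.total else 0.0
    return usage.free, fraction, fraction >= min_free_fraction
```

Call the function with the CiscoWorks install directory (NMSROOT) as the path. If ok is False, purge old data or restore from a backup as described above.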
