CiscoWorks Device Fault Manager (DFM) FAQ

Document ID: 110144

Updated: Apr 10, 2009

Introduction

This document answers the most frequently asked questions (FAQ) about the CiscoWorks Device Fault Manager (DFM).

Refer to Cisco Technical Tips Conventions for more information on document conventions.

General Questions

Q. Why is there a delayed response in DFM?

A. DFM depends on a third-party tool called Incharge Server. DFM sends the device details to the Incharge Server. The Incharge Server polls the device and sends the alert information back to the EPM database. The synchronization between DFM and the Incharge Server takes approximately 10 seconds. Therefore, a response delay of about 10 seconds is a known issue in DFM.

The Incharge Server performs these actions:

  • Polling every 4 minutes

  • Discovery of Devices

  • Trap forwarding to DFM

To make sure that the Incharge Server is working properly, enable an Ethereal trace or snoop from the CiscoWorks server to the device.

Q. How can DFM identify the interface duplex mode if it is not set?

A. If DFM cannot correctly determine the duplex mode (because it was not set manually nor was it set in the MIB), DFM sets DuplexSource to be assumed and performs the following:

  • If the interface is a 10-Mbps Ethernet interface, DFM assumes the setting is half duplex (DFM considers an interface to be 10-Mbps Ethernet when its Type="*ETHER*" and its MaxSpeed=10000000).

  • For all other interfaces, DFM assumes the setting is full duplex.
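The fallback rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not DFM code: the function name assumed_duplex and the sample Type value ETHERNETCSMACD are hypothetical, and the wildcard match simply mirrors the Type="*ETHER*" and MaxSpeed=10000000 conditions stated above.

```python
import fnmatch


def assumed_duplex(if_type: str, max_speed: int) -> str:
    """Return the duplex setting assumed when DuplexSource is 'assumed'.

    Hypothetical helper: mirrors the rule described in the FAQ, nothing more.
    """
    # 10-Mbps Ethernet: Type matches *ETHER* and MaxSpeed is exactly 10000000.
    is_10mb_ethernet = (
        fnmatch.fnmatchcase(if_type, "*ETHER*") and max_speed == 10_000_000
    )
    return "half" if is_10mb_ethernet else "full"


if __name__ == "__main__":
    print(assumed_duplex("ETHERNETCSMACD", 10_000_000))   # 10-Mbps Ethernet
    print(assumed_duplex("ETHERNETCSMACD", 100_000_000))  # faster Ethernet
```

Any interface that fails either condition, including non-Ethernet interfaces that happen to run at 10 Mbps, falls through to full duplex.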

Q. How does DFM detect the trunk or access port?

A. In DFM, by default all ports are set to access ports when they are discovered. DFM labels the PortType as Access if that port is connected to a system interface.

For DFM to detect a trunk port, the device that has the trunk port (for example, A) and the neighbor device that has the trunk port (for example, B) must be connected and must form a trunk. Both devices must then be added to DFM.

If the port uses a Gigabit Interface Converter, which allows interconnectivity between switches on a Gigabit Ethernet port in the device, Smarts cannot build a one-to-one connection between ports, because one Gigabit Interface port might be connected to multiple ports on other switches. Therefore, even though the ports might be trunk ports, they are labelled as Access ports in DFM.

In order to verify the PortType in Incharge, check the NeighboringSystems attribute under the port. This attribute shows you whether the port is connected to a switch, router, or host interface. If the discovered port has no connection, then the default PortType is Access.

Issue this command from dmctl to find the NeighboringSystems attribute:

dmctl> get Port::PORT

Q. What is the difference between SNMP Raw Trap Forwarding and SNMP Trap alert/event Trap Forwarding? Does DFM support both?

A. You can configure raw trap forwarding at DFM > Other configuration > SNMP Trap forwarding, and processed event/alert trap forwarding at DFM > Notification Services > SNMP Trap Forwarding.

A processed trap is a trap that DFM has analyzed. When DFM receives certain SNMP traps, it analyzes the data found in the fields (Enterprise, Generic trap identifier, Specific trap identifier, and variable-bindings) of each SNMP trap message and, if required, changes the property value of the object property. A raw trap is a trap that the device forwards to DFM and that DFM has not yet processed.

For more information, refer to the DFM User Guide.

Yes, DFM supports both ways of trap forwarding.

Q. How do I generate Port/Interface Flapping event in DFM?

A. DFM generates a Port/Interface flapping event when a certain number of linkUp and linkDown traps are received for the port or interface within a set time interval.

By default, the trap count is 3 and the time interval is 300 seconds.
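One way to picture the trap count and time interval is as a sliding window. The sketch below is an assumption about how such a threshold could be evaluated, not a description of DFM internals: the class name FlapDetector, its method, and the sliding-window interpretation are all hypothetical. Only the defaults (3 traps, 300 seconds) come from the text above.

```python
from collections import deque


class FlapDetector:
    """Hypothetical sliding-window counter for linkUp/linkDown traps."""

    def __init__(self, trap_count: int = 3, window_seconds: float = 300):
        self.trap_count = trap_count          # default from the FAQ: 3
        self.window_seconds = window_seconds  # default from the FAQ: 300
        self._times = deque()

    def record_trap(self, timestamp: float) -> bool:
        """Record one link trap; return True if the threshold is reached."""
        self._times.append(timestamp)
        # Drop traps that are older than the window.
        while timestamp - self._times[0] > self.window_seconds:
            self._times.popleft()
        return len(self._times) >= self.trap_count


if __name__ == "__main__":
    detector = FlapDetector()
    for t in (0, 100, 200):
        print(t, detector.record_trap(t))
```

Under this interpretation, three traps spread over more than 300 seconds do not reach the threshold, because the oldest trap ages out of the window before the third one arrives.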

Q. After a reboot of the machine, all the custom Polling and Threshold Manager (PTM) settings and Interface management settings are lost. Why?

A. Before you reboot, issue the net stop crmdmgtd command. This forces all of the changes to be persisted in the DFM repository files. Simply rebooting can result in lost data because Windows does not wait long enough for dmgtd to stop before cycling the machine.

Q. Why do the selected devices (in search results) not appear as selected in groups other than the All Devices group?

A. This is the default behavior of HOSTree.

The device is selected only in the All Devices list, and not in every place where the device is listed.

Q. Why do devices in the Device Credential Repository (DCR) not get added to DFM in Access Control Server (ACS) mode?

A. For devices to be added to DFM in ACS mode, you need to create a Network Device Group in the ACS server and add to it all the devices that you want DFM to manage.

These are the steps to add devices in DFM in ACS mode:

  1. By default, Network Device Groups are not displayed under Network Configurations. To display them:

    • Go to Interface Configurations > Advanced Options.

    • Select Network Device Groups.

  2. Select Network Configurations > Add Entry > name the group (for example, DFM_Devices). A new group named DFM_Devices is shown now.

  3. Click DFM_Devices, and add an entry for the devices under DFM_Devices AAA client.

  4. Add an entry for the AAA server for DFM_Devices (the default is the localhost IP).

  5. Click SUBMIT+RESTART.

  6. Add those devices to DFM through DCR. Now the devices are managed by DFM.

Q. Which Management Information Bases (MIBs) does DFM poll to get card status?

A. In order to monitor the cards, DFM polls these MIBs:

  • EntityFRU

  • OLD-CISCO-CHASSIS-MIB

  • CISCO-STACK-MIB

Troubleshooting

Q. How do I troubleshoot if DFM polls the device with an old Simple Network Management Protocol (SNMP) Read Only community string?

A. This happens when someone changes or removes the SNMP Read Only (RO) community string on a device while DFM is managing it. DFM continues to poll the device with the old RO community string. Because the community string is wrong, the device is stuck at 10% or it goes to the Questioned state. You might see this message in the output of the show log command on the device:

 % SNMP-3-AUTHFAIL: Authentication failure for SNMP req from host 10.3.223.102
 % SNMP-3-AUTHFAIL: Authentication failure for SNMP req from host 10.3.223.102
 % SNMP-3-AUTHFAIL: Authentication failure for SNMP req from host 10.3.223.102

Note: 10.3.223.102 is the CiscoWorks server IP address.
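When the device log is long, it can help to count the authentication failures per source host to confirm that they come from the CiscoWorks server. The following is a minimal sketch that assumes the log text has been saved or pasted into a string; the function name count_authfail_hosts is hypothetical, and the pattern only covers the message format shown above.

```python
import re
from collections import Counter

# Matches the message shown above; \s+ tolerates a line wrap before the address.
AUTHFAIL = re.compile(
    r"SNMP-3-AUTHFAIL: Authentication failure for SNMP req from host\s+"
    r"(\d{1,3}(?:\.\d{1,3}){3})"
)


def count_authfail_hosts(log_text: str) -> Counter:
    """Count SNMP authentication failures per source IP address."""
    return Counter(AUTHFAIL.findall(log_text))


if __name__ == "__main__":
    sample = (
        "% SNMP-3-AUTHFAIL: Authentication failure for SNMP req from host 10.3.223.102\n"
        "% SNMP-3-AUTHFAIL: Authentication failure for SNMP req from host 10.3.223.102\n"
    )
    for host, count in count_authfail_hosts(sample).items():
        print(host, count)
```

If the address with the highest count is the CiscoWorks server, the stale community string described above is a likely cause.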

This error occurs because the DFM.rps file is corrupted. In order to troubleshoot this, complete these steps:

  1. Delete the DFM.rps and DFM1.rps files from NMSRoot/objects/smarts/local/repos/icf.

  2. Re-initialize the DFM database.

  3. Re-add the device from DCR into DFM and rediscover the device.

Note: NMSRoot is the CiscoWorks installed location.

Q. DFM daemons are not fully up or down and most of the UIs are throwing errors. Why?

A. Check the output of pdshow and netstat -a -n -o -b for any port conflict.

Also, check whether ekrn.exe (a security-related application) is running, because it might interfere with LMS. If so, stop this service and restart the CiscoWorks daemons. This can resolve the issue.

Q. In DFM, all the devices are in the known state, but when a device is selected in the Detailed Device View (DDV) page, DFM displays the objects (for example, power supply, fan, interface) and their instance names but says that the status is not available. Also, there are no alerts from any device. Why?

A. Check whether monitoring is disabled in the Polling and Threshold UI for all groups or for the problematic group.

If so, enable monitoring and apply the changes. You can then see the values in DDV and the alerts in AAD.

Q. Why does the Network Error (tcp_error) occur while launching the Fault History report for Group/Device Filtering or Alert/Event ID Filtering?

A. When you launch the Fault History report for Group/Device Filtering or Alert/Event ID Filtering, the Fault History report page might show a white screen or this error:

"Network Error (tcp_error) A communication error occurred: The Web Server may be down, too busy, or experiencing other problems preventing it from responding to requests. You may wish to try again at a later time. For assistance, contact your network support team."

If so, check whether the HP Insight Server and HPInsight NIC Agent software is installed. If it is, complete these steps:

  1. Go to Control Panel > Administrative Tools > Services and stop the services for this software (HP Insight Server and HPInsight NIC Agent).

  2. Stop and start the DFM dfmFh Database Engine service.

  3. Stop and start the FHServer and FHDbEngine processes by issuing these commands from the CLI:

    C:> pdterm FHServer FHDbEngine
    
    C:> pdexec FHServer FHDbEngine
    
  4. Launch the Fault History Report.

Q. Why does the server record the ID 16640 Error at the same time in Event Viewer?

A. The "system logging facility" on Windows is the Event Logger. The Windows interface to the Event Logger requires you to specify an Event ID. Smarts defines the Event ID EVLOG_GENERIC (which has a value of 16640). An Event ID has a string that contains descriptive text and places to substitute arguments in it. EVLOG_GENERIC specifies a single argument that substitutes for the entire message. What Smarts substitutes is the entire, formatted text of the message, just as it would appear in the log file.

16640 events from Smarts in Windows are general. Basically, any error logged by DFM which is of high severity will also be logged to the Windows event viewer with code 16640. The error message is about an inconsistency in the topology being corrected during auto discovery.

During processing, a call was made to “oczcb110”, which at that time was not in the topology (resulting in "call on a NULL repository object"). Furthermore, the same object was being managed by two separate DFM domain managers, which resulted in the same event being logged twice at the same time.

These events are of interest only if they occur at a time when there are observable symptoms to investigate in relation to a specific issue. As with any log file, they are useful when you investigate certain behavior but are not usually of interest by themselves.

Q. Why is HTTP 500 thrown in DDV when the SNMP location or any string contains any reserved (special) characters?

A. This occurs when the device SNMP location string is configured and contains reserved characters (for example, the pipe (|) or caret (^) character).

Reason: The internal logic uses the "|" and "^" characters for parsing. If a string value contains either of these characters, the user might encounter this 500 error.

Solution:

Do not use the special characters "|" and "^" in any of the device values.
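Before you configure an SNMP location or a similar string on a device, you can check the value for the two characters named above. This is a small illustrative sketch; the function name find_reserved_chars and the sample strings are hypothetical, and the check covers only "|" and "^" because those are the only characters this FAQ names.

```python
RESERVED_CHARS = ("|", "^")  # the two characters named in this FAQ


def find_reserved_chars(value: str) -> list:
    """Return the reserved characters that occur in value, without repeats."""
    return sorted({ch for ch in value if ch in RESERVED_CHARS})


if __name__ == "__main__":
    for candidate in ("Bldg 5 | Rack ^2", "Bldg 5, Rack 2"):
        found = find_reserved_chars(candidate)
        status = "contains " + " ".join(found) if found else "ok"
        print(f"{candidate!r}: {status}")
```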

Device Discovery

Q. What is the meaning of different discovery percentages?

  • 10% - Startup. Devices have not yet been handed over to the Incharge processes.

  • 40% - Devices have been successfully handed over to the Incharge processes.

  • 70% - The Incharge processes have successfully discovered the devices and handed the information over to Cisco code.

  • 90% - Discovered devices need to be placed in appropriate groups. Device information has been sent to OGS, and group information from OGS is awaited.
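If you script checks around discovery status, the percentages above can be kept in a simple lookup table. The sketch below is only a convenience mapping of the four values listed in this FAQ; the names DISCOVERY_STAGES and describe_discovery are hypothetical, and any other percentage is reported as not listed rather than guessed.

```python
# Stage descriptions taken from the list above.
DISCOVERY_STAGES = {
    10: "Startup; devices not yet handed over to the Incharge processes",
    40: "Devices handed over to the Incharge processes",
    70: "Incharge processes discovered the devices and passed the information to Cisco code",
    90: "Device information sent to OGS; waiting for group information from OGS",
}


def describe_discovery(percent: int) -> str:
    """Return the stage description for a discovery percentage."""
    return DISCOVERY_STAGES.get(percent, "No description listed for this percentage")


if __name__ == "__main__":
    for value in (10, 40, 70, 90):
        print(f"{value}% - {describe_discovery(value)}")
```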

Q. How can I delete the device from DFM if the device is in pending state for a long time?

A. In DFM, if you delete a device while it is in the learning state, it moves to the pending state and sometimes is not deleted.

In order to delete a device that has been in the pending state for a long time, issue these commands from the CLI:

C:>pdterm InventoryCollector InventoryCollector1 TISServer DFMOGSServer
C:>pdexec InventoryCollector InventoryCollector1 TISServer DFMOGSServer Interactor Interactor1 NOSServer PTMServer

Q. Is there a way to unmanage IP addresses of interfaces when all devices are inserted into DFM?

A. No, DFM does not unmanage the IP addresses when devices are inserted into DFM.

In order to unmanage the IP addresses for a device, complete these steps:

  1. Go to DFM > Device Management > Device Details. Select the device to view. In the tree in the left pane of the DDV window, the device contains IP under Interface.

  2. Select the IP. The right pane shows the IP address and managed state values. Change the managed state value to False.

CLI Commands

Q. What is the command to know the SNMP Time Out and SNMP Retries values from CLI?

A. Issue these commands:

NMSRoot/objects/smarts/bin/dmctl -s DFM get ICF_TopologyManager::ICF-TopologyManager::defaultTimeout

!--- For SNMP Time Out

28000

NMSRoot/objects/smarts/bin/dmctl -s DFM get ICF_TopologyManager::ICF-TopologyManager::defaultRetries

!--- For SNMP Retries

4

Note: NMSRoot is the CiscoWorks installed location. The default location is C:\Program Files\CSCOpx in Windows and /opt/CSCOpx in Solaris.

Q. What is the command to get the underlying MAC address and interface name for an IP component?

A. LayeredOver gives you the details of the MAC address and Interface Name for an IP.

This is an example:

dmctl> get IP::IP-10.77.209.134::LayeredOver
{MAC::MAC-00-1A-A1-48-83-D6 Interface::IF-10.77.209.134/1}
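If you capture the dmctl output in a script, the braces-and-spaces format shown above is easy to split into class and instance pairs. The following is a minimal sketch that assumes the output always has the Class::Instance form shown in this one example; the function name parse_layered_over is hypothetical.

```python
import re


def parse_layered_over(output: str) -> dict:
    """Map each class name (for example MAC, Interface) to its instance name."""
    # Each entry has the form Class::Instance, separated by whitespace inside {}.
    pairs = re.findall(r"(\w+)::([^\s{}]+)", output)
    return {class_name: instance for class_name, instance in pairs}


if __name__ == "__main__":
    sample = "{MAC::MAC-00-1A-A1-48-83-D6 Interface::IF-10.77.209.134/1}"
    for class_name, instance in parse_layered_over(sample).items():
        print(class_name, instance)
```

For the example output, the MAC entry maps to MAC-00-1A-A1-48-83-D6 and the Interface entry maps to IF-10.77.209.134/1.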

Q. How do I rediscover devices in DFM through the CLI?

A. You can use this command to rediscover all devices:

dmctl -s DFM invoke ICF_TopologyManager::ICF-TopologyManager discoverAll

For specified devices, use the following command:

dmctl -s DFM invoke ICF_TopologyManager::ICF-TopologyManager ::

Q. How can I manage and unmanage Port/Interface from CLI?

A. Use this syntax:

dmctl -s=DFM invoke class::instance op [arg1 ...]

For example:

dmctl> invoke Port::PORT-5.1.2.2/10123 manage

dmctl> invoke Interface::IF-5.1.3.2/1 manage

dmctl> invoke Port::PORT-5.1.2.2/10123 unmanage
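To manage or unmanage many ports from a script, you can build the argument list from the syntax shown above. This sketch only assembles the arguments and does not run dmctl; the function name build_invoke_args is hypothetical, and it assumes that dmctl is on the path and that the -s=DFM form shown above is accepted on your server.

```python
def build_invoke_args(class_name: str, instance: str, operation: str,
                      domain: str = "DFM") -> list:
    """Build the argument list for: dmctl -s=DOMAIN invoke class::instance op."""
    return ["dmctl", f"-s={domain}", "invoke", f"{class_name}::{instance}", operation]


if __name__ == "__main__":
    # Instance names are the examples used in this FAQ.
    for instance in ("PORT-5.1.2.2/10123",):
        print(" ".join(build_invoke_args("Port", instance, "unmanage")))
```

The resulting list can be passed to a process launcher such as subprocess.run on the CiscoWorks server once you have verified the command by hand.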

Q. How are devices imported into DCR through CLI?

A. Run the dcrcli command with the CiscoWorks username as an argument.

dcrcli -u admin

You are prompted for a password. Enter the CiscoWorks password.

At the dcrcli prompt, issue this command:

dcrcli>impFile fn="path to csv import file" ft=csv

Q. How do you collect a MIB walk for a device?

A. Complete these steps:

  1. Navigate to <NMSROOT>/objects/smarts/bin.

  2. Enter the command. On Solaris: ./sm_snmpwalk --community=<community_string> <deviceIp>. On Windows: sm_snmpwalk --community=<community_string> <deviceIp>.

    This is an example:

    ./sm_snmpwalk --community=cisco 4.1.1.1
    
    !--- For Solaris
    
    
    sm_snmpwalk --community=cisco 4.1.1.1
    
    !--- For Windows
    
    
  3. The command generates three files in the same location (for example, <NMSROOT>/objects/smarts/bin): xxxxx.walk, xxxxx.mimic, and xxxxx.snap, where xxxxx is the device IP address.

  4. You might want to zip the three generated files.
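The zip step can be scripted. The sketch below assumes that the three files are named after the device IP address with the .walk, .mimic, and .snap extensions, as described in the steps above; the function name zip_walk_files and the archive name are hypothetical.

```python
import zipfile
from pathlib import Path


def zip_walk_files(directory, device_ip: str, archive_name: str = None) -> Path:
    """Zip <device_ip>.walk, .mimic and .snap found in directory."""
    directory = Path(directory)
    names = [f"{device_ip}.{ext}" for ext in ("walk", "mimic", "snap")]
    archive = directory / (archive_name or f"{device_ip}_mibwalk.zip")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            # Raises FileNotFoundError if a file is missing.
            zf.write(directory / name, arcname=name)
    return archive


if __name__ == "__main__":
    import sys
    # Usage: python zip_walk.py <directory> <device_ip>
    if len(sys.argv) == 3:
        print(zip_walk_files(sys.argv[1], sys.argv[2]))
```

The function stops with an error if any of the three files is missing, which is a quick way to notice an incomplete walk.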

Q. What are the environment variables that are set in Windows?

A. This is an example of the environment variables that are set:

LM_LICENSE_FILE=C:\PROGRA~1\CSCOpx\objects\smarts\conf\trial.dat.DFM
LOGONSERVER=dfm-pc1
SM_BROKER=dfm-pc1:9002
SM_HOME=C:\PROGRA~1\CSCOpx\objects\smarts
SM_RULESET_PATH=C:\PROGRA~1\CSCOpx\objects\smarts\rules
SM_SITEMOD=C:\PROGRA~1\CSCOpx\objects\smarts\local;C:\PROGRA~1\CSCOpx\objects\smarts
SM_SNMP_BUG_COMPATIBLE=1
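To compare a server against the list above, a short script can report which variables are not set. This is a sketch under stated assumptions: the function names missing_vars and parse_broker are hypothetical, the list of names is taken from the example above (LOGONSERVER is left out because it is a general Windows variable), and the host:port reading of SM_BROKER is inferred from the example value.

```python
import os

# Variable names taken from the example list in this FAQ.
EXPECTED_VARS = (
    "LM_LICENSE_FILE",
    "SM_BROKER",
    "SM_HOME",
    "SM_RULESET_PATH",
    "SM_SITEMOD",
    "SM_SNMP_BUG_COMPATIBLE",
)


def missing_vars(env=None) -> list:
    """Return the expected variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name)]


def parse_broker(value: str) -> tuple:
    """Split a host:port value such as dfm-pc1:9002 (assumed format)."""
    host, _, port = value.rpartition(":")
    return host, int(port)


if __name__ == "__main__":
    print("Missing:", missing_vars())
```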

Limitations

Q. Does DFM support a Network Address Translation (NAT) environment?

A. No. Use of DFM in a NAT environment is not supported and not recommended.

Q. Can I install HP OpenView or Netview on one drive (for example, C:) and DFM HPOV/Netview Adapters on another drive (for example, D:)?

A. No. It is recommended to install both HPOV/Netview and the DFM-HPOV/Netview adapters on the same drive.
