Cisco ACI Software Advisory Notice for CSCus39019

First Published: January 9, 2015

Last Updated: January 12, 2015

 

Dear Cisco Customer,

Cisco engineering has identified a software issue with the software release detailed below.

Affected Software

Releases earlier than 1.0(3).
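
As a rough illustration of the "earlier than 1.0(3)" rule, the following Python sketch compares a release string against the fixed release. It assumes release strings of the form major.minor(maintenance plus optional letters); the function names and the sample strings are illustrative only and are not taken from this advisory.

```python
import re

# Assumed release string format: <major>.<minor>(<maintenance><optional letters>)
_RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")

# The advisory states that the fix is in 1.0(3) and later images.
FIXED_IN = (1, 0, 3)


def parse_release(release):
    """Return (major, minor, maintenance) as integers; the letter suffix is ignored."""
    match = _RELEASE_RE.match(release.strip())
    if not match:
        raise ValueError(f"unrecognized release string: {release!r}")
    return int(match.group(1)), int(match.group(2)), int(match.group(3))


def is_affected(release):
    """True if the release is earlier than the first fixed release."""
    return parse_release(release) < FIXED_IN


if __name__ == "__main__":
    # Sample strings chosen only to exercise the comparison.
    for sample in ("1.0(1e)", "1.0(2j)", "1.0(3f)"):
        print(sample, "affected" if is_affected(sample) else "not affected")
```

The letter suffix is ignored on purpose, because the advisory distinguishes releases only down to the maintenance number.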

Affected Platforms

Cisco ACI-mode switches

Symptom

The event manager (eventmgr) process on a spine switch crashes and, after it restarts, consumes a high percentage of CPU resources. A fault is generated for the event-mgr core. Traffic is not affected, but the APIC GUI and the spine switch CLI can become unresponsive.

Reason for Advisory

CDETS number CSCus39019  (Internal MO leak in eventmgr store on switch)

Description

When faultable MOs are created, internal objects are created in the eventmgr data store. If faultable MOs are created without faults, the internal objects are not released properly. On a long-running system with new MOs being constantly created and deleted (for example, MOs representing results of diagnostic tests running in the background), the eventmgr data store can eventually fill up, rendering eventmgr inoperable. The common symptom is an eventmgr crash followed by a restart of eventmgr in a continuously busy state (consuming a high percentage of CPU resources).

A user with administrative privileges can monitor the current size of the data store using the following switch CLI command:

ls -l /dev/shm/lpssmu/ifc_eventmgr-1_ud1

An eventmgr crash can occur when the data store file size reaches approximately 1 GB.
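
The size check can also be scripted. The following Python sketch reads the same file size that ls -l reports and classifies it against the approximate 1 GB crash size. The 75 percent warning level, the reading of 1 GB as 1024^3 bytes, and the function names are assumptions made for illustration; the script is not a Cisco-provided tool.

```python
import os

# Path taken from the advisory.
DATA_STORE_PATH = "/dev/shm/lpssmu/ifc_eventmgr-1_ud1"
# "Approximately 1 GB" per the advisory; 1024**3 bytes is an assumption.
CRASH_SIZE_BYTES = 1024 ** 3
# Hypothetical early-warning level, not from the advisory.
WARN_FRACTION = 0.75


def classify_size(size_bytes, crash_size=CRASH_SIZE_BYTES, warn_fraction=WARN_FRACTION):
    """Return 'ok', 'warning', or 'critical' for a data store size in bytes."""
    if size_bytes >= crash_size:
        return "critical"
    if size_bytes >= warn_fraction * crash_size:
        return "warning"
    return "ok"


def check_data_store(path=DATA_STORE_PATH):
    """Return (size_in_bytes, status) for the data store file."""
    size = os.stat(path).st_size
    return size, classify_size(size)


if __name__ == "__main__":
    try:
        size, status = check_data_store()
        print(f"{DATA_STORE_PATH}: {size} bytes ({status})")
    except OSError as exc:
        # The file exists only on the switch; elsewhere this branch runs.
        print(f"cannot read {DATA_STORE_PATH}: {exc}")
```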

Workaround

Reduce the frequency of the diagnostics tests on spine switches. This can be done using either the API or CLI as shown in the following examples:

API

HTTP POST to URL: https://<APIC IP address or hostname>/api/node/mo/uni/fabric.xml

<monFabricPol name="default">
  <monFabricTarget scope="eqptFC">
    <eqptdiagpSpTsHlFc name="default" freq="every1day" />
  </monFabricTarget>
  <monFabricTarget scope="eqptLC">
    <eqptdiagpSpTsHlLc name="default" freq="every1day" />
  </monFabricTarget>
  <monFabricTarget scope="eqptSupC">
    <eqptdiagpSpTsHlSc name="default" freq="every1day" />
  </monFabricTarget>
  <monFabricTarget scope="eqptSysC">
    <eqptdiagpSpTsHlScc name="default" freq="every1day" />
  </monFabricTarget>
</monFabricPol>
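
For readers who prefer to generate the payload rather than paste it, the following Python sketch builds the same XML document and prepares (but does not send) the POST request. The host name apic.example.com and the function names are hypothetical. Authentication to the APIC is not covered here and must be handled separately before the request is sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

# (target scope, diagnostic policy class) pairs copied from the advisory payload.
DIAG_POLICIES = [
    ("eqptFC", "eqptdiagpSpTsHlFc"),
    ("eqptLC", "eqptdiagpSpTsHlLc"),
    ("eqptSupC", "eqptdiagpSpTsHlSc"),
    ("eqptSysC", "eqptdiagpSpTsHlScc"),
]


def build_payload(policy_name="default", freq="every1day"):
    """Build the monFabricPol XML document as a string."""
    root = ET.Element("monFabricPol", name=policy_name)
    for scope, policy_class in DIAG_POLICIES:
        target = ET.SubElement(root, "monFabricTarget", scope=scope)
        ET.SubElement(target, policy_class, name="default", freq=freq)
    return ET.tostring(root, encoding="unicode")


def build_request(apic_host, payload):
    """Prepare the POST request; the caller is responsible for login and sending."""
    url = f"https://{apic_host}/api/node/mo/uni/fabric.xml"
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/xml"},
    )


if __name__ == "__main__":
    payload = build_payload()
    request = build_request("apic.example.com", payload)  # hypothetical host
    print(request.get_method(), request.full_url)
    print(payload)
```

Keeping the scope and class pairs in one list makes it harder to miss one of the four module types when the payload is adapted for a custom monitoring policy.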

CLI

Log in as an administrator and enter the following commands:

switch# cd /aci/fabric/fabric-policies/monitoring-policies/monitoring-policy-default/diagnostics-policies/

switch# cd line-module-\(eqpt.lc\)/eqptdiagp-sptshllc-default

switch# moset health-diag-test-frequency every-1-day

switch# cd ../..

switch# cd supervisor-module-\(eqpt.supc\)/eqptdiagp-sptshlsc-default

switch# moset health-diag-test-frequency every-1-day

switch# cd ../..

switch# cd fabric-module-\(eqpt.fc\)/eqptdiagp-sptshlfc-default/

switch# moset health-diag-test-frequency every-1-day

switch# cd ../..

switch# cd system-controller-module-\(eqpt.sysc\)/eqptdiagp-sptshlscc-default/

switch# moset health-diag-test-frequency every-1-day

switch# cd ../..

switch# moconfig commit

Validation

Before the workaround is applied, the size of the file /dev/shm/lpssmu/ifc_eventmgr-1_ud1 grows periodically. After the workaround is applied, the file size either stops changing or grows much more slowly than before.
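
One way to quantify the validation step is to record the file size at two points in time and compare growth rates before and after the workaround. The following Python sketch shows the arithmetic. The sample sizes are made-up numbers, the function names are hypothetical, and the 1024^3-byte limit is an assumed reading of "approximately 1 GB".

```python
# Assumed interpretation of the "approximately 1 GB" crash size.
LIMIT_BYTES = 1024 ** 3


def growth_rate(samples):
    """Average growth in bytes per second between the first and last sample.

    samples is a list of (time_in_seconds, size_in_bytes) tuples.
    """
    (t0, s0), (t1, s1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (s1 - s0) / (t1 - t0)


def seconds_until_limit(size_now, rate, limit=LIMIT_BYTES):
    """Seconds until the limit is reached at a constant rate, or None if not growing."""
    if rate <= 0:
        return None
    return max(limit - size_now, 0) / rate


if __name__ == "__main__":
    # Made-up measurements taken one hour apart, for illustration only.
    before = [(0, 100_000_000), (3600, 136_000_000)]
    after = [(0, 200_000_000), (3600, 200_360_000)]
    for label, samples in (("before", before), ("after", after)):
        rate = growth_rate(samples)
        remaining = seconds_until_limit(samples[-1][1], rate)
        print(f"{label}: {rate:.1f} bytes/s, about {remaining / 86400:.1f} days to limit")
```

A rate that drops sharply after the workaround, or falls to zero, matches the behavior described above.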

Notes

1.     If the eventmgr process on a spine switch has already restarted and is consuming close to 100% of a CPU core, perform a supervisor switchover to reduce the size of the data store file. On the switch, enter the reload module module-number command, where module-number is the number of the active supervisor. After the switchover, apply the workaround.

2.     For 1.0(1x) images, the workaround must be reapplied after every reload of the spine switch.

3.     The workaround documented here is for the default monitoring policy. If a custom monitoring policy is configured for diagnostic tests, the corresponding workaround for such custom policies should be applied as well.

4.     A fix for this bug will be available in the second half of February 2015, in 1.0(3) and later images.

 

 

Legal Information

THE SPECIFICATIONS AND INFORMATION REGARDING THE PRODUCTS IN THIS MANUAL ARE SUBJECT TO CHANGE WITHOUT NOTICE. ALL STATEMENTS, INFORMATION, AND RECOMMENDATIONS IN THIS MANUAL ARE BELIEVED TO BE ACCURATE BUT ARE PRESENTED WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. USERS MUST TAKE FULL RESPONSIBILITY FOR THEIR APPLICATION OF ANY PRODUCTS.

THE SOFTWARE LICENSE AND LIMITED WARRANTY FOR THE ACCOMPANYING PRODUCT ARE SET FORTH IN THE INFORMATION PACKET THAT SHIPPED WITH THE PRODUCT AND ARE INCORPORATED HEREIN BY THIS REFERENCE. IF YOU ARE UNABLE TO LOCATE THE SOFTWARE LICENSE OR LIMITED WARRANTY, CONTACT YOUR CISCO REPRESENTATIVE FOR A COPY.

The Cisco implementation of TCP header compression is an adaptation of a program developed by the University of California, Berkeley (UCB) as part of UCB's public domain version of the UNIX operating system. All rights reserved. Copyright © 1981, Regents of the University of California.

NOTWITHSTANDING ANY OTHER WARRANTY HEREIN, ALL DOCUMENT FILES AND SOFTWARE OF THESE SUPPLIERS ARE PROVIDED "AS IS" WITH ALL FAULTS. CISCO AND THE ABOVE-NAMED SUPPLIERS DISCLAIM ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING, WITHOUT LIMITATION, THOSE OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OR ARISING FROM A COURSE OF DEALING, USAGE, OR TRADE PRACTICE.

IN NO EVENT SHALL CISCO OR ITS SUPPLIERS BE LIABLE FOR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, OR INCIDENTAL DAMAGES, INCLUDING, WITHOUT LIMITATION, LOST PROFITS OR LOSS OR DAMAGE TO DATA ARISING OUT OF THE USE OR INABILITY TO USE THIS MANUAL, EVEN IF CISCO OR ITS SUPPLIERS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

Any Internet Protocol (IP) addresses and phone numbers used in this document are not intended to be actual addresses and phone numbers. Any examples, command display output, network topology diagrams, and other figures included in the document are shown for illustrative purposes only. Any use of actual IP addresses or phone numbers in illustrative content is unintentional and coincidental.

All printed copies and duplicate soft copies are considered uncontrolled copies, and the original online version should be referred to for the latest version.

Cisco has more than 200 offices worldwide. Addresses, phone numbers, and fax numbers are listed on the Cisco website at www.cisco.com/go/offices.

Trademark

Cisco and the Cisco logo are trademarks or registered trademarks of Cisco and/or its affiliates in the U.S. and other countries. To view a list of Cisco trademarks, go to this URL: www.cisco.com/go/trademarks. Third-party trademarks mentioned are the property of their respective owners. The use of the word partner does not imply a partnership relationship between Cisco and any other company. (1110R)

Copyright

© 2015 Cisco Systems, Inc. All rights reserved.