Dear Cisco Customer,
A software issue has been identified with Cisco ACI software releases 1.2(1i) / 11.2(1i) and 1.2(1k) / 11.2(1k) that may affect your use of this software. As a result, these releases have been deferred. These deferred images are no longer available for download. Customers are urged to upgrade to new Cisco ACI software releases 1.2(1m) / 11.2(1m) or later images. See the Software Affected/Solution table that follows for the list of affected images and solution images.
For additional information about what is included in this software, please refer to:
• Cisco Application Policy Infrastructure Controller, Release 1.2(1m), Release Notes, available from the Cisco APIC documentation page
• Cisco NX-OS Release 11.2(1m) Release Notes for Cisco Nexus 9000 Series ACI-Mode Switches, available from the Cisco Nexus 9000 Series Release Notes page
Table Of Affected Software And Replacement Solution
OS Type | Affected Versions | Affected Software | Solution Versions | Solution Software | Availability (mm/dd/yyyy)
ACI | 1.2(1i), 1.2(1k) | Cisco APIC | 1.2(1m) | Cisco APIC | 02/03/2016
NX-OS | 11.2(1i), 11.2(1k) | Cisco ACI-Mode Switch | 11.2(1m) | Cisco ACI-Mode Switch | 02/03/2016
DDTS No(s):
CSCuy31579
Headline: nginx crash after changing fabric communication policy
When any aspect of an existing fabric communication policy is changed, an ACI switch running version 11.2(1i) or 11.2(1k) might reboot if the /nginx partition is full. These policy changes include enabling, disabling, or modifying the configuration of HTTP, HTTPS, Telnet, SSH, SSL, and so on.
In the APIC GUI, this is equivalent to changing anything on the Management Access policy page (Figure 1).
Figure 1. Management Access policy page
Figure 2 illustrates the equivalent configuration options in the APIC CLI, where the fabric communication policy is in the comm-policy (Communication policy) configuration.
Figure 2. Communication policy in CLI
In the API, the corresponding managed objects are those of class commPol and its descendants.
Any change made to these settings might cause a switch reboot.
The /nginx partition is full.
This bug is triggered only when the /nginx partition is full and an nginx configuration change request is unable to write to the partition. When any change is made to the fabric communication policy, whether in the GUI, CLI, or API, a new configuration file is created for nginx and nginx is restarted. If the /nginx partition is full, creation of the new configuration file fails and subsequent nginx restarts also fail, which eventually leads to a switch reload.
An example of a configuration change request for nginx is enabling TLSv1, which is performed here in the GUI:
Fabric > Fabric Policies > Pod Policies > Policies > Management Access > default > HTTPS > SSL Protocols
Before making configuration changes to the Communication Management policy (Management Access policy), verify that the utilization of the /nginx partition is below 70%. Because policy changes made in APIC are propagated to all switches, you must check the utilization on all switches.
To check the current utilization of the /nginx partition, log in to each switch as admin and display the usage of the /nginx partition using the following command:
leaf101# egrep " /nginx" /tmp/df_output
none 204800 2952 201848 2% /nginx
In the example above, the current utilization is 2%. If the current /nginx partition utilization is above 70%, contact the TAC to apply the workaround before making the policy configuration changes.
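The threshold check described above can be sketched as a small shell helper. This is an illustrative sketch, not Cisco-provided tooling: the names nginx_util_pct, nginx_util_status, and NGINX_UTIL_LIMIT are hypothetical, and the code assumes the df-style line format shown in the example above (mount point in the last column, utilization percentage in the column before it). It treats exactly 70% as over the limit, which is an assumption, since this notice only says "below 70%".

```shell
#!/bin/sh
# Threshold from this notice: /nginx utilization should be below 70%.
NGINX_UTIL_LIMIT=70

# nginx_util_pct (hypothetical helper): read df-style lines on stdin and
# print the utilization percentage, digits only, for the line whose
# mount point (last column) is exactly /nginx.
nginx_util_pct() {
    awk '$NF == "/nginx" { sub(/%/, "", $(NF-1)); print $(NF-1); exit }'
}

# nginx_util_status (hypothetical helper): print OK when the given
# percentage is below the limit, CONTACT_TAC otherwise.
# Assumption: exactly 70 is treated as over the limit.
nginx_util_status() {
    if [ "$1" -lt "$NGINX_UTIL_LIMIT" ]; then
        echo "OK"
    else
        echo "CONTACT_TAC"
    fi
}
```

On a switch, the first function could presumably be fed the same file as the command above, for example nginx_util_pct < /tmp/df_output. Whether awk is available in the switch shell is not confirmed by this notice.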
TAC assistance is required to apply the workaround because root credentials are needed. Follow these steps to truncate /nginx/logs/access.log, which reduces the usage of the /nginx partition and avoids reaching the reload condition:
1. Log in to each leaf as root.
2. Type the following commands:
leaf101# ls -lrt /nginx/logs/access.log
-rw-r--r-- 1 root root 209063936 Feb 17 02:18 /nginx/logs/access.log
leaf101# echo . > /nginx/logs/access.log
Note that this workaround must be reapplied whenever utilization of the /nginx partition exceeds 70%.
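For a sense of scale, the two sample outputs in this notice can be compared with some simple arithmetic. This is a rough sketch under two assumptions that the notice does not state: that the figures in the df-style line are 1K blocks (the usual df default), and that both samples come from partitions of the same size. The function name log_share_pct is hypothetical.

```shell
#!/bin/sh
# log_share_pct BYTES TOTAL_KB (hypothetical helper): integer percentage,
# rounded down, of a partition of TOTAL_KB 1K-blocks that a file of
# BYTES bytes would occupy.
# Assumption: the partition size is expressed in 1K blocks.
log_share_pct() {
    echo $(( ($1 / 1024) * 100 / $2 ))
}

# Sample values taken from the outputs in this notice:
# access.log size in bytes, and /nginx partition size in blocks.
log_share_pct 209063936 204800   # prints 99
```

Under those assumptions, an access.log of 209063936 bytes would by itself occupy about 99% of a 204800-block partition.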
CSCuy31579 - nginx crash after changing fabric communication policy
A customer might experience a reboot of a switch if the /nginx partition is full. This will cause user traffic disruption.
The output below shows an example seen on affected switches:
Leaf101# show system reset-reason
*************** module reset reason (1) *************
0) At 2016-02-16T10:08:00.839+01:00
Reason: reset-triggered-due-to-ha-policy-of-reset
Service:nginx hap reset
Version: 11.2(1k)
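When reviewing reset history across many switches, the signature above can be matched mechanically. The following is a minimal sketch, assuming the output has been captured as text in the format shown above; the function name had_nginx_hap_reset is hypothetical, and the pattern is based only on the single sample in this notice.

```shell
#!/bin/sh
# had_nginx_hap_reset (hypothetical helper): read captured
# "show system reset-reason" output on stdin and succeed (exit status 0)
# if it contains a "Service:nginx hap reset" line.
# Assumption: the line looks like the sample in this notice; optional
# spaces after the colon are tolerated.
had_nginx_hap_reset() {
    grep -Eq 'Service: *nginx hap reset'
}
```

For example, with the output saved to a file (the file name reset_reason.txt is illustrative), had_nginx_hap_reset < reset_reason.txt && echo "nginx hap reset seen" would report a match.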
Any operation frequently writing logs to the /nginx partition is likely to set the condition for this issue by filling up the /nginx partition. Such operations include REST queries executed locally on the switch. In addition, in 1.2(1i) and 1.2(1k) releases, there is a fabric tracking process that creates log events written to the /nginx partition. In these releases, it is possible for the log partition to be full, and for the switches to reboot when a change is made to the Management Access policy.
One example of a REST query on the switch is a ‘show’ command executed directly from bash.
In order to increase network availability, Cisco recommends that you upgrade affected images with the suggested replacement software images. Cisco will discontinue manufacturing shipment of affected images. Any pending order will be substituted by the replacement software images.
PLEASE BE AWARE THAT FAILURE TO UPGRADE THE AFFECTED SOFTWARE MAY RESULT IN NETWORK DOWNTIME.
The terms and conditions that governed your rights and obligations, and those of Cisco, with respect to the deferred software will apply to the replacement software.
© 2016 Cisco Systems, Inc. All rights reserved.