
Cisco Videoscape Distribution Suite for Internet Streaming

Release Notes for Cisco Internet Streamer CDS 2.5.11


Table Of Contents

Release Notes for Cisco Internet Streamer
CDS 2.5.11

Contents

New Features

Windows Media Streaming SDP Caching

Windows Media Streaming STB Seek Latency Reduction

Windows Media Streaming Tracking for Client Billing

Windows Media Streaming ASX Files with URL Signing

ASX File Request Flow

Service Rule XML Example with Rule_UrlGenerateSign

Windows Media Streaming Scheduled Live Programs Blocking

Web Engine Maximum Size for Cachable Objects

CAL Queue Limits

LSR Path Caching for Windows Media Live Streaming

Service Monitor Transaction Logs and Augmentation Alarms

Service Monitor Transaction Logs

Augmentation Alarms

System Requirements

Limitations and Restrictions

Important Notes

Open Caveats

Open Caveats in Release 2.5.11-b26

Open Caveats in Release 2.5.11-b21

Open Caveats in Release 2.5.11-b20

Open Caveats in Release 2.5.11-b19

Open Caveats in Release 2.5.11-b18

Open Caveats in Release 2.5.11-b15

Open Caveats in Release 2.5.11-b13

Windows Media Streaming

CDSM

Acquisition and Distribution

Unified Kernel Streaming Engine (UKSE)

Web Engine

Cache Router

Proximity Engine

Ucache

Service Router

MP3 Live Streaming

Resolved Caveats

Resolved Caveats in Release 2.5.11-b26

Resolved Caveats in Release 2.5.11-b21

CDMs

RTSP

Transaction Logging

UNS

Web Engine

WMT

Resolved Caveats in Release 2.5.11-b20

Service Routing

Authorization Server

SVC Monitor

Resolved Caveats in Release 2.5.11-b19

Service Routing

Streamer

Web Engine

Web Services

WMT

Resolved Caveats in Release 2.5.11-b18

CDM

Proximity

URL Manager

Web Engine

Resolved Caveats in Release 2.5.11-b15

Movie Streamer

SNMP

URL Manager

Web Engine

Live Routing

CDSM

HTTP Stress

Authorization Server

Cache Router

Platform

MP3 Live

Resolved Caveats in Release 2.5.11-b13

Windows Media Streaming

Web Engine

Flash Media Streaming

Authorization Server

Network

HTTP Core

HTTP ABR

URL Manager

CDSM

Stream Scheduler

Transaction Logs

UNS

Geo-Location Server

Service Router

Service Monitor

Unified Kernel Streaming Engine (UKSE)

Ucache

Platform

Live Routing

CLI

RTSP Gateway

Movie Streamer

Data Server

Accessing Bug Tool kit

Upgrading to Release 2.5.11

Source Policy Routes

URL Public Key Signing

SATA Disk Error Handling and Threshold Recommendations

Configuration Recommendations

Root Cause

Documentation Updates

Related Documentation

Obtaining Documentation and Submitting a Service Request


Release Notes for Cisco Internet Streamer
CDS 2.5.11


These release notes cover Cisco Internet Streamer CDS Release 2.5.11-b26.


Note: Release 2.5.11-b26 obsoletes all previous Release 2.5.11 builds.


Revised: March 2013, OL-25177-10

Contents

The following information is included in these release notes:

New Features

System Requirements

Limitations and Restrictions

Important Notes

Open Caveats

Resolved Caveats

Accessing Bug Tool kit

Upgrading to Release 2.5.11

Documentation Updates

Related Documentation

Obtaining Documentation and Submitting a Service Request

New Features

Release 2.5.11 of the Cisco Internet Streamer CDS introduces the following features:

Windows Media Streaming SDP Caching

Windows Media Streaming STB Seek Latency Reduction

Windows Media Streaming Tracking for Client Billing

Windows Media Streaming ASX Files with URL Signing

Windows Media Streaming Scheduled Live Programs Blocking

Web Engine Maximum Size for Cachable Objects

LSR Path Caching for Windows Media Live Streaming

Service Monitor Transaction Logs and Augmentation Alarms

Windows Media Streaming SDP Caching

Live streaming is content that is streamed while it is still being encoded by an encoder. There are two kinds of Windows Media live streaming:

Playlist live—One or more content items are streamed sequentially.

Broadcast live—Live and prerecorded content can be streamed to more than one client simultaneously. The SE streams the content to all clients, which does not allow the clients to perform seeks on the stream.

Streaming is accomplished by using HTTP live or RTSP live. HTTP live uses Windows Media Streaming Protocol (MS-WMSP) where the wms-hdr in the WMS-Describe-Response describes the content. RTSP live uses RTSP where the Session Description Protocol (SDP) file in the DESCRIBE response describes the content.

The RTSP playlist live SDP file cannot be cached because the SDP file keeps changing to reflect the different content playlists.

Previously, getting the SDP file for RTSP broadcast live was accomplished by the Windows Media Streaming engine sending an RTSP DESCRIBE message to the upstream SE to retrieve the SDP file for each Windows Media Streaming broadcast live request. The RTSP DESCRIBE message eventually reached the Content Acquirer or Origin server, which created a bottleneck for Windows Media Streaming broadcast live.

In Release 2.5.11, because the SDP file for RTSP broadcast live does not change unless the program is stopped, it can be cached on the streaming SE. Once the SDP file is cached, it can be used to compose the DESCRIBE response. No further requests for the SDP file from the upstream server (SE, Content Acquirer, or Origin server) are necessary, which eliminates the bottleneck.


Note: The SDP file cannot be cached if content requires authorization by either the Origin server or the SE.
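The caching behavior described above can be sketched as follows. This is only an illustration, not the Windows Media Streaming engine's code: the SdpCache class, the fetch_upstream callback, and the example URL are hypothetical names introduced for the sketch.

```python
from typing import Callable, Dict


class SdpCache:
    """Illustrative cache of SDP files for RTSP broadcast live programs."""

    def __init__(self, fetch_upstream: Callable[[str], str]) -> None:
        # fetch_upstream stands in for the RTSP DESCRIBE sent to the upstream server.
        self._fetch_upstream = fetch_upstream
        self._cache: Dict[str, str] = {}

    def describe(self, program_url: str, requires_auth: bool = False) -> str:
        # Content that requires authorization is never served from the cache.
        if requires_auth:
            return self._fetch_upstream(program_url)
        if program_url not in self._cache:
            self._cache[program_url] = self._fetch_upstream(program_url)
        return self._cache[program_url]

    def program_stopped(self, program_url: str) -> None:
        # The SDP file may change once the program stops, so drop the cached copy.
        self._cache.pop(program_url, None)


upstream_calls = []


def fake_upstream(url: str) -> str:
    upstream_calls.append(url)
    return f"v=0 (SDP for {url})"


cache = SdpCache(fake_upstream)
for _ in range(3):
    cache.describe("rtsp://se.example.com/live1")
print(len(upstream_calls))  # 1
```

Three client requests result in a single upstream DESCRIBE, which is the bottleneck reduction the feature is aiming for.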


Windows Media Streaming STB Seek Latency Reduction

Previously, when a client issued a seek operation, such as skipping forward or backward, a new point in the stream time was specified in the request header, and the SE started the stream with a new data packet for that point. That data packet might or might not contain the I-frame payload. The client (STB) could fill its memory buffer before receiving the I-frame, and because of the small buffer size on STBs, the buffer could overflow before the I-frame was received.

In Release 2.5.11, after a client issues a seek operation and before the SE streams the new data packet, the Internet Streamer CDS checks the data packets for the I-frame and chooses the most appropriate one as the first packet (the I-frame might have been fragmented into more than one data packet).
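One possible reading of this selection step is sketched below. The release notes do not define how the "most appropriate" packet is chosen, so the Packet type, its fields, and the rule "first packet at or after the seek point that begins an I-frame" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    send_time_ms: int    # hypothetical field: position of the packet in stream time
    starts_iframe: bool  # hypothetical flag: packet holds the first fragment of an I-frame


def first_packet_for_seek(packets: List[Packet], seek_ms: int) -> Optional[Packet]:
    """Return the first packet at or after seek_ms that begins an I-frame."""
    # Packets are assumed to be sorted by send_time_ms.
    for packet in packets:
        if packet.send_time_ms >= seek_ms and packet.starts_iframe:
            return packet
    return None


stream = [
    Packet(0, True),
    Packet(1000, False),
    Packet(2000, True),   # first fragment of the next I-frame
    Packet(2100, False),  # remaining fragment of that I-frame
]
chosen = first_packet_for_seek(stream, seek_ms=1500)
print(chosen.send_time_ms)  # 2000
```

Starting at the packet that begins the I-frame means the STB does not spend its small buffer on packets it cannot decode yet.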

Windows Media Streaming Tracking for Client Billing

When a client abnormally disconnects from the SE and does not send a log, Windows Media Streaming generates a log event (408 log code) even though the client has not sent a log.

When the server generates logging data, it uses the server-side values that are available and any client values that were transferred at the beginning of the session. The value in the x-duration field, which is used to track client billing, is based on the values of the following fields: date, time, c-starttime, c-rate, and filelength. The date, time, and filelength field values are known at the client connection time. The x-duration and c-starttime are determined by the CDS based on the media content packet's playtime.

Previously, the x-duration field in the Windows Media Streaming transaction log either used the client value or no value. In Release 2.5.11, the x-duration field has the value based on the following:

If the client does not report a value, indicated by a hyphen (-), the server value is used.

If the value is reported as zero, then the server duration value is used.

If the x-duration value reported by the client differs from the filelength by more than two minutes, the server's duration value is used.
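The three rules above can be expressed as a small function. This is a sketch under stated assumptions: the function name is hypothetical, all values are taken to be whole seconds, and the client value arrives as the string found in the log field.

```python
MAX_DIFFERENCE_SECONDS = 120  # "more than two minutes"


def billing_duration(client_value: str, server_duration: int, filelength: int) -> int:
    """Pick the x-duration value to log, following the three rules above."""
    if client_value == "-":          # client did not report a value
        return server_duration
    client_duration = int(client_value)
    if client_duration == 0:         # client reported zero
        return server_duration
    if abs(client_duration - filelength) > MAX_DIFFERENCE_SECONDS:
        return server_duration       # client value is too far from filelength
    return client_duration


print(billing_duration("-", 95, 600))    # 95
print(billing_duration("0", 95, 600))    # 95
print(billing_duration("100", 95, 600))  # 95
print(billing_duration("590", 585, 600)) # 590
```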

Windows Media Streaming ASX Files with URL Signing

The Windows Media Streaming ASX Files with URL Signing feature adds a new Rule_Action, Rule_UrlGenerateSign, to the Service Rule XML schema.

When the playback URL for a Windows Media Streaming live program has an ASX extension, the Content Abstraction Layer (CAL) returns metadata with a generated ASX file that contains both an HTTP URL and an RTSP URL for playback of the live program. These two URLs should be signed so that subsequent requests to play back the live program can be validated by the SE.

The Rule_UrlGenerateSign Rule_Action provides the ability to internally generate URL signatures using Version 2 of the URL signing script (SHA-1 hashing, protocol removed from the beginning of the URL, and domain name not included). When the signed URL is sent back to the client as part of the ASX response, the domain name received from the client is added back in.

ASX File Request Flow

The request flow is as follows:

1. Client requests an ASX file.

2. A Service Rule XML file is configured for the delivery service that contains the new Rule_Action, Rule_UrlGenerateSign. The Rule_UrlGenerateSign Rule_Action element requires the following attribute values: Key Owner, Key Number, and timeout. If the timeout attribute value is not specified, the default value of 30 seconds is used. The range for the timeout value is from 0 to 50 seconds.

3. If the pattern for Rule_UrlGenerateSign is matched, the URL signature is generated by the SE using Version 2 of the URL signing script and the attribute values specified for the Rule_UrlGenerateSign element.

Internally signed URLs will have IS=1. The IS=0 string is for legacy support with some CDS components that use both internal and external signing mechanisms.

Both the HTTP and RTSP signed URLs are contained in the ASX file. The signed URL that is used is determined by which protocol (HTTP or RTSP) is allowed or disallowed in the Windows Media Streaming configuration.


Note: If Windows Media Streaming is disabled, a 500 internal server message is sent to the client. The ASX file is not generated if Windows Media Streaming is disabled.


4. The client receives the ASX file with the signed URL. The player parses the ASX file and sends out the request again with the signed URL. The SE receives the signed URL and validates it. If the validation succeeds, the client is served the content.

The Service Rule XML file must be created, uploaded through the CDSM GUI, and then assigned to the delivery service.

Service Rule XML Example with Rule_UrlGenerateSign

Following is an example of a Service Rule XML file that has the Rule_UrlGenerateSign element:

<CDSRules xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="schema\CDSRules.xsd">
    <Revision>1.0</Revision>
    <CustomerName>Cisco</CustomerName>
    <Rule_Patterns>
        <PatternListGrp id="grp1">
            <Domain>cisco.co</Domain>
        </PatternListGrp>
    </Rule_Patterns>

    <Rule_Actions>
        <Rule_UrlGenerateSign matchGroup="grp1" protocol="http"
                              key-id-owner="1" key-id-number="2" timeout-in-sec="30"/>
    </Rule_Actions>
</CDSRules>
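
As a rough illustration of how the attribute rules from the request flow apply to such a file, the following sketch reads Rule_UrlGenerateSign elements with the Python standard library, fills in the 30-second default when timeout-in-sec is omitted, and rejects values outside the 0 to 50 second range. The function name and the trimmed XML sample are assumptions for the sketch; this is not how the SE or CDSM validates the file.

```python
import xml.etree.ElementTree as ET

DEFAULT_TIMEOUT_SEC = 30
MIN_TIMEOUT_SEC, MAX_TIMEOUT_SEC = 0, 50

# Trimmed copy of the example above, with timeout-in-sec left out on purpose.
RULES_XML = """<CDSRules>
  <Rule_Patterns>
    <PatternListGrp id="grp1">
      <Domain>cisco.co</Domain>
    </PatternListGrp>
  </Rule_Patterns>
  <Rule_Actions>
    <Rule_UrlGenerateSign matchGroup="grp1" protocol="http"
                          key-id-owner="1" key-id-number="2"/>
  </Rule_Actions>
</CDSRules>"""


def read_sign_rules(xml_text):
    """Return the Rule_UrlGenerateSign settings found in a Service Rule XML string."""
    rules = []
    for rule in ET.fromstring(xml_text).iter("Rule_UrlGenerateSign"):
        timeout = int(rule.get("timeout-in-sec", DEFAULT_TIMEOUT_SEC))
        if not MIN_TIMEOUT_SEC <= timeout <= MAX_TIMEOUT_SEC:
            raise ValueError(f"timeout-in-sec out of range: {timeout}")
        rules.append({
            "matchGroup": rule.get("matchGroup"),
            "key_owner": rule.get("key-id-owner"),
            "key_number": rule.get("key-id-number"),
            "timeout_sec": timeout,
        })
    return rules


print(read_sign_rules(RULES_XML))
# [{'matchGroup': 'grp1', 'key_owner': '1', 'key_number': '2', 'timeout_sec': 30}]
```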
 
   

Service Rule Action Order

Rule_Actions processing is the same in Release 2.5.9 and Release 2.5.11: all Rule_Actions are processed in the order in which they are listed in the Rule_Actions element. However, in Release 2.5.11, if the pattern for Rule_Validate or Rule_UrlGenerateSign is matched and the action fails (URL validation or URL signature generation, respectively), any Rule_UrlRewrite or Rule_NoCache listed before it is not performed. In that case, the authserver returns Action_Deny and the corresponding rule action (either Action_validate or Action_UrlGenerateSign). The Action_rewrite is not returned, nor is the action for Rule_NoCache if it is listed.

If Rule_Validate or Rule_UrlGenerateSign is listed, the pattern is matched, and the action succeeds, and Rule_UrlRewrite is also listed, then Action_rewrite is returned along with Action_validate and Action_UrlGenerateSign (if all three rules are listed).

Service Rule Processing

This section describes the rule processing.


Note: Pattern match failure as described in this section means that none of the patternGrps specified as part of the matchGroup matched for a particular action.


Rule_Allow

If pattern match fails, the request is blocked and there is no further processing of the remaining rules.

If pattern match is successful, rule processing continues to the next rule action.

Rule_Block

If there is a pattern match for Rule_Block, the request is blocked and there is no further processing of the remaining rules.

If there is no pattern match for Rule_Block, rule processing continues to the next rule action.

Rule_UrlRewrite, Rule_NoCache, Rule_Validate, Rule_UrlGenerateSign—Pattern Match Failure Case

If pattern match fails, rule processing continues to the next rule action and there is no return value for the specified rule action. For example, if the rule action was Rule_Validate and the pattern match failed, there would be no URL validation performed on the request.

In the following XML example, because the pattern match failed for the action Rule_Validate, authserver does not return Action_validate. Because the Rule_UrlRewrite and Rule_UrlGenerateSign pattern matches were successful, authserver returns those actions in its response.

<CDSRules xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="schema\CDSRules.xsd">
    <Revision>1.0</Revision>
    <CustomerName>ATT</CustomerName>
    <Rule_Patterns>
        <PatternListGrp id="grp1">
            <UrlRegex>asx</UrlRegex>
        </PatternListGrp>
        <PatternListGrp id="grp2">
            <UrlRegex>abcd</UrlRegex>
        </PatternListGrp>
    </Rule_Patterns>
    <Rule_Actions>
        <Rule_UrlGenerateSign matchGroup="grp1" key-id-owner="1" key-id-number="1"
                              timeout-in-sec="30" protocol="http"/>
        <Rule_Validate matchGroup="grp2" error-redirect-url="http://4.0.1.6/index.html"
                       protocol="http"/>
        <Rule_UrlRewrite matchGroup="grp1" protocol="http" regsub="DejaVu"
                         rewrite-url="dummy"/>
    </Rule_Actions>
</CDSRules>
 
   

Rule_UrlRewrite, Rule_NoCache, Rule_Validate, Rule_UrlGenerateSign—Pattern Match Success Case

If pattern match is successful, the actions are processed as described in the following subsections:

Rule_Validate, Rule_UrlGenerateSign—Validation Fails, Signing Fails, Configuration Failure

Rule_UrlRewrite and Rule_NoCache—Rewrite Fails

Rule_UrlRewrite, Rule_NoCache, Rule_Validate, Rule_UrlGenerateSign—Success

Rule_Validate, Rule_UrlGenerateSign—Validation Fails, Signing Fails, Configuration Failure

Rule_Validate and Rule_UrlGenerateSign have a higher priority than Rule_UrlRewrite or Rule_NoCache. If the pattern matches, but the function fails (URL validation fails, URL signing fails, or there is a configuration failure), there is no further processing of the rule actions and the request is denied.

authserver returns [Action_Deny + Action_validate] if URL validation fails.

authserver returns [Action_Deny + Action_UrlGenerateSign] if UrlSignature generation fails.

Also, the value from previous actions is not returned in either case. For example, if Rule_UrlRewrite preceded Rule_UrlGenerateSign, and Rule_UrlRewrite was successful, but Rule_UrlGenerateSign failed, authserver does not return the value for Action_Rewrite. Similarly, if Rule_UrlRewrite preceded Rule_Validate, and Rule_UrlRewrite was successful, but Rule_Validate failed, authserver would not return the value for Action_Rewrite. The same logic that is described for Rule_UrlRewrite applies to Rule_NoCache as well.

The following XML example illustrates the above scenarios:

<CDSRules xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="schema\CDSRules.xsd">
    <Revision>1.0</Revision>
    <CustomerName>ATT</CustomerName>
    <Rule_Patterns>
        <PatternListGrp id="grp1">
            <UrlRegex>asx</UrlRegex>
        </PatternListGrp>
        <PatternListGrp id="grp2">
            <UrlRegex>abcd</UrlRegex>
        </PatternListGrp>
    </Rule_Patterns>
    <Rule_Actions>
        <Rule_UrlRewrite matchGroup="grp1" protocol="http" regsub="DejaVu"
                         rewrite-url="dummy"/>
        <Rule_UrlGenerateSign matchGroup="grp1" key-id-owner="1" key-id-number="1"
                              timeout-in-sec="30" protocol="http"/>
        <Rule_Validate matchGroup="grp2" error-redirect-url="http://4.0.1.6/index.html"
                       protocol="http"/>
    </Rule_Actions>
</CDSRules>
 
   

Rule_UrlRewrite and Rule_NoCache—Rewrite Fails

Rule_UrlRewrite and Rule_NoCache have a lower priority than Rule_Validate and Rule_UrlGenerateSign. If the pattern matches, but the Rule_UrlRewrite or Rule_NoCache fails, authserver does not return Action_Deny and processing of the remaining rule actions continues. If Rule_UrlRewrite fails, authserver does not return the value for Action_Rewrite. If Rule_NoCache fails, authserver does not return its value.

Rule_UrlRewrite, Rule_NoCache, Rule_Validate, Rule_UrlGenerateSign—Success

If the Rule_UrlRewrite action is successful, the authserver response contains the Action_Rewrite and the new rewritten URL is sent. Processing of the remaining rule actions continues.

If the Rule_NoCache action is successful, authserver sends the instructions to not cache the content. Processing of the remaining rule actions continues.

If Rule_Validate is successful, authserver response contains the Action_Validate.

If Rule_UrlGenerateSign is successful, authserver response contains Action_UrlGenerateSign.
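
The processing rules in this section can be summarized in a short simulation. It is a simplified model, not the authserver implementation: each rule is reduced to a tuple of (rule name, pattern matched, action succeeded), the "Blocked" result and the Action_NoCache name are placeholders the release notes do not define, and the capitalization of action names follows one of the spellings used above.

```python
from typing import List, Tuple

Rule = Tuple[str, bool, bool]  # (rule name, pattern matched, action succeeded)

ACTION_NAMES = {
    "Rule_UrlRewrite": "Action_Rewrite",
    "Rule_NoCache": "Action_NoCache",  # placeholder name, not given in the notes
    "Rule_Validate": "Action_Validate",
    "Rule_UrlGenerateSign": "Action_UrlGenerateSign",
}
HIGH_PRIORITY = {"Rule_Validate", "Rule_UrlGenerateSign"}
BLOCKED = ["Blocked"]  # placeholder; the notes only say the request is blocked


def process_rules(rules: List[Rule]) -> List[str]:
    """Walk the rule actions in listed order and return the resulting actions."""
    actions: List[str] = []
    for name, matched, succeeded in rules:
        if name == "Rule_Allow":
            if not matched:
                return BLOCKED          # no further processing
            continue
        if name == "Rule_Block":
            if matched:
                return BLOCKED          # no further processing
            continue
        if not matched:
            continue                    # pattern match failure: no return value
        if succeeded:
            actions.append(ACTION_NAMES[name])
        elif name in HIGH_PRIORITY:
            # Deny, and drop the values of any earlier actions.
            return ["Action_Deny", ACTION_NAMES[name]]
        # A failed rewrite or no-cache returns nothing and processing continues.
    return actions


# Pattern match failure example: Rule_Validate (grp2) does not match.
print(process_rules([
    ("Rule_UrlGenerateSign", True, True),
    ("Rule_Validate", False, False),
    ("Rule_UrlRewrite", True, True),
]))  # ['Action_UrlGenerateSign', 'Action_Rewrite']

# Rewrite succeeds first, then signing fails: the rewrite value is dropped.
print(process_rules([
    ("Rule_UrlRewrite", True, True),
    ("Rule_UrlGenerateSign", True, False),
    ("Rule_Validate", False, False),
]))  # ['Action_Deny', 'Action_UrlGenerateSign']
```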

Windows Media Streaming Scheduled Live Programs Blocking

Previously, new requests for a scheduled Windows Media Streaming live program were served only during the scheduled time, as expected, but streams that were already connected continued to play indefinitely after the scheduled time ended.

In Release 2.5.11, the ability to configure whether currently streaming live programs should be stopped when the scheduled time has ended has been added to the CDSM GUI and the live program API.

CDSM GUI

To configure this feature in the CDSM GUI, choose Services > Live Video > Live Programs and click the Edit icon next to the program name. The Program Definition page is displayed with the Block per Schedule check box. To stop active streams when the schedule ends, check the Block per Schedule check box and click Submit.

A new debug command has been added to set the stream-scheduler debug level.

# debug stream-scheduler ?
   error Stream-scheduler debug level set to error
   trace Stream-scheduler debug level set to trace
 
   

API

A new attribute, blockPerSchedule, has been added to the Document Type Definition (DTD) for CDS program files. You can use the DTD to create program files for importing programs from third-party systems. The definition for this attribute is the following:

blockPerSchedule    (false | true) "false"
 
   

Following is an example of a program file for Windows Media Streaming live program with the blockPerSchedule attribute:

<?xml version="1.0"?>
<!DOCTYPE program SYSTEM "program.dtd">
<program version="1.0" name="liveProgram" serviceType="wmt" description="test" 
autoDelete="true" live="true" blockPerSchedule="false">
<media index="1" src="http://WMT_encoder:8080" id="media0"/>
<mcastInfo referenceUrl="http://contentacquirer/liveprogram.nsc" TTL="22">
<addrPort addrVal="239.232.25.95" portVal="61248" id="media0"/>
</mcastInfo>
<schedule timeSpec="gmt" startTime="0" activeDuration="0"/>
</program>
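
When a third-party system generates program files, it can be useful to check what the blockPerSchedule attribute resolves to. The sketch below does that with the Python standard library; the function name and the shortened program file are assumptions for illustration. Because ElementTree does not load program.dtd, the DTD default of "false" is applied by hand.

```python
import xml.etree.ElementTree as ET

# Shortened program file based on the example above, with blocking turned on.
PROGRAM_XML = """<?xml version="1.0"?>
<!DOCTYPE program SYSTEM "program.dtd">
<program version="1.0" name="liveProgram" serviceType="wmt" description="test"
autoDelete="true" live="true" blockPerSchedule="true">
<media index="1" src="http://WMT_encoder:8080" id="media0"/>
<schedule timeSpec="gmt" startTime="0" activeDuration="0"/>
</program>"""


def blocks_per_schedule(xml_text: str) -> bool:
    """Return True if the program stops active streams when the schedule ends."""
    program = ET.fromstring(xml_text)
    # The external DTD is not read, so fall back to its declared default.
    return program.get("blockPerSchedule", "false") == "true"


print(blocks_per_schedule(PROGRAM_XML))                  # True
print(blocks_per_schedule('<program name="legacy"/>'))   # False
```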
 
   

Web Engine Maximum Size for Cachable Objects

When a request comes into an SE and the requested file is a cache miss, the request is forwarded to the upstream SE, and if the content is not found in the CDS, to the Origin server. The response from the Origin server or the upstream SE contains the content length. If the value of the content length is less than 2 MB, the asset file is treated as a small file. If the content length is greater than 2 MB, the asset is treated as a large file. Small files are cached in RAM (tmpfs) first, then after approximately four seconds, the file is moved to disk. Large files are stored on disk directly. As a result, the response time for a small file is much shorter than for a large file.

Previously, the delimiter between a small and large file was hard coded as 2 MB.

In Release 2.5.11, the Memory Cache Size field has been added to allow configuring this delimiter for each delivery service.

The Memory Cache Size value should be chosen carefully because it affects the performance and response time for that delivery service. When picking a value that defines a small file versus a large file for a delivery service, consider not only the file size of the majority of the traffic, but also the hardware of all the SEs in that delivery service (the value is limited by the hardware of the weakest link in the chain of SEs in the delivery service), the bit-rate setting, and other parameters.

To configure the Memory Cache Size field, do the following:


Step 1 Log in to the CDSM GUI.

Step 2 Choose Services > Service Definition > Delivery Services and click the Edit icon next to the delivery service. The Delivery Service Definition page is displayed.

Step 3 Click General Settings. The General Settings page is displayed.

Step 4 In the Memory Cache Size field, enter the maximum file size (in MB) that defines a small file.

The range is from 1 to 10 MB. The default is 2 MB.

Step 5 Click Submit.


When the cache memory (/tmpfs) reaches capacity, which means there is no more space for small files, Web Engine performs a cache bypass and sends the file directly to the client. Previously, the small file was stored on disk, which increased the response time.
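
The small-file and large-file handling described in this section can be modeled as a simple decision function. This is a sketch, not Web Engine code: the function name and the tmpfs_free_bytes parameter are hypothetical, MB is assumed to mean 1024 x 1024 bytes, and a file exactly equal to the Memory Cache Size is assumed to count as small.

```python
MB = 1024 * 1024  # assumption: the notes do not say whether MB is 10^6 or 2^20 bytes


def placement(content_length: int, memory_cache_size_mb: int = 2,
              tmpfs_free_bytes: int = 64 * MB) -> str:
    """Decide where a cache-miss response is stored, based on its content length."""
    if not 1 <= memory_cache_size_mb <= 10:
        raise ValueError("Memory Cache Size must be from 1 to 10 MB")
    if content_length > memory_cache_size_mb * MB:
        return "disk"              # large file: stored on disk directly
    if content_length > tmpfs_free_bytes:
        return "bypass"            # memory cache is full: serve without caching
    return "memory-then-disk"      # small file: RAM first, moved to disk later


print(placement(1 * MB))                          # memory-then-disk
print(placement(5 * MB))                          # disk
print(placement(5 * MB, memory_cache_size_mb=8))  # memory-then-disk
print(placement(1 * MB, tmpfs_free_bytes=0))      # bypass
```

Raising the Memory Cache Size from the default moves the 5 MB object in the example from direct disk storage to the memory cache, which is the trade-off the field is meant to control.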

CAL Queue Limits

In addition to the new Memory Cache Size field, the CAL queue is now limited to 2000 tasks. When the CAL queue threshold of 2000 is exceeded, Web Engine does not add any more disk operation tasks (creates, updates, or popularity updates) and a trace message is logged with the following string:

Reason: CalQThreshold Exceeded!
 
   

A new output field, "Outstanding Content Popularity Update Requests," has been added to the show statistics web-engine detail command. At any point, the sum of the "Outstanding Content Create Requests," "Outstanding Content Update Requests," and "Outstanding Content Popularity Update Requests" output fields is always less than 2000. If the sum of these three output fields exceeds the CAL queue threshold, no more create, update, and popularity update tasks are performed and the "Reason: CalQThreshold Exceeded!" trace message is logged.
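
A bounded queue of this kind might look like the following sketch. The CalQueue class and its method names are hypothetical, and the exact boundary (whether the 2000th task is accepted) is an assumption, since the text above does not settle it.

```python
from collections import deque

CAL_QUEUE_LIMIT = 2000


class CalQueue:
    """Illustrative bounded queue for create, update, and popularity update tasks."""

    def __init__(self, limit: int = CAL_QUEUE_LIMIT) -> None:
        self.limit = limit
        self.tasks = deque()
        self.trace = []  # stands in for the Web Engine trace log

    def add(self, kind: str, url: str) -> bool:
        # Assumption: the queue holds at most `limit` outstanding tasks.
        if len(self.tasks) >= self.limit:
            self.trace.append("Reason: CalQThreshold Exceeded!")
            return False
        self.tasks.append((kind, url))
        return True


queue = CalQueue(limit=3)  # small limit to keep the demonstration short
results = [queue.add("create", f"/video/{n}.wmv") for n in range(5)]
print(results)           # [True, True, True, False, False]
print(len(queue.trace))  # 2
```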

LSR Path Caching for Windows Media Live Streaming

Previously, each incoming Windows Media Streaming live request caused a CAL lookup resolve, which resulted in the Live Service Routing (LSR) module sending liveness queries to some SEs in the same location and upper-tier locations to derive the hierarchical splitting URL.

In Release 2.5.11, to avoid doing a CAL lookup resolve for each incoming Windows Media Streaming live request, the live hierarchical splitting URL is cached and is then used by all subsequent Windows Media Streaming live requests for the same live program.

Service Monitor Transaction Logs and Augmentation Alarms

Service Monitor in the Internet Streamer CDS provides threshold monitoring of the various components (CPU, disk, memory, and so on) of the devices (SE, SR, and CDSM), as well as the protocol engines on the SEs.

Release 2.5.11 introduces Service Monitor transaction logs to provide an additional tool for analyzing the health history of a device and the protocol engines, and additional augmentation alarms to ensure the device is within the configured capacity limits.

Service Monitor Transaction Logs

Device and service health information is periodically logged on the device in transaction log files. Transaction logs provide a useful mechanism to monitor and debug the system. The transaction log fields include both device and protocol engine information applicable to Service Engines and Service Routers that are useful for capacity monitoring. Additionally, when a device or protocol engine threshold is exceeded, detailed information is sent to a file (threshold_exceeded.log) to capture the processes that triggered the threshold alarm.

The Service Monitor transaction log filename has the following format: service_monitor_<ipaddr>_yyyymmdd_hhmmss_<>, where:

<ipaddr> represents the IP address of the SE, SR, or CDSM.

yyyymmdd_hhmmss represents the date and time when the log was created.

For example, service_monitor_192.168.1.52_20110630_230001_00336 is the filename for the log file on the device with the IP address of 192.168.1.52 and a time stamp of June 30, 2011 at 11:00:01 PM (230001 in hhmmss format).

The Service Monitor transaction log file is located in the /local1/logs/service_monitor directory.
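
For scripts that collect these files, the filename can be split into its parts as sketched below. The regular expression and function name are assumptions for illustration. The trailing component, shown as <> in the format above, is not described in these notes, so the sketch keeps it as an opaque suffix.

```python
import re
from datetime import datetime

FILENAME_RE = re.compile(
    r"^service_monitor_"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})_"  # IPv4 address of the SE, SR, or CDSM
    r"(?P<date>\d{8})_"                   # yyyymmdd
    r"(?P<time>\d{6})_"                   # hhmmss
    r"(?P<suffix>.+)$"                    # trailing component, meaning not documented here
)


def parse_log_filename(name: str):
    """Return (ip address, creation time, suffix) for a Service Monitor log filename."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a Service Monitor log filename: {name}")
    created = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return match["ip"], created, match["suffix"]


ip, created, suffix = parse_log_filename(
    "service_monitor_192.168.1.52_20110630_230001_00336"
)
print(ip, created.isoformat(), suffix)
# 192.168.1.52 2011-06-30T23:00:01 00336
```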

An entry to the Service Monitor transaction log is made every two seconds.


Note: The following rules apply to Service Monitor transaction logs:

A transaction log value is only logged if the Service Monitor is enabled for that component or protocol engine on the device. For example, if CPU monitoring is not enabled, the transaction log value "-" is displayed.

If Service Monitor is enabled for a protocol engine, but the protocol engine is not enabled, the value is not displayed in the log file.

If a log field can have more than one value, the values are delimited by the pipe (|) character.

If a value can have sub-values, the sub-values are delimited by the caret (^) character.

Some of the fields display aggregate values. If the statistics are cleared using the clear statistics command, the value after clearing the statistics may be less than the previous values, or may be zero (0).
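
The delimiter rules above are enough to split a single log field into its values and sub-values. The helper below is a minimal sketch with a hypothetical function name; the sample strings are the per_disk_load and bandwidth_avg samples from Table 1.

```python
from typing import List, Optional


def parse_field(raw: str) -> Optional[List[List[str]]]:
    """Split a log field into values (pipe-delimited) and sub-values (caret-delimited)."""
    if raw == "-":
        return None  # monitoring is not enabled for this component
    return [value.split("^") for value in raw.split("|")]


print(parse_field("disk03-01^2|disk04-02^5"))
# [['disk03-01', '2'], ['disk04-02', '5']]
print(parse_field("Port_Channel_1^2^4|Port_Channel_2^0^0"))
# [['Port_Channel_1', '2', '4'], ['Port_Channel_2', '0', '0']]
print(parse_field("21"))
# [['21']]
print(parse_field("-"))
# None
```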


Table 1 describes the fields for the Service Monitor transaction log on an SE.

Table 1 SE Service Monitor Transaction Log Fields 

| Field | Sample Output | Description | Corresponding CLI Command |
|---|---|---|---|
| date | 2011-06-30 | Date of log. | - |
| time | 22:52:02 | Time of log. | - |
| cpu_avg | 21 | Moving average value in percentage of CPU usage. | show service-router service-monitor (Device status—CPU—Average load) |
| mem_avg | 44 | Moving average value in percentage of memory usage. | show service-router service-monitor (Device status—Mem—Average used memory) |
| kmem_avg | 11 | Moving average value in percentage of kernel memory. | show service-router service-monitor (Device status—KMEM—Average kernel memory) |
| disk_avg | 2 | Moving average value in percentage of disk usage. | show service-router service-monitor (Device status—Disk—Average load) |
| disk_fail_count_threshold | Y | Boolean value to indicate if disk fail count threshold has been reached. | show service-router service-monitor (Device status—Device Status—Disk—Status) |
| per_disk_load | disk03-01^2\|disk04-02^5 | Current load per disk, as a percentage. The sample output indicates that disk03-partition01 has a 2 percent load and disk04-partition02 has a 5 percent load. | - |
| bandwidth_avg | Port_Channel_1^2^4\|Port_Channel_2^0^0 | Moving average bandwidth used, as a percentage, of bandwidth in and bandwidth out per interface. The sample output indicates that port channel 1 has an average bandwidth of 2 percent for receiving and 4 percent for transmitting, and port channel 2 average bandwidth usage is 0. | show service-router service-monitor (Device status—NIC—Average BW In/Average BW Out) |
| file_descriptors_count | 1023 | Total count of file descriptors open on the device. File descriptors are internal data structures maintained by the Linux kernel for each open file. | - |
| tcp_server_connections | 35 | Number of TCP server connections open. | show statistics tcp (TCP Statistics—Server connection openings) |
| tcp_client_connections | 24 | Number of TCP client connections open. | show statistics tcp (TCP Statistics—Client connection openings) |
| processes_count | 42 | Number of processes running on the device. | show processes |
| dataserver_cpu | 1 | Percentage of the CPU used for the dataserver process. | - |
| ms_threshold_exceeded | - | Boolean value to indicate if the Movie Streamer threshold has been exceeded. | show service-router service-monitor (Services status—MS—Threshold) |
| ms_aug_threshold_exceeded | - | Boolean value to indicate if the Movie Streamer augmentation alarm threshold has been exceeded. | - |
| ms_stopped | - | Boolean value to indicate if the Movie Streamer protocol engine has stopped. | show service-router service-monitor (Services status—MS—Stopped) |
| ms_rtsp_sessions_count | - | Total Movie Streamer RTSP session count (aggregate value). | show statistics movie-streamer all (Total RTSP sessions) |
| ms_rtp_sessions_count | - | Total Movie Streamer RTP session count (aggregate value). | show statistics movie-streamer all (Total RTP connections) |
| fms_threshold_exceeded | N | Boolean value to indicate if the Flash Media Streaming threshold has been exceeded. | show service-router service-monitor (Services status—FMS—Threshold) |
| fms_aug_threshold_exceeded | N | Boolean value to indicate if the Flash Media Streaming augmentation alarm threshold has been exceeded. | - |
| fms_stopped | N | Boolean value to indicate if Flash Media Streaming has stopped. | show service-router service-monitor (Services status—FMS—Stopped) |
| fms_connections_count | 2 | Total Flash Media Streaming connection count (aggregate value). | show statistics flash-media-streaming (Connections—Total) |
| web_engine_threshold_exceeded | Y | Boolean value to indicate if the Web Engine threshold has been exceeded. | show service-router service-monitor (Services status—Web—Threshold) |
| web_engine_aug_threshold_exceeded | Y | Boolean value to indicate if the Web Engine augmentation alarm threshold has been exceeded. | - |
| web_engine_stopped | N | Boolean value to indicate if Web Engine has stopped. | show service-router service-monitor (Services status—Web—Stopped) |
| web_engine_cpu | 3 | Percentage of the CPU used by the Web Engine. | - |
| web_engine_mem | 3500 | Memory (in bytes) used by the Web Engine. | show web-engine health (Total memory usage) |
| web_engine_get_requests | 250 | Count of get requests received by the Web Engine (aggregate value). | show statistics web-engine detail (HTTP Request Type Statistics—Get requests) |
| web_engine_sessions (Not available in 2.5.11) | 5 | Count of HTTP connections. | show statistics web-engine detail (Web-Engine Detail Statistics—Total HTTP Connection + Active Session) |
| web_engine_upstream_connections (Not available in 2.5.11) | 2 | Count of HTTP connections to upstream SE or origin server. | show statistics web-engine detail (Web-Engine Detail Statistics—Total HTTP Connection) |
| wmt_threshold_exceeded | N | Boolean value to indicate if the Windows Media Streaming threshold has been exceeded. | show service-router service-monitor (Services status—WMT—Threshold) |
| wmt_aug_threshold_exceeded | N | Boolean value to indicate if the Windows Media Streaming augmentation alarm threshold has been exceeded. | - |
| wmt_stopped | Y | Boolean value to indicate if Windows Media Streaming has stopped. | show service-router service-monitor (Services status—WMT—Stopped) |
| wmt_ml_engine_cpu | 21 | Percentage of the CPU used by the WMT_ML process. | - |
| wmt_ml_engine_mem | 32456 | Memory (in bytes) used by the WMT_ML process. | - |
| wmt_core_engine_cpu | 21 | Percentage of the CPU used by the WMT_Core process. | - |
| wmt_core_engine_mem | 32456 | Memory (in bytes) used by the WMT_Core process. | - |
| wmt_unicast_sessions_count | 22 | Number of current concurrent unicast client sessions. | show statistics wmt usage (Concurrent Unicast Client Sessions—Current) |
| wmt_remote_sessions_count | 24 | Number of current concurrent remote server sessions. | show statistics wmt usage (Concurrent Remote Server Sessions) |
| wmt_live_requests | 21 | Total count of Windows Media Streaming live requests (aggregate value). | show statistics wmt requests (By Type of Content—Live content) |
| wmt_vod_requests | 22 | Total count of Windows Media Streaming VOD requests (aggregate value). | show statistics wmt requests (By Type of Content—On-Demand Content) |
| wmt_http_requests | 11 | Total count of Windows Media Streaming HTTP requests (aggregate value). | show statistics wmt requests (By Transport Protocol—HTTP) |
| wmt_rtsp_requests | 8 | Total count of Windows Media Streaming RTSP requests (aggregate value). | show statistics wmt requests (By Transport Protocol—RTSPT/RTSPU) |
| rtspg_tps | 12 | Current RTSP Gateway transactions per second (TPS). | - |
| uns_cpu | 3 | Percentage of CPU used by the Unified Namespace (UNS) process. | - |
| uns_mem | 3500 | Memory used by the UNS process. | - |


Table 2 describes the fields for the Service Monitor transaction log on an SR.

Table 2 SR Service Monitor Transaction Log Fields 

Field
Sample Output
Description
Corresponding CLI Command

date

2011-06-30

Date of log.

-

time

22:52:02

Time of log.

-

cpu_avg

21

Moving average value in percentage of CPU usage.

show service-router service-monitor
Device status
—CPU—Average load

mem_avg

44

Moving average value in percentage of memory usage.

show service-router service-monitor
Device status
—Mem—Average used memory

kmem_avg

11

Moving average value in percentage of kernel memory.

show service-router service-monitor
Device status—KMEM—Average kernel memory

disk_avg

2

Moving average value in percentage of disk usage.

show service-router service-monitor
Device status—Disk—Average load

disk_fail_count_threshold

Y

Boolean value to indicate if disk fail count threshold has been reached.

show service-router service-monitor
Device status—Device Status—Disk—Status

file_descriptors_count

1023

Total count of file descriptors open on the device. File descriptors are internal data structures maintained by the Linux kernel for each open file.

-

tcp_server_connections

35

Number of TCP server connections open.

show statistics tcp
TCP Statistics—Server connection openings

tcp_client_connections

24

Number of TCP client connections open.

show statistics tcp
TCP Statistics—Client connection openings

processes_count

42

Number of processes running on the device.

show processes

dataserver_cpu

1

Percentage of the CPU used for the dataserver process.

-

sr_cpu

12

Percentage of the CPU used by the SR.

-

sr_mem

750000

Memory (in bytes) used by SR.

show processes memory and search for service_router

requests_received

34

Total count of requests received by SR (aggregate value).

show statistics service-router summary
Requests Received

http_normal_requests_received

5

Total count of normal HTTP requests received by SR (aggregate value).

show statistics service-router summary
HTTP Requests (normal)

http_asx_requests_received

5

Total count of ASX HTTP requests received by SR (aggregate value).

show statistics service-router summary
HTTP Requests (ASX)

rtsp_requests_received

5

Total count of RTSP requests received by SR (aggregate value).

show statistics service-router summary
RTSP Requests

rtmp_requests_received

5

Total count of RTMP requests received by SR (aggregate value).

show statistics service-router summary
RTMP Requests

dns_requests_received

6

Total count of DNS requests received by SR (aggregate value).

show statistics service-router dns
Total DNS queries
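As a rough illustration of how the Table 2 fields might be consumed by a monitoring script, the following Python sketch maps one log entry onto the field names. The on-disk layout of the log is not described in this document, so the sketch assumes one entry per line with whitespace-separated values in the same order as Table 2. The function name parse_sr_log_line and the sample line are hypothetical; the sample values are taken from the Sample Output column.

```python
# Field names from Table 2, in table order. The assumption that the log
# writes them in this order, separated by whitespace, is NOT confirmed
# by this document.
SR_FIELDS = [
    "date", "time", "cpu_avg", "mem_avg", "kmem_avg", "disk_avg",
    "disk_fail_count_threshold", "file_descriptors_count",
    "tcp_server_connections", "tcp_client_connections", "processes_count",
    "dataserver_cpu", "sr_cpu", "sr_mem", "requests_received",
    "http_normal_requests_received", "http_asx_requests_received",
    "rtsp_requests_received", "rtmp_requests_received",
    "dns_requests_received",
]

# Fields that Table 2 describes as Boolean (Y/N) values.
BOOLEAN_FIELDS = {"disk_fail_count_threshold"}
# Fields kept as text rather than converted to integers.
TEXT_FIELDS = {"date", "time"}


def parse_sr_log_line(line):
    """Return a dict of field name -> value for one assumed log line."""
    values = line.split()
    if len(values) != len(SR_FIELDS):
        raise ValueError(
            "expected %d fields, got %d" % (len(SR_FIELDS), len(values))
        )
    record = {}
    for name, raw in zip(SR_FIELDS, values):
        if name in TEXT_FIELDS:
            record[name] = raw
        elif name in BOOLEAN_FIELDS:
            record[name] = (raw.upper() == "Y")
        else:
            record[name] = int(raw)
    return record


# Hypothetical line assembled from the Sample Output column of Table 2.
SAMPLE = "2011-06-30 22:52:02 21 44 11 2 Y 1023 35 24 42 1 12 750000 34 5 5 5 5 6"
```

If the actual log uses a different delimiter or field order, only SR_FIELDS and the split call would need to change.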


Configuring Service Monitor Transaction Logs

Transaction logging for Service Monitor is disabled by default. To enable the Service Monitor transaction logging, enter the following commands:

Device(config)# transaction-logs enable
Device(config)# service-router service-monitor transaction-log enable
 
   

To disable Service Monitor transaction logging, enter the no service-router service-monitor transaction-log enable command.

The show service-router service-monitor command output displays some of the values monitored and logged.

Augmentation Alarms

Service Monitor currently raises an alarm after a user-configured threshold is exceeded. In Release 2.5.11, Service Monitor has been enhanced with augmentation alarms, which are soft alarms that send alerts before the threshold is reached. These alarms are applicable to all devices—Service Engines, Service Routers and CDSMs. Augmentation thresholds apply to device and protocol engine parameters.

Device-Level Augmentation Alarms

A different augmentation alarm is supported for each of the device-level thresholds. Based on the device parameters monitored by Service Monitor, the following minor alarms could be raised:

Alarm 560007 (CpuAugThreshold) Service Monitor CPU augmentation alarm.

Alarm 560008 (MemAugThreshold) Service Monitor memory augmentation alarm.

Alarm 560009 (KmemAugThreshold) Service Monitor kernel memory augmentation alarm.

Alarm 560011 (DiskAugThreshold) Service Monitor disk augmentation alarm.

Alarm 560012 (DiskFailCntAugThreshold) Service Monitor disk failure count augmentation alarm.

Alarm 560010 (NicAugThreshold) Service Monitor NIC augmentation alarm.

Check augmentation threshold, device-level threshold, and average load for the above alarm instance. Add more devices if necessary. A useful command is the show service-router service-monitor command.

The augmentation alarms raised are displayed in the show alarms detail command. The alarms are cleared when the load goes below the augmentation threshold.


Note For system disks (disks that contain SYSTEM partitions), the disk failure augmentation and threshold alarms are raised only when all system disks are bad. The diskfailcnt threshold does not apply to system disks. The threshold only applies to CDNFS disks, which is also the case for the augmentation thresholds. This is because the system disks use RAID1. There is a separate alarm for bad RAID. With the RAID system, if the critical primary disk fails, the other mirrored disk (mirroring only occurs for SYSTEM partitions) seamlessly continues operation. However, if the disk drive that is marked bad is a critical disk drive (by definition this is a disk with a SYSTEM partition), the redundancy of the system disks for this device is affected. For more information on disk error handling and threshold recommendations, see the "SATA Disk Error Handling and Threshold Recommendations" section.

As the show disk details command output reports, if disks have both SYSTEM and CDNFS partitions, they are treated as only system disks, which means they are not included in the accounting of the CDNFS disk calculation.



Note The NIC augmentation alarm is only applicable if the device is a Service Engine.


RTSP Gateway Overload Alarm

Service Monitor checks if the RTSP gateway TPS overload alarm is raised and sets the Windows Media Streaming and Movie Streamer threshold exceeded states in the keepalive message sent to SR. This ensures that the SR does not redirect RTSP requests to the Service Engine when the RTSP gateway TPS is overloaded.

The new RTSP TPS alarm is raised when the RTSP gateway maximum transaction rate, which can be configured through the CLI, is met. Following are the alarm details:

Alarm 512001 (tpsquotaexceed) RTSP request rate has reached service threshold limit.

severity=major

RTSP request rate has reached service threshold limit. Further requests will wait in TCP queue until the service quota is refilled in the next two seconds.

RTSP gateway alarm is checked at two-second intervals. The RTSP gateway TPS value is updated in two-second intervals.
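The quota behavior described above can be pictured with a small Python model. This is only a simplified sketch and not the RTSP gateway implementation: it assumes that the quota for each two-second interval equals the configured maximum rate multiplied by the interval length, and the class name RtspQuotaWindow and its methods are hypothetical.

```python
class RtspQuotaWindow:
    """Toy model of a request quota that is refilled every two seconds."""

    def __init__(self, max_tps, interval_s=2.0):
        self.interval_s = interval_s
        # Assumption: quota per interval = configured rate * interval length.
        self.quota = int(max_tps * interval_s)
        self.remaining = self.quota
        self.window_start = 0.0

    def _refill_if_due(self, now):
        """Start a new interval and restore the quota when one has elapsed."""
        if now - self.window_start >= self.interval_s:
            elapsed = int((now - self.window_start) // self.interval_s)
            self.window_start += elapsed * self.interval_s
            self.remaining = self.quota

    @property
    def exhausted(self):
        """True when the quota for the current interval is used up."""
        return self.remaining == 0

    def admit(self, now):
        """Return True if a request arriving at time `now` (in seconds) is
        served immediately, or False if it would wait for the next refill."""
        self._refill_if_due(now)
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return False
```

In this model, a request that is not admitted corresponds to one that waits in the TCP queue until the next refill.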

RTSP Gateway Augmentation Alarms

The following augmentation alarm is used for the RTSP gateway:

Alarm 511017 (rtspgaugmentexceeded) RTSP gateway TPS has reached augmentation threshold limits.

severity=minor

RTSP gateway TPS has reached augmentation limits on maximum concurrent connections or allowed bandwidth.

No service disruption. Monitor device to see if it exceeds service threshold limits and add more devices if necessary.

Web Engine Augmentation Alarms

The following augmentation alarms are used for Web Engine:

Alarm 9000011 (aug_memory_exceeded) Web Engine augmentation memory threshold exceeded

severity=minor

Web Engine has reached augmentation limits for memory usage.

No service disruption. Monitor device to see if it exceeds service threshold limits and add more devices if necessary.

Alarm 9000012 (aug_session_exceeded) Maximum augmentation concurrent session threshold exceeded

severity=minor

Web Engine service has reached augmentation limits for concurrent sessions.

No service disruption. Monitor device to see if it exceeds Web Engine threshold limits and add more devices if necessary.

Windows Media Streaming Augmentation Alarms

The following augmentation alarm is used for Windows Media Streaming:

Alarm 511014 (wmtaugmentexceeded) Windows Media Streaming has reached augmentation threshold limits.

severity=minor

Windows Media Streaming has reached augmentation limits on maximum concurrent connections or allowed bandwidth.

No service disruption. Monitor device to see if it exceeds Windows Media Streaming threshold limits and add more devices if necessary.

Useful commands are the show wmt and show statistics wmt usage commands.

Movie Streamer Augmentation Alarms

The following augmentation alarm is used for Movie Streamer:

Alarm 511016 (msaugmentexceeded) Movie Streamer has reached augmentation threshold limits.

severity=minor

Movie Streamer has reached augmentation limits on maximum concurrent connections or allowed bandwidth.

No service disruption. Monitor device to see if it exceeds Movie Streamer threshold limits and add more devices if necessary.

Useful commands are the show movie-streamer and the show statistics movie-streamer all commands.

Flash Media Streaming Augmentation Alarms

The following augmentation alarm is used for Flash Media Streaming:

Alarm 511015 (FmsAugThreshold) Flash Media Streaming has reached augmentation threshold limits.

severity=minor

Flash Media Streaming has reached augmentation limits on maximum concurrent connections or allowed bandwidth.

No service disruption. Monitor device to see if it exceeds Flash Media Streaming threshold limits and add more devices if necessary.

Useful commands are the show flash-media-streaming and the show statistics flash-media-streaming connections commands.

Example

Maximum concurrent connections have a default value of 200 and maximum bandwidth has a default value of 200 Mbps. The augmentation alarm is enabled through the Service Monitor and the augmentation threshold is configured at 80 percent (default). The default service threshold for Flash Media Streaming is 90 percent.

In this case, the augmentation alarm is raised for Flash Media Streaming when 0.8 * 0.9 * 200 = 144 connections or 144 Mbps of bandwidth is exceeded. The Service Router still redirects requests to this Service Engine. The alarm is cleared when the traffic falls below either of the thresholds; that is, 144 connections or 144 Mbps in this example.

Configuring the Augmentation Alarms

The Service Monitor Augmentation Alarms are disabled by default. To enable the augmentation alarms, enter the service-router service-monitor augmentation-alarm enable command.

To disable the augmentation alarms, enter the no service-router service-monitor augmentation-alarm enable command.

The augmentation alarms threshold is a percentage that applies to the CPU, memory, kernel memory, disk, disk fail count, NIC, and protocol engine usages. By default, it is set to 80 percent.

As an example of an augmentation alarm, if the threshold configured for CPU usage is 80 percent, and the augmentation threshold is set to 80 percent, then the augmentation alarm for CPU usage is raised when the CPU usage crosses 64 percent.

If "A" represents the Service Monitor threshold configured, and "B" represents the augmentation threshold configured, then the threshold for raising an augmentation alarm = (A * B) / 100 percent.
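The calculation can be sketched in Python as follows. The function names are hypothetical and are used only to work through the arithmetic; the default of 80 percent and the 1-100 range for the augmentation threshold come from this section.

```python
def augmentation_trigger_percent(service_threshold, augmentation_threshold=80):
    """Return (A * B) / 100: the usage percentage at which the
    augmentation alarm is raised.

    service_threshold      -- A, the Service Monitor threshold (percent)
    augmentation_threshold -- B, the augmentation threshold (percent)
    """
    if not 1 <= augmentation_threshold <= 100:
        raise ValueError("augmentation threshold must be in the range 1-100")
    return service_threshold * augmentation_threshold / 100.0


def augmentation_trigger_value(limit, service_threshold, augmentation_threshold=80):
    """Convert the trigger percentage into the units of `limit`
    (for example, connections or Mbps)."""
    percent = augmentation_trigger_percent(service_threshold, augmentation_threshold)
    return limit * percent / 100.0
```

With the CPU example above, augmentation_trigger_percent(80, 80) gives 64 percent. With the Flash Media Streaming example, a limit of 200, a service threshold of 90 percent, and an augmentation threshold of 80 percent give 144.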

The threshold value range is 1-100. The following command sets the augmentation alarms threshold to 70 percent:

Device(config)# service-router service-monitor threshold augmentation 70
 
   

The following command resets the augmentation alarm threshold to the default:

Device(config)# no service-router service-monitor threshold augmentation 70
 
   

The show service-router service-monitor command displays the augmentation alarm threshold configuration.

The show alarms command displays the alarms output.

The show alarms history detail command displays the history details.

The show alarms detail command displays the alarms details.

The show alarms detail support command displays the support information.

System Requirements

The Internet Streamer CDS runs on the CDE100, CDE200, CDE205, and the CDE220 hardware models. Table 3 lists the different device modes for the Cisco Internet Streamer CDS software, and which CDEs support them.

Table 3 Supported CDEs

Device Mode                     CDE100  CDE200  CDE205  CDE220-2G2  CDE220-2S3i
CDSM                            Yes     No      Yes     No          No
SR                              Yes     Yes     Yes     Yes         No
SE                              Yes     Yes     Yes     Yes         Yes
SR—Proximity Engine standalone  No      No      Yes     Yes         No


Release 2.5.11 supports the CDE220-2S3i platform. There are a total of 14 gigabit Ethernet ports in this CDE. The first two ports (1/0 and 2/0) are management ports. The remaining 12 gigabit Ethernet ports can be configured as two port channels. See the Cisco Content Delivery Engine CDE205/220/420 Hardware Installation Guide for set up and installation procedures for the CDE220-2S3i and the Cisco Internet Streamer CDS 2.5 Software Configuration Guide for information on configuring the Multi Port Support feature.

The CDE220-2G2 platform has a total of ten gigabit Ethernet ports. The first two ports (1/0 and 2/0) are management ports. The remaining eight gigabit Ethernet ports can be configured as one port channel. See the Cisco Content Delivery Engine CDE205/220/420 Hardware Installation Guide for set up and installation procedures for the CDE220-2G2.

The CDE100 can run as the CDSM, while the CDE200 can run as the Service Router or the Service Engine. See the Cisco Content Delivery Engine CDE100/200/300/400 Hardware Installation Guide for setup and installation procedures for the CDE100 and CDE200.

The CDE205 can run as the CDSM, Service Router, or Service Engine. See the Cisco Content Delivery Engine CDE205/220/420 Hardware Installation Guide for set up and installation procedures for the CDE205.


Note For performance information, see the release-specific performance bulletin.


Limitations and Restrictions

This release contains the following limitations and restrictions:

There is a 4 KB maximum limit for HTTP request headers. This has been added to prevent client-side attacks, including overflowing buffers in the Web Engine.

Standby interface is not supported for Proximity Engine. Use port channel configuration instead.

No network address translation (NAT) device can separate the CDEs from one another.

Do not run the CDE with the cover off. This disrupts the fan air flow and causes overheating.


Note The CDS does not support network address translation (NAT) configuration, where one or more CDEs are behind the NAT device or firewall. The workaround for this, if your CDS network is behind a firewall, is to configure each internal and external IP address pair with the same IP address.

The CDS does support clients that are behind a NAT device or firewall that have shared external IP addresses. In other words, there could be a firewall between the CDS network and the client device. However, the NAT device or firewall must support RTP/RTSP.


The matchRule element in the Manifest file is only supported for HTTP; you cannot use the matchRule element with FTP.

Important Notes

To maximize the content delivery performance of a CDE200, CDE205, or CDE220, we recommend you do the following:

1. Use port channel for all client-facing traffic.

Configure interfaces on the quad-port gigabit Ethernet cards into a single port-bonding interface. Use this bonding channel, which provides instantaneous failover between ports, for all client-facing traffic. Use interfaces number 1 and 2 (the two on-board Ethernet ports) for intra-CDS traffic, such as management traffic, and configure these two interfaces either as standby or port-channel mode. Refer to the Cisco Internet Streamer CDS 2.4 Software Configuration Guide for detailed instruction.

2. Use the client IP address as the load balancing algorithm.

Assuming ether-channel (also known as port-channel) is used between the upstream router/switch and the SE for streaming real-time data, the ether-channel load balance algorithms on the upstream switch/router and the SE should be configured as "Src-ip" and "Destination IP" respectively. Using this configuration ensures session stickiness and generally balanced load distribution based on clients' IP addresses. Also, distribute your client IP address space across multiple subnets so that the load balancing algorithm is effective in spreading the traffic among multiple ports.


Note The optimal load-balance setting on the switch for traffic between the Content Acquirer and the edge Service Engine is dst-port, which is not available on the 3750, but is available on the Catalyst 6000 series.


3. For high-volume traffic, separate HTTP and WMT.

The CDE200, CDE205, or CDE220 performance has been optimized for HTTP and WMT bulk traffic, individually. While it is entirely workable to have mixed HTTP and WMT traffic flowing through a single CDE200 simultaneously, the aggregate performance may not be as optimal as the case where the two traffic types are separate, especially when the traffic volume is high. So, if you have enough client WMT traffic to saturate a full CDE200, CDE205, or CDE220 capacity, we recommend that you provision a dedicated CDE200 to handle WMT; and likewise for HTTP. In such cases, we do not recommend that you mix the two traffic types on all CDE servers, which could result in suboptimal aggregate performance and require more CDE200, CDE205, or CDE220 servers than usual.

4. For mixed traffic, turn on the HTTP bitrate pacing feature.

If your deployment must have Streamers handle HTTP and WMT traffic simultaneously, it is best that you configure the Streamer to limit each of its HTTP sessions below a certain bitrate (for example, 1Mbps, 5Mbps, or the typical speed of your client population). This prevents HTTP sessions from running at higher throughput than necessary, and disrupting the concurrent WMT streaming sessions on that Streamer. To turn on this pacing feature, use the HTTP bitrate field in the CDSM Delivery Service GUI page.

Please be aware of the side effects of using the following commands for Movie Streamer:

Config# movie-streamer advanced client idle-timeout <30-1800>
Config# movie-streamer advanced client rtp-timeout <30-1800>
 
   

These commands are only intended for performance testing when using certain testing tools that do not have full support of the RTCP receiver report. Setting these timeouts to high values causes inefficient tear down of client connections when the streaming sessions have ended.

For typical deployments, it is preferable to leave these parameters set to their defaults.

5. For ASX requests, when the Service Router redirects the request to an alternate domain or to the origin server, the Service Router does not strip the .asx extension because the .asx extension is part of the original request. If an alternate domain or origin server does not have the requested file, the request fails. To ensure requests for .asx files do not fail, make sure the .asx files are stored on the alternate domain and origin server.

Open Caveats

The open caveats section has the following subsections:

Open Caveats in Release 2.5.11-b26

Open Caveats in Release 2.5.11-b21

Open Caveats in Release 2.5.11-b20

Open Caveats in Release 2.5.11-b19

Open Caveats in Release 2.5.11-b18

Open Caveats in Release 2.5.11-b15

Open Caveats in Release 2.5.11-b13

Open Caveats in Release 2.5.11-b26

There are no new open caveats for 2.5.11-b26.

Open Caveats in Release 2.5.11-b21

There are no new open caveats for 2.5.11-b21.

Open Caveats in Release 2.5.11-b20

There are no new open caveats for 2.5.11-b20.

Open Caveats in Release 2.5.11-b19

There are no new open caveats for 2.5.11-b19.

Open Caveats in Release 2.5.11-b18

There are no new open caveats for 2.5.11-b18.

Open Caveats in Release 2.5.11-b15

There are no new open caveats for 2.5.11-b15.

Open Caveats in Release 2.5.11-b13

This release contains the following open caveats:

Windows Media Streaming

CSCts22407

Symptom:

Windows Media Streaming backend process caused a core dump file to be created during post-processing of a VOD fast-forward or rewind request.

Conditions:

Only happens in VOD pass-through logic when a client is sending a fast-forward or rewind request. Windows Media Streaming front-end process received an end-of-stream (EOS) message during the front-end process or post-process pausing message.

Workaround:

None.

CDSM

CSCtq59730

Symptom:

SE goes offline when enabling Fast SE offline detection.

Conditions:

This issue can be triggered by changing the UDP port on the CDSM GUI page.

Workaround:

Restart the CMS service on the CDSM.

Acquisition and Distribution

CSCto91729

Symptom:

MetaReceiver process core dumps.

Conditions:

Because of a timing issue, resources are accessed unsafely within the MetaReceiver process.

Workaround:

None. However, node-mgr restarts the meta-receiver smoothly after the core dump, so the impact on the service is minimal.

Unified Kernel Streaming Engine (UKSE)

CSCto75362

Symptom:

After Windows Media Streaming live clients stop under stress conditions, the show statistics wmt streamstat command may show a few remaining incoming and outgoing sessions for another 15 to 20 minutes.

Conditions:

This happens for Windows Media Streaming live, with many clients connecting and disconnecting quickly.

Workaround:

None. Low impact; a few stale sessions remain for an extra 15 to 20 minutes.

Web Engine

CSCtn74299

Symptom:

The Web Engine generates a core dump in a particular scenario.

Conditions:

High stress Windows Media Streaming HTTP traffic is running, and Windows Media Streaming threshold is exceeded. This causes the Windows Media Streaming process to not accept the Web Engine HTTP forwarded request, and can cause Web Engine to core dump.

Workaround:

With an SR in the deployment and the Web Engine threshold set appropriately, the service threshold alarm is raised and no more requests reach the SE. In this case, the issue is not seen.

CSCtn70651

Symptom:

The Web Engine crashes and the existing sessions are terminated. The process is restarted immediately and subsequent requests are handled seamlessly.

Conditions:

This occurs when a URL request is 2048 characters or longer and the request is handled by the Web Engine custom log format with both %r (to print the request first line) and %U (to print the URL) in the format string.

Workaround:

Use Apache or Extended-Squid transaction logging formats, or configure custom transaction logging with either %r or %U (including both %r and %U prints redundant information).
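The condition and workaround for this caveat can be expressed as a simple pre-deployment check on a custom format string. The following Python sketch is illustrative only: the function names are hypothetical, and the check is a plain substring test that does not account for any escaping rules the custom log format may have.

```python
# URL length at which the caveat is reported to occur.
URL_LENGTH_LIMIT = 2048


def format_has_both_tokens(format_string):
    """True if the custom format string contains both %r and %U."""
    return "%r" in format_string and "%U" in format_string


def is_exposed(format_string, url):
    """True if the caveat conditions are met: the format string uses both
    %r and %U, and the URL is 2048 characters or longer."""
    return format_has_both_tokens(format_string) and len(url) >= URL_LENGTH_LIMIT
```

A format string that contains only one of the two tokens is never reported as exposed, which matches the workaround.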

CSCtj71423

Symptoms:

Web Engine experiences read time outs from the Authorization Server during an 8-hour, all unique, cache-fill test.

Conditions:

This occurred in a three-tier topology with a Content Acquirer, middle tier, and edge SE all configured on CDE220-2S3 platforms. The transactions per second were around 50 to 60. The testing used all unique cache-fill content with one Spirent client port and 1 Spirent Server port. The file size was set to 500 KB. The test lasted eight hours.

10/23/2010 16:25:31.207(Local)(8159)ERRO:AuthSvrQuery.cpp:30-> Time out occurred with 
authsvr read
10/23/2010 16:25:31.207(Local)(8159)ERRO:HTTPCacheAppCtxt.cpp:1510-> WorkerPid[8454] 
HTTPCacheApp[0xeef02968] :  AppCtxt(0xe86a2158) Auth Server Query Error (-1), 
AuthSvrQuery(0xe869bc08)
10/23/2010 16:25:31.207(Local)(8159)ERRO:HTTPCacheAppCtxt.cpp:1633-> WorkerPid[8454] 
HTTPCacheApp[0xeef02968] :  AppCtxt(0xe86a2158) - Received Error (500) - Complete
 
   

Workaround:

This happens under stress during a long-running longevity test. The current read timeout is two seconds, and the Web Engine treats a timeout as an internal error.

CSCth22448

Symptom:

Zeri VOD playback fails in a particular scenario.

Conditions:

The per-delivery service pacing is set to 1 Mbps and there is a two-tier setup for the SEs.

Workaround:

Increase the pacing to 50 or 100 Mbps.

Cache Router

CSCtr05823

Symptom:

Cache Router dumps core.

Conditions:

This sometimes happens when there is a connection issue with the upstream device.

Workaround:

None. Low impact. Request is still served.

CSCtj25001

Symptom:

The Cache Router goes into core dump during Web Engine small-objects stress testing.

Conditions:

This occurs in a two-tier setup (Client->Edge->Acq->OS) with all unique cache-miss stress, running for about a day. The rate was 200 transactions per second.

Workaround:

Minimal service impact. Self-correcting in seconds.

Proximity Engine

CSCtc20212

Symptom:

The following messages can be seen on a neighbor router when the BGP password is unconfigured on Proximity Engine, after the BGP adjacency has been formed, but corresponding removal is not performed on the router:

*Feb  7 03:32:14.861: %TCP-6-BADAUTH: No MD5 digest from 192.168.82.33(179) to 
192.168.82.2(24018)
*Feb  7 03:34:00.573: %TCP-6-BADAUTH: No MD5 digest from 192.168.82.33(179) to 
192.168.82.2(24018) (RST)
 
   

Conditions:

This issue occurs when adjacency is established with a neighboring router and the password is removed from Proximity Engine configuration and not re-configured within the hold time. Occurred in Release 2.5.3, as well as Release 2.5.9.

Workaround:

When the password is unconfigured on the Proximity Engine side, the two peers cannot communicate with each other. This state is reported on the router side with the following repeated messages:

*Feb  7 03:32:14.861: %TCP-6-BADAUTH: No MD5 digest from 192.168.82.33(179) to 
192.168.82.2(24018)
 
   

This occurs until the TCP connection is closed on Proximity Engine side and enters TIME_WAIT state. While this state lasts, no messages are printed on the router. The router is still retransmitting TCP packets, but the Proximity Engine is ignoring them, as per TIME_WAIT state. After about 60-75 seconds, the following messages start to display on the router:

*Feb  7 03:37:32.937: %TCP-6-BADAUTH: No MD5 digest from 192.168.82.33(179) to 
192.168.82.2(24018) (RST)
 
   

These indicate that the TCP connection has been completely closed on the Proximity Engine side, which therefore no longer has any knowledge of the TCP connection and responds to each retransmitted packet with an RST packet, which does not have an MD5 signature. This situation is described in RFC 2385, section 4.1 (Connectionless Resets). The messages are logged as long as the router retransmits TCP packets of the lost connection, which has been observed to occur for up to ten minutes. This issue does not affect correct operation.

Ucache

CSCti46019

Symptom:

The Ucache process goes into core dump when the memory usage reaches close to 4 GB. No service failure occurs during this period; only a core file is generated on the SE.

Conditions:

The core dump happens when the Ucache process memory usage reaches close to 4 GB. This can happen for the following reasons:

SE has large amounts of content with large URLs.

The clear cache all command was entered when there are many content files.

Occurred in Release 2.5.3, as well as Release 2.5.9.

Workaround:

If the number of content objects in the SE is not large and the URL size is small, then this core dump can be avoided. Maximum cache object count can be set by using the cache content max-obj-count command.

CSCtj76113

Symptom:

Error messages are logged in the Ucache logs when the process is under stress and eviction is in progress.

Conditions:

The error message occurred when the Ucache process was in a stressed environment with eviction in progress. The eviction resulted in sending an RPC to itself during the eviction. When there are many RPC messages coming into the Ucache process, the RPC can time out.

Workaround:

Too many deletion operations are the root cause for this error message. If the maximum object count is small, this issue can be avoided.

Service Router

CSCtf67735

Symptom:

Memory usage of the Service Router increases significantly and leads to a core dump.

Conditions:

We see the memory leak when the following conditions are true:

There is only one SE assigned to the delivery service and the SE becomes unavailable (reloaded or offloaded).

There are many HTTP 1.1 requests, with multiple requests within the same HTTP session.

Workaround:

None.

CSCtj83262

Symptom:

The Service Router sometimes goes into a core dump after uploading a 4 MB Coverage Zone file.

Conditions:

The Coverage Zone file is too large (28,000 entries), with 10 delivery services, and multiple SEs. Occurred in Release 2.5.3, as well as Release 2.5.9.

Workaround:

Reduce one of the following three: the number of Coverage Zone file entries, the number of delivery services, or the number of SEs per location.

MP3 Live Streaming

CSCtk66500

Symptom:

Web Engine goes into core dump on the edge SE or middle SE, when the origin server or the Content Acquirer restarts during playback.

Conditions:

During the MP3-live playback, restart the Web Engine on the Content Acquirer or stop the encoder process on origin server. When the origin server restarts, all the SEs go into core dump. When the Web Engine restarts on the Content Acquirer, the middle SE and edge SE go into core dump.

Workaround:

Web Engine restarts by itself.

Resolved Caveats

The caveats listed in this section have been resolved since Cisco Internet Streamer CDS Release 2.5.11. Not all the resolved issues are mentioned here. The following list highlights the resolved caveats associated with customer deployment scenarios. The resolved caveats section has the following subsections:

Resolved Caveats in Release 2.5.11-b26

Resolved Caveats in Release 2.5.11-b21

Resolved Caveats in Release 2.5.11-b20

Resolved Caveats in Release 2.5.11-b19

Resolved Caveats in Release 2.5.11-b18

Resolved Caveats in Release 2.5.11-b15

Resolved Caveats in Release 2.5.11-b13

Resolved Caveats in Release 2.5.11-b26

Table 4 lists the issues resolved in the Cisco Internet Streamer CDS 2.5.11-b26 release.

Click on the bug ID to view the bug details. This information is displayed in the Bug Toolkit.

Table 4 Resolved Caveats in Cisco Internet Streamer CDS 2.5.11-b26 Release 

Bug ID
Description

CSCud09062

Need to lower the level of syslog message "HTEndloop"

CSCud24328

CDSM GUI hangs when downgrade from 3.1.2-b7 to 2.6.3-b26

CSCuc45864

SR doesn't timeout connections who do not send any requests


Resolved Caveats in Release 2.5.11-b21

The following caveats have been resolved since Cisco Internet Streamer CDS Release 2.5.11-b21:

CDMs

CSCuc67508

Symptom:

rpc_httpd (the Apache process) initiated a core dump when reloading.

Conditions:

When the device is reloading, the rpc_httpd process is exiting.

RTSP

CSCuc32752

Symptom:

Alarm is not clearing despite rtspg logs showing that there is no traffic hitting the box.

Conditions:

These were generated during a change where 11 Internet Streamers were removed from a custom Device Group and assigned to the "BASELINE" Device Group.

Transaction Logging

CSCuc59183

Symptom:

ftp_export stalls. Debugging with GDB shows that it is waiting for a read on the socket.

Conditions:

The sftp-server (SSH) does not respond to ftp_export and does not break the connection.

UNS

CSCuc41245

Symptom:

A script is needed to delete all the fragment files for a URL path.

Usage:

delete_files_url_with_path.sh "<URL path>".

Web Engine

CSCuc46365

Symptom:

Web Engine failed to copy the HCACHE header or perform a new content route lookup for requests with the query string or HEAD only requests, which are bypassed up front.

Conditions:

This happens only when the file is already cached on disk, requires revalidation and new requests are either query string or HEAD requests.

CSCuc49835

Symptom:

Web Engine in the edge SE bypasses CA and reaches OS directly.

Conditions:

1. Edge SE, which is serving the client directly.

2. HTTP Head request.

3. The requested content is a cache hit, has expired, and needs revalidation.

4. There is an existing Datasource for the requested URL but no existing DataSourceFinder for the requested URL.

WMT

CSCub98887

Symptom:

When a customer changes the encoder resolution of the video output, the customer receives a garbled video image and has to restart the live program through the CDSM GUI.

Conditions:

The customer changes the encoder resolution of the video output.

Resolved Caveats in Release 2.5.11-b20

The following caveats have been resolved since Cisco Internet Streamer CDS Release 2.5.11-b20:

Service Routing

CSCub41457

Symptom:

The client is redirected to 302 Last Resort when one SE goes offline, although the SR should select the other SE.

Conditions:

SE becomes offline when the Service-Monitor process hangs.

Authorization Server

CSCub59480

Symptom:

Two IP addresses are configured on an SE. One of the IP addresses maps to a delivery service, but the stream of the delivery service can still be played through the other IP address.

Conditions:

The client configures the other IP address of the SE in the client host file, so the client can play the stream.

SVC Monitor

CSCub41468

Symptom:

When the service_monitor process is hung, the SE state inside the SR flaps. The interval between UDP keepalive packets is increased from 2 seconds to 8 seconds because there are multiple SR devices in the production network and each query adds extra delay.

Conditions:

SE service_monitor process is hung.

Resolved Caveats in Release 2.5.11-b19

The following caveats have been resolved since Cisco Internet Streamer CDS Release 2.5.11-b19:

Service Routing

CSCub41474

Symptom:

Content-based routing is of limited use when URL signing is enabled. Content-based routing generates a hash on the whole URL, including the signed part, so the hash is always unique. As a result, the SR redirects clients to multiple streamers for the same content.

Conditions:

When URL signing feature is used.
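
The effect described in this caveat can be illustrated with a minimal Python sketch. It is not the Service Router implementation: the hash function, the example host name, and the signing parameter names (`sig`, `expires`) are assumptions chosen only for illustration. Hashing the whole URL yields a different routing key for every signed request, while hashing the URL without the signing parameters yields the same key for the same content.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names for the query parameters that URL signing appends.
SIGNING_PARAMS = {"sig", "expires"}


def routing_key(url: str, strip_signing: bool) -> str:
    """Return a hex digest that a router could use to choose a streamer."""
    parts = urlsplit(url)
    if strip_signing:
        # Keep every query parameter except the (assumed) signing ones.
        kept = [
            (name, value)
            for name, value in parse_qsl(parts.query, keep_blank_values=True)
            if name not in SIGNING_PARAMS
        ]
        parts = parts._replace(query=urlencode(kept))
    return hashlib.sha256(urlunsplit(parts).encode("utf-8")).hexdigest()


# Two requests for the same content with different signatures.
first = "http://cdn.example.com/movies/clip.wmv?expires=100&sig=aaaa"
second = "http://cdn.example.com/movies/clip.wmv?expires=200&sig=bbbb"

# Whole-URL hashing: the keys differ, so requests may spread across streamers.
print(routing_key(first, strip_signing=False) == routing_key(second, strip_signing=False))
# Hashing without the signed part: the keys match for the same content.
print(routing_key(first, strip_signing=True) == routing_key(second, strip_signing=True))
```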

Streamer

CSCua92633

Symptom:

Stream-schedule core dump on a customer device.

Conditions:

No corresponding record is found in the database table uni_multicast_info, and a logic error prevents this case from being handled correctly.

Web Engine

CSCub42175

Symptom:

Clients receive a 504 error when streaming WMT.

Conditions:

A race condition in the Web Engine occurs when the OS times out, which may cause all subsequent clients requesting this movie to receive a 504 error.

Web Services

CSCub33092

Symptom:

Based on the investigation over the error logs and the Web Engine code, the root cause is:

1. Some existing data sources in the memory ended up with a bad status because of content downloading being terminated by OS HTTP response body read timeout 504 (because the OS is overloaded).

2. At the same time, many new requests kept hitting the existing data sources.

3. Before using those data sources, there is no status check; therefore, 500 errors were returned to the client.

4. Since new requests kept hitting the data sources, the data sources could not be evicted from the memory.

Conditions:

2.5.11.15 Origin server is under heavy load.

WMT

CSCtx23762

Symptom:

WMT streams are affected.

Conditions:

Incoming bytes do not increment for that stream.

Resolved Caveats in Release 2.5.11-b18

The following caveats have been resolved since Cisco Internet Streamer CDS Release 2.5.11-b18:

CDM

CSCtz54312

Symptom:

Customer needs to set up a cookie that includes "=" and ":".

Conditions:

None.

Proximity

CSCtt22355

Symptom:

BGP process does not start after configuration from the CLI or CDSM.

Conditions:

This happens only if i-node count is very high. The i-node count can be checked by lsof. The utility netstat will show i-node INT_MAX for some processes, which is the root cause.

URL Manager

CSCua77316

Symptom:

Windows Media (http), SE sometimes denies request.

Conditions:

1. URL sign generation is enabled through authsvr rule file.

2. URL sign verification is enabled through CLI rule configuration.

Web Engine

CSCtz14361

Symptom:

Cache control header smaxage is not cached correctly on the SE.

Conditions:

Cache control header smaxage is configured on the Origin Server.
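
For context, the following Python sketch shows how a shared cache would normally derive a freshness lifetime from a Cache-Control header. It assumes that "smaxage" in this caveat refers to the standard HTTP `s-maxage` directive, and it reflects general HTTP caching rules rather than the SE code; the function name is hypothetical.

```python
def shared_cache_lifetime(cache_control: str):
    """Return the freshness lifetime in seconds for a shared cache, or None."""
    directives = {}
    for item in cache_control.split(","):
        name, _, value = item.strip().partition("=")
        directives[name.lower()] = value.strip('"')
    # For a shared cache, s-maxage takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        value = directives.get(name, "")
        if value.isdigit():
            return int(value)
    return None


print(shared_cache_lifetime("max-age=60, s-maxage=600"))  # 600
print(shared_cache_lifetime("max-age=60"))                # 60
print(shared_cache_lifetime("no-store"))                  # None
```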

CSCtz50130

Symptom:

Custom access logs (transaction logs) show a large value in the column that corresponds to "Bytes-Transferred-Excluding-Header" when no data is sent out to the downstream SE.

Conditions:

Occurs only under overloaded conditions, when the SE or CA takes too long to send the data out to the downstream SE. The downstream SE, after waiting for a certain time (default = 5 seconds), disconnects from the upstream SE and sends a 504 Gateway Timeout to the client.

CSCua21224

Symptom:

SE sends HTTP connections directly to Origin Server bypassing CA.

Conditions:

Code upgrade from 2.5.9 to 2.5.11.

CSCua52103

Symptom:

Web Engine core dump.

Conditions:

1. There is traffic ongoing during the Web Engine startup.

2. Thread transfer happens.

Resolved Caveats in Release 2.5.11-b15

The following caveats have been resolved since Cisco Internet Streamer CDS Release 2.5.11-b15:

Movie Streamer

CSCtj90170

Symptom:

Movie Streamer engine core dumps on SE.

Conditions:

During Movie Streamer initialization of a unicast-in live program, the SDP enclosed in the DESCRIBE response from the Origin server does not contain any valid stream-level metadata. This may happen if the Origin server is a Wowza server.

SNMP

CSCty80254

Symptom:

High number of SNMP traps flooding the monitoring station.

Conditions:

Internal configuration change notifications, possibly because of misconfiguration; for example, if a delivery service has an invalid URL, configuration traps are sent in an endless loop.

URL Manager

CSCtz01817

Symptom:

If the input URL is externally signed and passed to the URL Manager, the internal-sign process replaces the question mark (?) with the ampersand (&), which creates a false URL.

Conditions:

The problem occurs when a client request has an externally-signed URL.
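
The intended behavior (adding a signature parameter without altering the existing query delimiter) can be sketched in Python as follows. This is an illustration only, not the URL Manager code; the function name and the parameter names `ext_sig` and `int_sig` are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def append_query(url: str, extra: str) -> str:
    """Append an already-encoded 'name=value' pair, preserving the existing query."""
    parts = urlsplit(url)
    # Use '&' only when a query string is already present; otherwise the
    # new pair becomes the whole query and '?' is added by urlunsplit.
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunsplit(parts._replace(query=query))


# An externally signed URL keeps its '?' and gains the extra parameter after '&'.
print(append_query("http://cdn.example.com/a.wmv?ext_sig=abc", "int_sig=0"))
# A URL without a query string gains a '?' before the new parameter.
print(append_query("http://cdn.example.com/a.wmv", "int_sig=0"))
```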

Web Engine

CSCty37006

Symptom:

When end users change channels, playback starts at the same point within the video (between 3 and 10 minutes in). The Manifest file is not updating as frequently as it should (every second or so).

Conditions:

Occurs when streaming adaptive bit rate (ABR) content.

CSCtq67894

Symptom:

The Web Engine core dumps after range cache-fill is enabled during stress testing with large files, or the SE runs out of connections or memory and errors such as the following are seen:

Web Engine Concurrent sessions exceeds threshold value
Session count (x) reached session threshold (30000) or Memory Usage (x) higher than (3435973836) for FD

Conditions:

The Web Engine crashes in a failover scenario for large-file range requests, or the Web Engine no longer accepts new connections because the connection limit is reached (connections are stuck in the CLOSE_WAIT state).

CSCts99053

Symptom:

The following symptoms were seen for this issue:

All four SRs and one backup CDSM were reported as down on the primary CDSM. Only these five devices were observed to flip between online and offline modes, while the SE states appeared to be okay.

No interruptions to end users were reported, but the CDN monitoring system (CDSM) was reported to be unreliable.

Huge /local/local1/logs/rpc_httpd/ssl_scache.pag file size (around 44 GB) on the primary CDSM.

No core files observed.

Conditions:

The SRs, SEs, and backup CDSM send HTTP and HTTPS messages to the primary CDSM. The messages are handled by the rpc_httpd process on the CDSM. These requests are the HTTP and HTTPS messages that report the health of the various nodes to the CDSM.

Apache (rpc_httpd) uses ssl_scache.pag file to speed up parallel request processing by avoiding unnecessary session handshakes. At every SSLSessionCacheTimeout interval the global/inter-process SSL Session Cache information is timed out, with the httpd process acquiring a lock and traversing the records. Because of the size of the file (approximately 44 GB), this operation is taking an excessively long time, which blocks other processes from reading the file for session information.

Because the Fast SE Offline Detection is enabled, the SEs health is communicated to the CDSM using UDP messages (and not the HTTP/HTTPS mechanism). This corresponds to what was observed, where only the backup CDSM and the SRs were offline, while the SEs were reported to be online.

This issue happens once every six months with more than 50 SEs, 4 SRs, and 1 backup CDSM communicating with the primary CDSM using SSL.

Live Routing

CSCtu08478

Symptom:

Windows Media Streaming live stream request goes directly to the Origin Server from a non-Content Acquirer Service Engine.

Conditions:

Primary Content Acquirers were reloading or down. Some liveness queries to the backup Content Acquirer returned failure for the Windows Media Streaming engine.

CDSM

CSCtw79243

Symptom:

The following two symptoms have been observed:

1. The rea agent stops running after the cms agent crashes.

2. If the rea agent is started and the show rea info command is entered at the same time, the rea agent may fail to start.

Conditions:

For symptom 1, it happens each time the cms agent crashes.

For symptom 2, it is a rare case.

HTTP Stress

CSCtx41490

Symptom:

Custom transaction logs are printed continuously without a new line character between log entries.

Conditions:

The workload is high (stress conditions).

Authorization Server

CSCty31360

Symptom:

Clients received 403 response from SE.

Conditions:

SE failed to reach the Geo-location server when Geo-location API call failed.

Cache Router

CSCty11856

Symptom:

Cache Router does not take into account all the Content Acquirers and SEs in a location for its route calculations. It only picks the first two (in the dynamic hash list of SEs generated for every URL) and sends liveness queries only to those two. If both of them fail to respond or respond with "unusable" state, the Cache Router skips that location and goes upstream (it could be another SE or directly to the Origin server accordingly).

If two Content Acquirers are down or offloaded in a root location, the SEs could be directly contacting the Origin server for some of the URLs which bypasses the other Content Acquirers in the root location.

Conditions:

Only when at least two of the Content Acquirers in the root location are offloaded or down and there are other live Content Acquirers in the root location.

Platform

CSCty72907

Symptom:

Streaming issues.

Conditions:

Content Acquirer fails to send keep-alives to the Origin server.

MP3 Live

CSCtx75279

Symptom:

We do not support Response messages with header "content-type:audio/aacp."

Conditions:

Always.

Resolved Caveats in Release 2.5.11-b13

The following caveats have been resolved since Cisco Internet Streamer CDS Release 2.5.11-b13:

Windows Media Streaming

CSCtx23762

Symptom:

Windows Media Streaming streams are affected.

Conditions:

Incoming bytes do not increment for that stream.

CSCts13162

Symptom:

There are two incoming streams (ingest) for a single outgoing stream when encoder failure happens in some particular scenarios.

Conditions:

After a Windows Media Streaming live program fails over from the primary encoder to the secondary encoder, the primary encoder is recovered and a new request is sent.

CSCts48554

Symptom:

The show stat wmt streamstat command output displays a stale entry; the process for that stream is gone.

Conditions:

Not applicable.

CSCts23924

Symptom:

For all unique cache-miss cases, during cache-fill stage, Windows Media Streaming can only sustain about 200 concurrent users.

Conditions:

All unique cache-miss cases.

CSCts02088

Symptom:

Windows Media Streaming cached HTTP response into media data.

Conditions:

During cache fill, Windows Media Streaming hit some network issue or the connection with the Origin Server was dropped.

CSCtr72666

Symptom:

The cached data for one video contains the content from another video. Windows Media Streaming playback failed.

Conditions:

Under stress, when the ramp-up value is more than 20, Windows Media Streaming could generate the same session ID for two different clients in a short period of time, which causes a cache-fill error.

CSCtr77999

Symptom:

Windows Media Streaming stream statistics show a large value in the duration field.

Conditions:

Player only sends open and close requests, no play request is sent.

CSCtr71734

Symptom:

Line feeds in one of the Windows Media Streaming transaction log fields cause a log entry to split across two entries.

Conditions:

The client is a Windows Media Player.
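
One general way to keep each transaction on a single log line is to neutralize line breaks in client-supplied fields before writing the entry. The Python sketch below illustrates that idea only; it is not the fix applied in the product, and the function names and the space-separated layout are assumptions.

```python
def sanitize_field(value: str) -> str:
    """Replace carriage returns and line feeds so the field stays on one line."""
    return value.replace("\r", " ").replace("\n", " ")


def format_entry(fields) -> str:
    """Join sanitized fields into a single space-separated log line."""
    return " ".join(sanitize_field(field) for field in fields)


# A field containing a line feed no longer splits the entry.
entry = format_entry(["10.0.0.1", "Player\nExtra", "200"])
print(entry)
```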

CSCtr43594

Symptom:

The wmt_ml process enters into a hang state and cannot serve requests anymore.

Conditions:

Bad cached data causes the FE to send stream-end messages to wmt_ml continuously.

CSCtr43586

Symptom:

Some video content has a freezing issue because the Content Acquirer cached a corrupted block.

Conditions:

The Origin server is an Apache HTTP server.

CSCtr44615

Symptom:

Windows Media Streaming backend process core dumped during post-processing of a live session.

Conditions:

This issue only happens if the live source is a server-side playlist.

CSCto92496

Symptom:

After five days of load testing, a core.wmt_be file was found on the device.

Conditions:

It happens for Windows Media Streaming five, when a pause event happens in a live program.

CSCti97945

Symptom:

Some alarms are not cleared, even when the issue no longer exists.

If the encoder is recovered but the live program does not get the source from it, it has no chance to clear the alarm. An example of this scenario follows:

A live program pulls the stream from encoder1; encoder2 is the backup source. When encoder1 fails there is an alarm raised and the live program switches to encoder2. The alarm was noticed and recovery to encoder1 was accomplished. The alarm is not recovered until the SE pulls the stream from encoder1 successfully.

The node health monitor framework allows one module to clear the alarm only if the module is the one who raised the alarm. CDSM only has the rights to view all alarms and clear the cdm alarms.

Conditions:

If the encoder is recovered but the live program does not get the source from it, it has no chance to clear the alarm.

CSCtq96265

Symptoms:

A single HTTP over MMS request triggers two DESCRIBE requests sent to upstream SE or Origin server.

Condition:

The media client is using HTTP 1.0 and disconnects the TCP connection after receiving the DESCRIBE response.

CSCtq37747

Symptom:

Backup Content Acquirers may take some flows to distribute to lower tier for live streams.

Conditions:

Live streams are configured and in Tier 1 there are multiple SEs present. Not all traffic flows through defined Content Acquirer; instead, other SEs in Tier 1 handle the traffic.

CSCtd55023

Symptom:

The wmt_mbe process generated core.

Conditions:

Rare condition.

CSCtq78085

Symptom:

The wmt_mbe process generated core.

Conditions:

While handling Windows Media Streaming traffic.

CSCtq42252

Symptom:

The Windows Media Streaming front-end connection number between the Content Acquirer and the Origin server is much higher than the number of live programs configured. Usually the number is double. The extra connections are not persistent; they are dynamic and refresh approximately every 60 seconds.

Conditions:

Windows Media Streaming live program is set to be primed and wmt_mbe got some error from Origin server (USS) during the RTSP DESCRIBE response process.

CSCtq09675

Symptom:

The wmt_be process generated a core dump file.

Conditions:

The Origin server sent an announce message to the SE.

CSCto41489

Symptom:

The cs-url process logs the original URL for the request, not the SR redirected URL.

Conditions:

When the request is redirected from an SR.

CSCtn93441

Symptom:

The cs-uri-stem limit is 128 bytes, so long string values are truncated.

Conditions:

When cs-uri-stem is longer than 128 characters.

Web Engine

CSCtq87366

Symptom:

The data server uses 100 percent of the CPU.

Conditions:

A file descriptor (fd) leak occurs when unbind fails. Retry logic was added to fix it.

CSCtq10369

Symptom:

The Content-Length header results in httprequestreader to continue reading for head request.

Conditions:

The Content-Length header results in httprequestreader to continue reading for head request.

CSCtq65413

Symptom:

Nessus security scan caused Web Engine to go into a loop.

Request sample:

127.0.0.1 TCP_MISS/504 231 HEAD http://0.0.0.0/ application/octet-stream

Conditions:

This seems to be the result of one of the Nessus tests with a bad HTTP host header of 0.0.0.0.

CSCtq68900

Symptom:

An error condition occurs in which datasourcefinder is not cleaned up.

Conditions:

When a query URL is encoded, standard decoding happens.

Flash Media Streaming

CSCtq38416

Symptom:

Log parsing and analysis failed on Sawmill and other third-party log analysis tools.

Conditions:

Using Sawmill FMS log parsing module.

Authorization Server

CSCtq59885

Symptom:

In the Service Rule XML file, when Rule_Allow is configured with multiple matchGrps, none of the matchGrps match, and another Action is configured following the Allow actions, the request is incorrectly allowed.

Conditions:

With multiple matchGrps in the Rule_Allow, when none of the matchGrps match and there is another action following the Allow action, the request is incorrectly allowed.

CSCto80450

Symptom:

Changing the primary or secondary Geo-location servers configuration does not take effect immediately.

Conditions:

Change primary or secondary Geo-location IP address on the SE configuration.

Network

CSCtg91790

Symptom:

The issue happens only on the CAT6K. If the fiber box is connected to a CAT6K fiber module and Gige 10/0 is shut down, the switch side always shows the interface as up, but the SE side shows the interface as down and the network is not reachable. If Gige 9/0 is shut down, there is no issue: both the switch and SE sides show the interface as down and the port channel continues to work. It is not known why Gige 10/0 behaves differently; it is on a PCI-X NIC, but Gige 9/0 is on the same NIC.

Conditions:

Release 2.5.3.b15 and using CAT6K.

CSCto32852

Symptom:

The error condition triggers a core dump and causes the dataserver to become out of sync.

Conditions:

The condition happens when the interface portchannel 1 bandwidth 100 command and the interface portchannel 1 bandwidth 1000 command are entered, and then the SE is rebooted.

HTTP Core

CSCtq76014

Symptom:

Dataserver bind fails in Web Engine even after 10 retries with 3-second retry intervals.

Conditions:

Dataserver bind fails in Web Engine even after 10 retries with 3-second retry intervals.

CSCtn50177

Symptom:

HTTP connection is closed after the content gets served, even if there is "Connection: Keep-Alive" in HTTP/1.0 request.

Conditions:

HTTP/1.0 and "Connection: Keep-Alive" exists

Large content gets served successfully
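
The expected behavior can be sketched from general HTTP persistence rules, not from the Web Engine code. In the Python sketch below, the function name is hypothetical: an HTTP/1.0 connection stays open only when the client sends "Connection: Keep-Alive", while an HTTP/1.1 connection stays open unless "Connection: close" is sent.

```python
def is_persistent(http_version: str, connection_header: str = "") -> bool:
    """Decide whether the connection should stay open after the response."""
    tokens = {
        token.strip().lower()
        for token in connection_header.split(",")
        if token.strip()
    }
    if "close" in tokens:
        return False
    if http_version == "HTTP/1.1":
        # Persistent by default in HTTP/1.1.
        return True
    # HTTP/1.0 is persistent only on explicit request.
    return http_version == "HTTP/1.0" and "keep-alive" in tokens


print(is_persistent("HTTP/1.0", "Keep-Alive"))  # True
print(is_persistent("HTTP/1.0"))                # False
print(is_persistent("HTTP/1.1", "close"))       # False
```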

HTTP ABR

CSCtq22406

Symptom:

Apple HLS streaming chunks take a very long time (minutes) to be served. The show statistics web-engine detail command shows a very large number of outstanding CAL updates.

Conditions:

Because of the backlog of disk operations, a disk file creation takes a long time, then because of the backlog of the file creations, it takes a very long time to serve any streaming chunk.

URL Manager

CSCto84838

Symptom:

URL signature validation fails because the client browser (that is, the Android browser) cannot handle a colon (:).

Conditions:

When a signed URL contains a colon (:), the Android browser cannot handle it.

CDSM

CSCsq35576

Symptom:

After the standby CDSM is switched to primary, running the show cms info command on the SE or SR still shows the original primary CDSM's IP address as the "Current CDSM Address."

Conditions:

Switching from primary to standby CDSM.

CSCtn73041

Symptom:

CDSM graphs are not showing correct values.

Conditions:

After CDSM role is changed.

CSCtl19932

Symptom:

Core generated in MetaDataReceive.

Conditions:

When the table is dropped and the Acquisition and Distribution component wants to access the particular table. It is mostly triggered by entering the cms recover identity command.

CSCtn30873

Symptom:

Java core dump found on the CDSM.

Conditions:

Logging into the CDSM and password not provided.

Stream Scheduler

CSCto43865

Symptom:

The show programs command reports the wrong live program status.

The show programs command reports "Failed to start program (UNS resolve fails)" or "Failed to start program (WMT API failed to start program)."

Conditions:

When the live program is working correctly.

Transaction Logs

CSCts70947

Symptom:

Transaction logs are not rolling properly on cds-is devices.

Conditions:

When the transaction log file sequence number rolls over.

CSCto55259

Symptom:

For transaction logs, the rotation does not work for compressed files.

Conditions:

When configuring the transaction-logs export compression.

UNS

CSCto93894

Symptom:

When UNS has inconsistent entries, an alarm is raised. The complete alarm name, unsinconsistententries, is not displayed in the output of the show alarms command because the alarm name is too long.

Conditions:

Functionality works fine. This is just a cosmetic issue where complete alarm name string is not displayed.

CSCto99601

Symptom:

The alarm "unsinconsistentetries" has been raised. It can be seen in the output of show alarms command.

Conditions:

This alarm is generally raised when an internal UNS journal file used by UNS process gets corrupted when UNS starts. It can also be caused when there is an inconsistency between total content count between two internal processes: UNS and Ucache.

CSCtf37689

Symptom:

The UNS server goes into core dump after a device reload (after running Flash Media Streaming mixed traffic).

Conditions:

When running Flash Media Streaming mixed performance testing (70-20-10: 70 percent all unique, 20 percent single unique, 10 percent cache-miss) traffic. Occurred in Release 2.5.3, as well as Release 2.5.9.

Geo-Location Server

CSCtn58091

Symptom:

A core.service_router file is generated on the Service Router.

Conditions:

A number of errors ("QUOVA_RETURN_FAILURE") are returned from the Quova server.

Service Router

CSCts69578

Symptom:

No external symptom. The service monitor process makes an RPC call to the Service Router process even when the Service Monitor transaction log is not enabled (service-router service-monitor transaction-log enable).

Conditions:

Always.

CSCtr44615

Symptom:

Some of the HTTP requests sent to the Service Router take a long time to get a response.

Conditions:

Number of requests sent to the Service Router is high.

CSCtq45818

Symptom:

The Service Router is wrongly matching network prefixes x.y.0.0/24 when not expected, treating them as if they are x.y.0.0/16.

Conditions:

This was observed when offloading a Service Engine associated with prefix x.y.z.0/24 in the Coverage Zone File. The Service Router started to match the entry for prefix x.y.0.0/24 after that.
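
The difference between the two prefix lengths can be checked with Python's standard `ipaddress` module. The sketch below is illustrative only: the 10.20.x.x addresses stand in for x.y.z.w, and `best_match` is a hypothetical helper showing longest-prefix selection, not the Service Router's Coverage Zone lookup.

```python
import ipaddress


def matches(client_ip: str, prefix: str) -> bool:
    """Return True if the client address falls inside the prefix."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(prefix)


def best_match(client_ip: str, prefixes):
    """Return the longest prefix that contains the client, or None."""
    address = ipaddress.ip_address(client_ip)
    networks = [ipaddress.ip_network(prefix) for prefix in prefixes]
    hits = [network for network in networks if address in network]
    return str(max(hits, key=lambda network: network.prefixlen)) if hits else None


# A client in x.y.z.0/24 must not match x.y.0.0/24 ...
print(matches("10.20.30.5", "10.20.0.0/24"))  # False
# ... although it would match if the entry were treated as x.y.0.0/16.
print(matches("10.20.30.5", "10.20.0.0/16"))  # True
```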

CSCto34713

Symptom:

The SE shows as active in the show service-router services command output and the show service-router routes command output when the Service Engine is not sending keepalive messages to the Service Router. No SE keepalive alarms are seen for this down Service Engine on the Service Router.

Conditions:

All the streaming interfaces and the primary interface on the Service Engine are shut down. The keepalive interval is changed.

CSCtl93373

Symptom:

The fmsedge process sometimes core dumps on the SR when a stress test using RTMPT is running.

Conditions:

When a very high or uncontrolled ramp-up rate is used in the load tool and more RTMPT connections are sent than the configured maximum, the extra connections are rejected. As a result, the tool tries to send more connections, leading to more connections being rejected. As this happens, the memory usage of the fmsedge process keeps increasing, and it core dumps at 4 GB. Occurred in Release 2.5.3, as well as Release 2.5.9.

Service Monitor

CSCtw59404

Symptom:

Device is shown as offline in the CDSM. The show service-router service-monitor command takes a lot of time and does not print all values.

Conditions:

Rare scenario.

CSCtj81042

Symptom:

Service Monitor goes into core dump.

Conditions:

During stress test. Occurred in Release 2.5.3, as well as Release 2.5.9.

Unified Kernel Streaming Engine (UKSE)

CSCtg58494

Symptom:

There were TCP retransmits when we were overrunning the link.

Conditions:

The network link must be overrun.

Ucache

CSCto88599

Symptom:

Core file seen for process ucache-svr when device is serving RTSP VOD traffic.

Conditions:

Core file seen on setup. Issue happens because of corruption of metadata of one of the asset files copied on the device. The error condition to handle this incorrect metadata was not correct.

CSCtq18571

Symptom:

A core file is generated by Ucache process in a longevity test.

Conditions:

This happens in a very rare condition when the internal alarm infrastructure sometimes does not return the alarm information for an alarm. The alarm infrastructure may be busy or unresponsive on that occasion.

Platform

CSCtq48239

Symptom:

Seeing a rootfs alarm on a CDS device.

Conditions:

Having, for example, a misconfigured DNS server which causes error messages to fill the rootfs file system, resulting in potential instability.

Live Routing

CSCtq30703

Symptom:

A live routing debug message reports an attempt to connect to an invalid IP address.

Conditions:

Normal running. The SE itself must be the live routing candidate; that is, the "se_id 0 ip 0" line shows in error log before seeing this symptom.

CSCtq33079

Symptom:

Under a high load of Windows Media Streaming live transactions, sometimes requests are rejected because no response is received by Windows Media Streaming within 30 seconds.

Conditions:

This happens when Windows Media Streaming cannot get responses from the live stream router (LSR) module, which gets a hierarchical path of Service Engines in route to the Origin server. This is because the LSR has very high connect and read timeouts when it tries to get information from upstream SEs.

CLI

CSCtn86529

Symptom:

Transaction logs are exported twice.

Conditions:

Two (S)FTP export servers are configured and the device is upgraded.

RTSP Gateway

CSCtq96723

Symptom:

No RTSP quota exceeded logging in error level log.

Conditions:

Occurs when error log level is set.

CSCto95606

Symptom:

Client request from User-Agent Lavf is not sent to Windows Media Streaming.

Conditions:

Normal function.

Movie Streamer

CSCtr80545

Symptom:

The video quality is poor when playing content using RTSP from the Movie Streamer.

Data Server

CSCtq78801

Symptom:

A core file is generated by the Web Engine.

Conditions:

In stress, there are many liveness queries simultaneously from other SEs.

Accessing Bug Toolkit

This section explains how to use the Bug Toolkit to search for a specific bug or to search for all bugs in a release.


Step 1 Go to http://tools.cisco.com/Support/BugToolKit.

Step 2 At the Log In screen, enter your registered Cisco.com username and password; then, click Log In. The Bug Toolkit page opens.


Note If you do not have a Cisco.com username and password, you can register for them at http://tools.cisco.com/RPF/register/register.do.


Step 3 To search for a specific bug, click the Search Bugs tab, enter the bug ID in the Search for Bug ID field, and click Go.

Step 4 To search for bugs in the current release, click the Search Bugs tab and specify the following criteria:

Select Product Category—Video.

Select Products—Cisco Content Delivery Engine Series.

Software Version—[2.5].

Search for Keyword(s)—Separate search phrases with boolean expressions (AND, NOT, OR) to search within the bug title and details.

Advanced Options—You can either perform a search using the default search criteria or define custom criteria for an advanced search. To customize the advanced search, click Use custom settings for severity, status, and others and specify the following information:

Severity—Choose the severity level.

Status—Choose Terminated, Open, or Fixed.

Choose Terminated to view terminated bugs. To filter terminated bugs, uncheck the Terminated check box and select the appropriate suboption (Closed, Junked, or Unreproducible) that appears below the Terminated check box. Select multiple options as required.

Choose Open to view all open bugs. To filter the open bugs, uncheck the Open check box and select the appropriate suboptions that appear below the Open check box. For example, if you want to view only new bugs, choose only New.

Choose Fixed to view fixed bugs. To filter fixed bugs, uncheck the Fixed check box and select the appropriate suboption (Resolved or Verified) that appears below the Fixed check box.

Advanced—Check the Show only bugs containing bug details check box to view only those bugs that contain detailed information, such as symptoms and workarounds.

Modified Date—Choose this option to filter bugs based on the date when the bugs were last modified.

Results Displayed Per Page—Specify the number of bugs to display per page.

Step 5 Click Search. The Bug Toolkit displays the list of bugs based on the specified search criteria.

Step 6 To export the results to a spreadsheet:

a. In the Search Bugs tab, click Export All to Spreadsheet.

b. Specify the filename and location at which to save the spreadsheet.

c. Click Save. All bugs retrieved by the search are exported.

If you cannot export the spreadsheet, log into the Technical Support website at http://www.cisco.com/cisco/web/support/index.html or contact the Cisco Technical Assistance Center (TAC).

Upgrading to Release 2.5.11

The only supported upgrade paths are Release 2.5.x to Release 2.5.11. If you are running a release prior to Release 2.5.x, you must upgrade to at least Release 2.5.x before upgrading to Release 2.5.11.


Note Before upgrading from Release 2.5.3 to Release 2.5.11, enter the clear cache all command.

Content cached in the Release 2.5.3 Web Engine, if requested in Release 2.5.11, results in duplicate entries in the Ucache process. Duplicate entries were found in the output of the show content and show cache commands, but the disk maintains only a single copy of the content.


After the upgrade procedure starts, do not make any configuration changes until all the devices have been upgraded.


Note Release 2.5.11 only supports one IGP (IS-IS or OSPF) for the Proximity Engine. When upgrading to Release 2.5.11 from Release 2.5.1 or Release 2.5.3, if both IGPs (IS-IS and OSPF) were configured for the Proximity Engine, then one of the configurations must be removed.



Note The new Web Engine in Release 2.5.11 cannot be removed during downgrade to Release 2.5.3 because this configuration is still valid in Release 2.5.3 (the new Web Engine was supported as an EFT feature in Release 2.5.3). Therefore, both CLI commands are present after downgrading.

If user roles are defined in Release 2.5.11, and the system is then downgraded to Release 2.5.3, then the following menu options will not be accessible to the user with defined roles:

Devices > Service Engines > Service Control > ICAP

Devices > Service Engines > Service Control > ICAP Services

Devices > Service Engines > Service Control > PCMM QoS Policy

Devices > Service Engines > Application Control > Web > HTTP > HTTP Connections

Devices > Service Engines > Application Control > Web > HTTP > HTTP Caching

Devices > Service Engines > Application Control > Web > HTTP > Advanced HTTP Caching

Devices > Device Group > Service Control > ICAP

Devices > Device Group > Service Control > ICAP Services

Devices > Device Group > Service Control > PCMM QoS Policy

Devices > Device Group > Application Control > Web > HTTP > HTTP Connections

Devices > Device Group > Application Control > Web > HTTP > HTTP Caching

Devices > Device Group > Application Control > Web > HTTP > Advanced HTTP Caching

Services > Service Definition > Delivery Service > PCMM Config

If any defined user with a defined role requires access to the above menu options, then the menu options must be added by choosing System > AAA > Roles and enabling the services for those menu options.


Source Policy Routes

Release 2.5.7 supported multiple IP addresses on the CDE220-2S3i, which included specifying the default gateway and IP routes. These IP routes, called source policy routes, were added to ensure that incoming traffic would go out the same interface it came in on. A source policy route was added using the interface keyword, which was introduced in Release 2.5.7 and has the following syntax:

ip route <dest_IP_addr> <dest_netmask> <default_gateway> interface <source_IP_addr>
 
   

In the following example, all destination traffic (IP address of 0.0.0.0 and netmask of 0.0.0.0) sent from the source interface, 8.1.0.2, uses the default gateway, 8.1.0.1. This is a default policy route.

ip route 0.0.0.0 0.0.0.0 8.1.0.1 interface 8.1.0.2
 
   

A non-default policy route defines a specific destination (IP address and netmask). The following ip route command is an example of a non-default policy route:

ip route 10.1.1.0 255.255.255.0 <gateway> interface <source_IP_addr>
 
   

When upgrading to Release 2.5.11, any source policy routes configured using the Release 2.5.7 interface keyword are rejected and are not displayed when the show running-config command is used. However, because you had to define the default gateway for all the interfaces as part of the multi-port support feature, the equivalent source policy route is automatically generated in the routing table.

The following example shows the output for the show ip route command after upgrading to Release 2.5.11 with the default source policy routes highlighted in bold and the non-default policy routes highlighted in italics:

# show ip route
 
   
Destination      Gateway          Netmask
---------------- ---------------- ----------------
172.22.28.0      8.1.0.1         255.255.255.128
6.21.1.0         0.0.0.0         255.255.255.0
8.2.1.0          0.0.0.0         255.255.255.0
8.2.2.0          0.0.0.0         255.255.255.0
171.70.77.0      8.1.0.1         255.255.255.0
8.1.0.0          0.0.0.0         255.255.0.0
0.0.0.0          8.1.0.1         0.0.0.0
0.0.0.0          8.2.1.1         0.0.0.0
0.0.0.0          8.2.2.1         0.0.0.0
 
   
Source policy routing table for interface 8.1.0.0/16
172.22.28.0      8.1.0.1         255.255.255.128
171.70.77.0      8.1.0.1         255.255.255.0
8.1.0.0          0.0.0.0         255.255.0.0
0.0.0.0          8.1.0.1         0.0.0.0 
 
   
Source policy routing table for interface 8.2.1.0/24
8.2.1.0          0.0.0.0         255.255.255.0
0.0.0.0          8.2.1.1         0.0.0.0
 
   
Source policy routing table for interface 8.2.2.0/24
8.2.2.0          0.0.0.0         255.255.255.0
0.0.0.0          8.2.2.1         0.0.0.0
 
   

If you have a default source policy route where the gateway is not defined as a default gateway, then you must add it after upgrading to Release 2.5.11. For example, if you had a source policy route with a gateway of 6.23.1.1 for a source interface of 6.23.1.12, and you did not specify the gateway as one of the default gateways, you would need to add it.

If you have a non-default source policy route, then you must add it as a regular static route (without the obsoleted interface keyword) after upgrading to Release 2.5.11. This route is then added to the main routing table as well as the policy routing table.
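The migration rules above can be sketched in Python as a small helper that classifies a Release 2.5.7 source policy route and reports what must be done after the upgrade. This is a minimal sketch, not a Cisco tool: the names PolicyRoute, parse_policy_route, and migration_action are hypothetical, and the static route it prints is simply the Release 2.5.7 syntax shown earlier with the interface keyword and source address removed. For a default policy route whose gateway is missing, it only reports that the gateway must be added, because the exact command for defining a default gateway is not given in this section.

```python
import ipaddress
from typing import NamedTuple, Optional, Set


class PolicyRoute(NamedTuple):
    dest: str
    netmask: str
    gateway: str
    source: str


def parse_policy_route(command: str) -> Optional[PolicyRoute]:
    """Parse 'ip route <dest> <mask> <gateway> interface <source>'.

    Returns None when the command does not use the interface keyword.
    """
    parts = command.split()
    if len(parts) != 7 or parts[:2] != ["ip", "route"] or parts[5] != "interface":
        return None
    dest, netmask, gateway, source = parts[2], parts[3], parts[4], parts[6]
    for addr in (dest, netmask, gateway, source):
        ipaddress.IPv4Address(addr)  # raises ValueError on a malformed address
    return PolicyRoute(dest, netmask, gateway, source)


def migration_action(route: PolicyRoute, default_gateways: Set[str]) -> str:
    """Describe the step needed for this route after upgrading to Release 2.5.11."""
    is_default = route.dest == "0.0.0.0" and route.netmask == "0.0.0.0"
    if is_default:
        if route.gateway in default_gateways:
            # The equivalent policy route is generated automatically.
            return "none: generated automatically from default gateway " + route.gateway
        return "add " + route.gateway + " as a default gateway"
    # Non-default policy route: re-enter it as a regular static route.
    return f"ip route {route.dest} {route.netmask} {route.gateway}"


if __name__ == "__main__":
    gateways = {"8.1.0.1", "8.2.1.1", "8.2.2.1"}
    for cmd in (
        "ip route 0.0.0.0 0.0.0.0 8.1.0.1 interface 8.1.0.2",
        "ip route 0.0.0.0 0.0.0.0 6.23.1.1 interface 6.23.1.12",
        # The gateway 10.1.1.1 below is a made-up value for illustration.
        "ip route 10.1.1.0 255.255.255.0 10.1.1.1 interface 8.1.0.2",
    ):
        print(cmd, "->", migration_action(parse_policy_route(cmd), gateways))
```

Always confirm the resulting routes with the show ip route command after the upgrade.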

URL Public Key Signing

Table 5 describes the compatibility and results when using a prior CDS software release to perform URL signing and the current software release to perform URL validation.

Table 5 Release Compatibility of URL Signing and URL Validation 

Release Used for URL Signing  Release Used for URL Validation  Results
----------------------------  -------------------------------  -------
2.3.x                         2.4.3, 2.4.5, or 2.5.x           Not supported because the Release 2.3.x URL signing uses the port and schema for signing, but the Release 2.5.11 URL validation removes the port.
2.4.3                         2.5.11                           Supported for all URL signing versions, except version 3 (CSCtb99898).
2.4.5                         2.5.11                           Supported for all URL signing versions, except version 3 (CSCtb99898).
2.5.1 or 2.5.3                2.5.11                           Supported for all URL signing versions.
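
As a quick reference, the rows of Table 5 can be expressed as a lookup keyed by the release that signed the URL. This is a minimal Python sketch that assumes validation is performed by Release 2.5.11; the function name url_signing_support and the returned strings are hypothetical, and any release not listed in Table 5 is reported as not listed rather than guessed.

```python
def url_signing_support(signing_release: str) -> str:
    """Summarize the Table 5 result for a URL signed by the given release.

    Assumes the URL is validated by Release 2.5.11.
    """
    if signing_release.startswith("2.3."):
        # Release 2.3.x signs with the port and schema; validation removes the port.
        return "not supported"
    if signing_release in ("2.4.3", "2.4.5"):
        return "supported except URL signing version 3 (CSCtb99898)"
    if signing_release in ("2.5.1", "2.5.3"):
        return "supported for all URL signing versions"
    return "not listed in Table 5"


if __name__ == "__main__":
    for release in ("2.3.9", "2.4.3", "2.5.3", "2.5.7"):
        print(release, "->", url_signing_support(release))
```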


SATA Disk Error Handling and Threshold Recommendations

This section addresses concerns about a recent increase in the frequency of SATA disk failures observed in customer production networks, most of which occurred following a software upgrade from Release 2.5.3 to Release 2.5.9.

Configuration Recommendations

We recommend the following configuration settings for disk error handling:

(config)# disk error-handling threshold 100
(config)# no disk error-handling reload
(config)# service-router service-monitor threshold failcntdisk 4
 
   

The disk error-handling threshold command determines how many disk errors can be detected before the disk drive is automatically marked as bad. The range is 0-100, and the default is 10 disk-related read/write errors. A setting of 0 means the disk is never marked bad, but disk failure alarms are triggered frequently.

The default setting for the disk error-handling reload command is disabled.

The service-router service-monitor threshold failcntdisk command configures the disk failure count threshold value with a range of 1-15.

Changing the disk error-handling threshold setting to 100 helps avoid marking a good disk as bad and prematurely offloading an SE because it reached the failed-disk count threshold. If you change this threshold to zero (0), the disk is never marked bad, but a disk failure alarm is raised every time a disk error occurs. Setting the threshold to 100 is also beneficial because it lets you know which drive has had errors, which could affect the end-user experience.

The service-router service-monitor threshold failcntdisk command sets the limit for how many disks with a CDNFS partition can fail or be marked bad before the Service Router no longer sends requests to the Service Engine. We recommend setting this threshold value to four; this means a third of the drives would have to fail before the device is considered not able to handle incoming requests or sessions efficiently. However, a device should never have four drives that are bad at any one point in time.

Root Cause

The system default setting for the disk error-handling threshold command is 10.

Starting with Release 2.5.9 of the Cisco Internet Streamer CDS software, a new alarm type, "badsector," was introduced to report specific bad sector errors. Release 2.5.3 did not have this alarm nor did it detect these badsector failures. In Release 2.5.9, after 10 sector-related I/O errors occur, a drive is marked as bad.

The following tasks are performed when a drive is marked bad:

Raise a disk_failure alarm (this alarm also exists in Release 2.5.3).

For Release 2.5.3, the disk_failure alarm is raised for any sector-related error.

For Release 2.5.9, the disk_failure alarm is only raised after 10 sector-related errors occur.

Forcibly unmount the drive.


Note In Release 2.5.9, additional retry attempts are made to unmount the drive. This was done in order to make the drive unmount logic more robust, especially during I/O streaming activity.


Intentionally invalidate the Master Boot Record (MBR) of the drive, thereby destroying any cached content. This is a new feature in Release 2.5.9 and was added to eliminate the possibility of reusing potentially corrupt cached content.

Release 2.5.9 introduces several new disk-related alarms, which might give the false impression after upgrading the software that the disk subsystem is not healthy. In reality, the software is merely reporting more accurate (finer grained) failures through the use of additional alarm types.


Note If a disk is marked bad, the show disk detail command output displays "disk01: Not used (*)" and the drive is not used after a reload.


In Release 2.5.9, the disk error-handling counter is incremented when the following error occurs:

end_request: I/O error
 
   

In Release 2.5.3, the disk error-handling counter is incremented when the following error occurs:

Buffer I/O error
                

The Disk Error Handling feature allows you to set the disk error-handling threshold and how to handle disk errors if the threshold is reached. If the automatic reload feature is enabled (the disk error-handling reload command), and the disk drive gets marked as bad because the disk error-handling threshold (read/write) was reached, the device is automatically reloaded. Following the device reload, a syslog message and an SNMP trap are generated. If the disk drive that is marked bad is a critical disk drive (by definition this is a disk with a SYSTEM partition), the redundancy of the system disks for this device is affected.

The disk error-handling reload is a legacy command that was used when RAID 1 was not implemented. Because RAID 1 is now being used, we do not want the device to be reloaded, because the software state may be lost upon reload. With the RAID system, if the critical primary disk fails, the other mirrored disk seamlessly continues operation.

A disk is marked bad when the number of read/write errors reaches the threshold setting of the disk error-handling threshold command, which is 10 by default. As an example, if there is one bad sector on a disk that gets read 10 times, the disk is marked bad. As another example, if there are 10 bad sectors that each get read once, the disk is marked bad.
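
The two examples above follow from the fact that the counter tracks read/write errors, not distinct bad sectors. The following Python sketch models that behavior under the rules stated in this section (range 0-100, default 10, and 0 meaning the disk is never marked bad). The function name error_count_when_marked_bad is hypothetical, and the sketch is only a model of the counting rule, not of the CDS software itself.

```python
from typing import Iterable, Optional


def error_count_when_marked_bad(error_sectors: Iterable[int],
                                threshold: int = 10) -> Optional[int]:
    """Return the number of I/O errors seen when the disk is marked bad.

    error_sectors lists the sector hit by each read/write error, in order.
    Returns None if the disk is never marked bad for this error sequence.
    """
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be in the range 0-100")
    if threshold == 0:
        return None  # a threshold of 0 means the disk is never marked bad
    for count, _sector in enumerate(error_sectors, start=1):
        # Every error increments the counter, even a repeat on the same sector.
        if count >= threshold:
            return count
    return None


if __name__ == "__main__":
    print(error_count_when_marked_bad([42] * 10))        # one bad sector read 10 times
    print(error_count_when_marked_bad(range(10)))        # 10 bad sectors read once each
    print(error_count_when_marked_bad([42] * 10, 100))   # recommended threshold of 100
```

With the default threshold, both example sequences mark the disk bad on the tenth error; with the recommended threshold of 100, the same ten errors do not.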

However, one bad sector does not mean a drive is bad. Typically, a drive is bad and needs to be replaced when the values in the show disk SMART-info detail command output exceed the values described in Table 6.

Table 6 Output Values of show disk SMART-info detail Command Indicating Disk Replacement

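Field                             CDNFS Drives (Threshold Raw Values)  SYSTEM Drives (Threshold Raw Values)
--------------------------------  -----------------------------------  ------------------------------------
Reallocated_Sector_Ct raw_value   20                                   8
Current_Pending_Sector raw_value  15                                   7-10
Offline_Uncorrectable raw_value   15                                   7-10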


A drive needs to be replaced if any of the RAW_VALUEs listed in Table 6 are exceeded. The values indicating drive replacement for SYSTEM drives (disk00 and disk01) are lower than CDNFS drives because of the critical nature of system drives as compared to data drives.

The show disk SMART-info command (without the detail keyword) provides information on the overall health of each drive. The following example of the show disk SMART-info command output shows that disk08 is bad:

# show disk SMART-info
 
   
      ... etc ...
 
   
=== disk08 ===
smartctl 5.40 2010-10-16 r3189 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
 
   
=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda ES.2
Device Model:     ST3500320NS
Serial Number:    9QM92HZ0
Firmware Version: SN05
User Capacity:    500,107,862,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Tue Jul 19 04:42:16 2011 PDT
 
   
==> WARNING: There are known problems with these drives,
see the following Seagate web pages:
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207931
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207963
 
   
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
 
   
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
Failed Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   025   025   036    Pre-fail  Always   FAILING_NOW 1548
 
   

The show disk SMART-info command should be repeated for each drive. If the overall-health assessment of a drive indicates "FAILED," then the drive should be replaced. The output of the show disk SMART-info command also shows the SMART attributes that indicate drive failure (in the above example, the Reallocated_Sector_Ct attribute indicates FAILING_NOW).
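
When many drives must be checked, the overall-health result can be extracted from saved command output. The following Python sketch assumes the output layout shown in the example above: a "=== diskNN ===" header per drive followed by the smartctl "SMART overall-health self-assessment test result" line. The function names overall_health and disks_to_replace are hypothetical, and the PASSED line in the sample is the standard smartctl wording for a healthy drive rather than output taken from this document.

```python
import re
from typing import Dict, List

DISK_HEADER = re.compile(r"^=== (disk\d+) ===$")
HEALTH_LINE = re.compile(
    r"^SMART overall-health self-assessment test result: (\S+)")


def overall_health(output: str) -> Dict[str, str]:
    """Map each disk name to its overall-health result (for example, FAILED)."""
    results: Dict[str, str] = {}
    current = None
    for raw_line in output.splitlines():
        line = raw_line.strip()
        header = DISK_HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        health = HEALTH_LINE.match(line)
        if health and current is not None:
            # smartctl prints "FAILED!" with a trailing exclamation mark.
            results[current] = health.group(1).rstrip("!")
    return results


def disks_to_replace(output: str) -> List[str]:
    """Return the disks whose overall-health assessment is FAILED."""
    return sorted(d for d, r in overall_health(output).items() if r == "FAILED")


if __name__ == "__main__":
    sample = """=== disk00 ===
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

=== disk08 ===
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
"""
    print(disks_to_replace(sample))
```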

Additionally, the CDSM GUI and the show alarms command on the SEs display the sector alarms. If sector alarms have occurred, enter the show disk SMART-info detail command on the SE to determine the state of the drive and whether the drive needs to be replaced or repaired.

Following is an example of the show alarms command output in Release 2.5.9:

Minor Alarms:
-------------
     Alarm ID             Module/Submodule     Instance
     -------------------- -------------------- -------------------------
   1 badsector            sysmon               disk01
   2 badsector            sysmon               disk08
 
   

If the show disk SMART-info detail command output values for Current_Pending_Sector and Offline_Uncorrectable are below the thresholds described in Table 6, run the disk repair command. After the disk repair command completes, we recommend that you reboot the SE to ensure that all CDS software services are functioning correctly. If the output values for Current_Pending_Sector and Offline_Uncorrectable are above the thresholds described in Table 6, replace the disk.


Note In Release 2.5.9, there is a disk repair command similar to the repair-disk utility. The repair-disk utility provides progress indicators and displays a log of repaired sectors; it also provides more robust sector error detection, repair, and validation. Both the repair-disk utility and the disk repair command take approximately three hours to complete per disk.


Table 7 provides an example of the last part of the output of the show disk SMART-info detail command. The attributes that need to be reviewed to determine if the drive needs to be replaced or repaired are highlighted in bold. A drive needs to be replaced if any of the RAW_VALUEs listed in Table 6 are exceeded. In this example, because the Reallocated_Sector_Ct value is greater than 20, this drive should be replaced.

Table 7 RMA Case—Replace Drive Example 

ID#  ATTRIBUTE_NAME           FLAG    VALUE  WORST  THRESH  TYPE      UPDATED  RAW_VALUE
  1  Raw_Read_Error_Rate      0x000f  072    063    044     Pre-fail  Always   59861501
  3  Spin_Up_Time             0x0003  099    099    000     Pre-fail  Always   0
  4  Start_Stop_Count         0x0032  100    100    020     Old_age   Always   12
  5  Reallocated_Sector_Ct    0x0033  099    099    036     Pre-fail  Always   25
  7  Seek_Error_Rate          0x000f  072    060    030     Pre-fail  Always   17169006
  9  Power_On_Hours           0x0032  090    090    000     Old_age   Always   9010
 10  Spin_Retry_Count         0x0013  100    100    097     Pre-fail  Always   0
 12  Power_Cycle_Count        0x0032  100    037    020     Old_age   Always   12
184  Unknown_Attribute        0x0032  100    100    099     Old_age   Always   0
187  Reported_Uncorrect       0x0032  093    093    000     Old_age   Always   7
188  Unknown_Attribute        0x0032  100    100    000     Old_age   Always   0
189  High_Fly_Writes          0x003a  100    100    000     Old_age   Always   0
190  Airflow_Temperature_Cel  0x0022  071    069    045     Old_age   Always   29 (Lifetime Min/Max 28/29)
194  Temperature_Celsius      0x0022  029    040    000     Old_age   Always   29 (0 22 0 0)
195  Hardware_ECC_Recovered   0x001a  052    011    000     Old_age   Always   59861501
197  Current_Pending_Sector   0x0012  100    100    000     Old_age   Always   1
198  Offline_Uncorrectable    0x0010  100    100    000     Old_age   Offline  1
199  UDMA_CRC_Error_Count     0x003e  200    200    000     Old_age   Always   0


Table 8 provides an example of the last part of the output of the show disk SMART-info detail command. The attributes that need to be reviewed to determine if the drive needs to be replaced or repaired are highlighted in bold. In this example, Current_Pending_Sector and Offline_Uncorrectable each have a value greater than one and no threshold in Table 6 is exceeded, so running the repair-disk utility resolves the issue and the drive does not need to be replaced.

Table 8 Disk Repair Case—Repair Example

ID#  ATTRIBUTE_NAME           FLAG    VALUE  WORST  THRESH  TYPE      UPDATED  RAW_VALUE
  1  Raw_Read_Error_Rate      0x000f  072    063    044     Pre-fail  Always   59861501
  3  Spin_Up_Time             0x0003  099    099    000     Pre-fail  Always   0
  4  Start_Stop_Count         0x0032  100    100    020     Old_age   Always   12
  5  Reallocated_Sector_Ct    0x0033  099    099    036     Pre-fail  Always   5
  7  Seek_Error_Rate          0x000f  072    060    030     Pre-fail  Always   17169006
  9  Power_On_Hours           0x0032  090    090    000     Old_age   Always   9010
 10  Spin_Retry_Count         0x0013  100    100    097     Pre-fail  Always   0
 12  Power_Cycle_Count        0x0032  100    037    020     Old_age   Always   12
184  Unknown_Attribute        0x0032  100    100    099     Old_age   Always   0
187  Reported_Uncorrect       0x0032  093    093    000     Old_age   Always   0
188  Unknown_Attribute        0x0032  100    100    000     Old_age   Always   0
189  High_Fly_Writes          0x003a  100    100    000     Old_age   Always   0
190  Airflow_Temperature_Cel  0x0022  071    069    045     Old_age   Always   29 (Lifetime Min/Max 28/29)
194  Temperature_Celsius      0x0022  029    040    000     Old_age   Always   29 (0 22 0 0)
195  Hardware_ECC_Recovered   0x001a  052    011    000     Old_age   Always   59861501
197  Current_Pending_Sector   0x0012  100    100    000     Old_age   Always   3
198  Offline_Uncorrectable    0x0010  100    100    000     Old_age   Offline  3
199  UDMA_CRC_Error_Count     0x003e  200    200    000     Old_age   Always   0



Note We recommend always repairing the disk if either Current_Pending_Sector or Offline_Uncorrectable is greater than one. This is because sector errors tend to be spatially and temporally (in time) adjacent to each other on a drive.

The show disk SMART-info detail command only reports sector errors that have been detected; there may be more sectors in error adjacent to the reported bad sector. Repairing the drive also proactively repairs unreported sector errors. However, because repairing a drive is a time-consuming process, it may be easier to just replace the drive if a spare drive is available.
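
The replace-or-repair decision described by Table 6 and the note above can be sketched in Python as follows. This is one reading of the guidance, not a definitive rule: the function name recommend_action and the returned strings are hypothetical, "exceeded" is taken to mean strictly greater than the Table 6 value, and the 7-10 range given for SYSTEM drives is conservatively treated as a threshold of 7, which is an assumption.

```python
from typing import Dict

# Raw-value thresholds from Table 6. For SYSTEM drives the table gives a
# range of 7-10; the lower bound of 7 is used here as an assumption.
REPLACE_THRESHOLDS = {
    "CDNFS": {
        "Reallocated_Sector_Ct": 20,
        "Current_Pending_Sector": 15,
        "Offline_Uncorrectable": 15,
    },
    "SYSTEM": {
        "Reallocated_Sector_Ct": 8,
        "Current_Pending_Sector": 7,
        "Offline_Uncorrectable": 7,
    },
}


def recommend_action(drive_type: str, raw_values: Dict[str, int]) -> str:
    """Return 'replace', 'repair', or 'no action' for one drive.

    drive_type is 'CDNFS' or 'SYSTEM'; raw_values maps SMART attribute
    names to the RAW_VALUE column of show disk SMART-info detail.
    """
    thresholds = REPLACE_THRESHOLDS[drive_type]
    # Replace the drive if any Table 6 raw value is exceeded.
    if any(raw_values.get(name, 0) > limit for name, limit in thresholds.items()):
        return "replace"
    # Otherwise repair if either sector counter is greater than one.
    if (raw_values.get("Current_Pending_Sector", 0) > 1
            or raw_values.get("Offline_Uncorrectable", 0) > 1):
        return "repair"
    return "no action"


if __name__ == "__main__":
    table7 = {"Reallocated_Sector_Ct": 25, "Current_Pending_Sector": 1,
              "Offline_Uncorrectable": 1}
    table8 = {"Reallocated_Sector_Ct": 5, "Current_Pending_Sector": 3,
              "Offline_Uncorrectable": 3}
    print(recommend_action("CDNFS", table7))  # replace
    print(recommend_action("CDNFS", table8))  # repair
```

Applied to the raw values in Table 7 and Table 8, the sketch returns "replace" and "repair" respectively, matching the conclusions stated for those examples.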


Table 9 provides detailed descriptions of the attribute names that could indicate disk problems.

Table 9 Attribute Names Descriptions—Disk Problem Indicators 

ID: 5

Attribute Name: Reallocated Sectors Count

Description: Count of reallocated sectors. When the hard drive finds a read/write/verification error, it marks that sector as "reallocated" and transfers data to a special reserved area (spare area). This process is also known as remapping, and reallocated sectors are called remaps. The raw value normally represents a count of the bad sectors that have been found and remapped; thus, the higher the attribute value, the more sectors the drive has had to reallocate. This allows a drive with bad sectors to continue operation; however, a drive that has had any reallocations at all is significantly more likely to fail in the near future. While primarily used as a metric of the life-expectancy of the drive, this number also affects performance. As the count of reallocated sectors increases, the read/write speed tends to worsen because the drive head is forced to seek to the reserved area whenever a remap is accessed. A workaround, which preserves drive speed at the expense of capacity, is to create a disk partition over the region that contains remaps and instruct the operating system to not use that partition.

If the drive can repair the sector without remapping it, then the Reallocated Sectors Count is not incremented. If the drive must remap the sector, the Reallocated Sectors Count is incremented.

ID: 197

Attribute Name: Current Pending Sector Count

Description: Count of "unstable" sectors (waiting to be remapped, because of read errors). If an unstable sector is subsequently read successfully, this value is decreased and the sector is not remapped. Read errors on a sector do not cause a remap of the sector, because the sector might be readable later. Instead, the drive firmware remembers that the sector needs to be remapped, and remaps it the next time it is written.

Running the repair-disk utility resolves these counts.

ID: 198

Attribute Name: Uncorrectable Sector Count, Offline Uncorrectable, or Off-Line Scan Uncorrectable Sector Count

Description: The total count of uncorrectable errors when reading/writing a sector. A rise in the value of this attribute indicates defects of the disk surface, problems in the mechanical subsystem, or both.

Running the repair-disk utility resolves these counts.


Documentation Updates

The following documents have been added for this release:

Release Notes for Cisco Internet Streamer CDS 2.5.11

Related Documentation

Refer to the following documents for additional information about the Cisco Internet Streamer CDS 2.5:

Cisco Internet Streamer CDS 2.5 Software Configuration Guide

http://www.cisco.com/en/US/docs/video/cds/cda/is/2_5/configuration_guide/is_cds25-cfguide.html

Cisco Internet Streamer CDS 2.4-2.5 Quick Start Guide

http://www.cisco.com/en/US/docs/video/cds/cda/is/2_4/quick_guide/ISCDSQuickStart.html

Cisco Internet Streamer CDS 2.4-2.5 API Guide

http://www.cisco.com/en/US/docs/video/cds/cda/is/2_4/developer_guide/is_cds_24_apiguide.html

Cisco Internet Streamer CDS 2.5 Command Reference Guide

http://www.cisco.com/en/US/docs/video/cds/cda/is/2_5/command_reference/Command_Ref.html

Cisco Internet Streamer CDS 2.5 Alarms and Error Messages Guide

http://www.cisco.com/en/US/docs/video/cds/cda/is/2_5/message_guide/Messages.html

Cisco Content Delivery System 2.x Documentation Roadmap

http://www.cisco.com/en/US/docs/video/cds/overview/CDS_Roadmap.html

Cisco Content Delivery Engine 205/220/420 Hardware Installation Guide

http://www.cisco.com/en/US/docs/video/cds/cde/cde205_220_420/installation/guide/cde205_220_420_hig.html

Cisco Content Delivery Engine 100/200/300/400 Hardware Installation Guide

http://www.cisco.com/en/US/docs/video/cds/cde/installation/guide/CDE_Install_Book.html

Regulatory Compliance and Safety Information for Cisco Content Delivery Engines

http://www.cisco.com/en/US/docs/video/cds/cde/regulatory/compliance/CDE_RCSI.html

Open Source Used in CDS IS 2.5.11

http://www.cisco.com/en/US/docs/video/cds/cda/is/2_5/third_party/open_source/Open_Source_Used_in_CDS_IS_2.5.11.pdf

The entire CDS software documentation suite is available on Cisco.com at:

http://www.cisco.com/en/US/products/ps7127/tsd_products_support_series_home.html

The entire CDS hardware documentation suite is available on Cisco.com at:

http://www.cisco.com/en/US/products/ps7126/tsd_products_support_series_home.html

Obtaining Documentation and Submitting a Service Request

For information on obtaining documentation, submitting a service request, and gathering additional information, see the monthly What's New in Cisco Product Documentation, which also lists all new and revised Cisco technical documentation, at:

http://www.cisco.com/en/US/docs/general/whatsnew/whatsnew.html

Subscribe to the What's New in Cisco Product Documentation as a Really Simple Syndication (RSS) feed and set content to be delivered directly to your desktop using a reader application. The RSS feeds are a free service and Cisco currently supports RSS version 2.0.

This document is to be used in conjunction with the documents listed in the "Related Documentation" section.