Table Of Contents
Using the Cisco RPMS Diagnostic Tools
Overview: Using the Diagnostic Tools
Overview: Diagnostics Report Fields
Configuration Parameters of Diagnostics Subsystem
Customer-Related Diagnostics Reports
Customer Counts
Calculating the Customer Limit Deficit
Customer Rejects
Customer Call Durations
Gateway-Related Diagnostics Reports
UGCallDuration
UGAccountingDelay
UGRejects
UGTerminations
UGRetries
System-Related Diagnostics Information
PolicyUsage
Message Rate
System Activity
SystemLatency
SystemStats
SystemUGRetries
CallDurations
MissingAcctStop
CallFailureRates
Threshold Report
Provisioning Changes
Overview: Diagnosing End User Issues
Oversubscribed Policies
Using the Cisco RPMS Diagnostics Tool to Monitor Customer Activity
Low Call Durations
Using the Cisco RPMS Diagnostics Tool to Troubleshoot Low Call Durations
Overview: Identifying Policy Enforcement Issues
Incorrect Active Call Count
Using the Cisco RPMS Diagnostics Tool to Determine the Active Call Count
Overview: Identifying Universal Gateway Issues
Using the Cisco RPMS Diagnostics Tool to Determine Universal Gateway Issues
Overview: Tracking Provisioning Changes
Overview: Viewing the Cisco RPMS History
Using the Cisco RPMS Diagnostics Tool to View the System History
Overview: Analyzing Traffic Patterns with Cisco RPMS
Collecting System Statistics
Analyzing System-Wide Traffic
Overview: Analyzing Customer-Specific Traffic
Call Count Over Time
Call Duration Distribution Over Time
Using the Cisco RPMS Diagnostic Tools
Topics in this chapter include:
• Overview: Using the Diagnostic Tools
• Overview: Diagnostics Report Fields
• Overview: Diagnosing End User Issues
  – Oversubscribed Policies
  – Low Call Durations
• Overview: Identifying Policy Enforcement Issues
  – Incorrect Active Call Count
• Overview: Identifying Universal Gateway Issues
• Overview: Tracking Provisioning Changes
• Overview: Viewing the Cisco RPMS History
• Overview: Analyzing Traffic Patterns with Cisco RPMS
  – Collecting System Statistics
  – Analyzing System-Wide Traffic
• Overview: Analyzing Customer-Specific Traffic
  – Call Count Over Time
  – Call Duration Distribution Over Time
Overview: Using the Diagnostic Tools
Cisco RPMS Release 2.0.1 introduces new diagnostic features to help monitor the system's performance. The diagnostic tools have two functions: to gather information so that you can monitor the Cisco RPMS' performance, and to help diagnose system issues caused by network elements that interact with the Cisco RPMS. Both functions help track the Cisco RPMS' performance to ensure it runs smoothly, and should problems occur, help you to diagnose and repair the issues.
The features have two components:
• A set of reports generated as log files
• A command-line interface (CLI) that uses the crpms_diag command to access the reports
By using the diagnostic tools, you can accomplish the following:
• Troubleshoot issues that adversely affect end users, such as receiving busy signals or low call duration
• Monitor other issues, such as policy enforcement or UG hardware and software problems
• Track provisioning changes so that you can clearly identify changes to the Cisco RPMS configuration
• Identify issues over a specific period of time
• Analyze traffic patterns
For more information on using the diagnostic tools, refer to the Cisco Resource Policy Management System Release 2.0.1 Solutions Guide.
Overview: Diagnostics Report Fields
This section defines the fields for all the diagnostics reports generated by Cisco RPMS. The diagnostics reports are classified into the following categories:
• Customer-Related Diagnostics Reports
• Gateway-Related Diagnostics Reports
• System-Related Diagnostics Reports
• Provisioning Changes
The diagnostics feature is enabled by default.
Configuration Parameters of Diagnostics Subsystem
The following example shows the configuration parameters for the Diagnostics Subsystem of Cisco RPMS 2.0.1. The configuration parameters are located in the <$RPMSBASE>/config/rpms.conf file.
The sampling intervals are in milliseconds.
Example 7-1 Configuration parameters for the Diagnostics Subsystem
BaseDir= Directory used to store diagnostics file.
CustCountsPolicyUsageDiagSampleTime=120000
CustRejectsDiagSampleTime=300000
CustCallDurationDiagSampleTime=1200000
UGCallDurationDiagSampleTime=1200000
UGAcctDelayDiagSampleTime=300000
UGRejectsDiagSampleTime=300000
UGTerminationsDiagSampleTime=1200000
UGRetriesDiagSampleTime=1200000
UGCallDurationSysDiagSampleTime=1200000
UGCallFailureSysDiagSampleTime=300000
UGMissingAcctStopsDiagSampleTime=3600000
UGRetriesSysDiagSampleTime=1200000
SystemStatsDiagSampleTime=1800000
SystemActivityDiagSampleTime=300000
SystemLatencyDiagSampleTime=300000
SystemMessageRateDiagSampleTime=300000
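Because the sampling intervals are given in milliseconds, it can help to convert them to minutes when reading the configuration. The following Python sketch does this for a few lines copied from Example 7-1. The helper function and its name are assumptions made for this illustration and are not part of Cisco RPMS; the actual layout of rpms.conf may also differ from the flat key=value form assumed here.

```python
# Hypothetical helper for illustration only; not part of Cisco RPMS.
# Parameter names and values are copied from Example 7-1.
SAMPLE_LINES = """\
CustCountsPolicyUsageDiagSampleTime=120000
CustRejectsDiagSampleTime=300000
CustCallDurationDiagSampleTime=1200000
UGMissingAcctStopsDiagSampleTime=3600000
"""


def parse_sample_times(text):
    """Return {parameter name: sampling interval in minutes}."""
    intervals = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        name, value = name.strip(), value.strip()
        # Keep only well-formed "<Name>SampleTime=<milliseconds>" lines.
        if sep and name.endswith("SampleTime") and value.isdigit():
            intervals[name] = int(value) / 60000  # milliseconds -> minutes
    return intervals


intervals = parse_sample_times(SAMPLE_LINES)
for name, minutes in intervals.items():
    print(f"{name}: {minutes:g} min")
```

With the values shown, customer counts are sampled every 2 minutes and missing accounting stops every 60 minutes.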
Customer-Related Diagnostics Reports
Cisco RPMS logs customer-related diagnostics information in three different report files:
• Customer Counts
• Customer Rejects
• Customer Call Durations
The reports are generated for each customer configured in Cisco RPMS under a specific directory for that customer. The reports reside in the following directory structure:
$RPMSBASE/diagnostics/<Date>/customer/<CustomerName>/<ReportType>_<TimeStamp>.csv
• Report Type is the name of the report.
• Timestamp is the timestamp for the particular day.
The data in the files is logged at periodic, configurable intervals (set in the Diagnostics section of rpms.conf). All data is incremental (that is, the change in value since the last sample time).
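As a rough sketch of how the directory structure can be used, the following Python function builds a file-name pattern that matches all files of one report type for one customer on one day. The function name, the base directory, the customer name, and the report type string "CustomerCounts" are all assumptions chosen for illustration; the date format follows the sample paths shown elsewhere in this chapter (for example, 2002-Jul-28).

```python
from pathlib import PurePosixPath


def customer_report_glob(rpms_base, date, customer, report_type):
    """Build a glob pattern for one customer's report files on one day.

    Mirrors the documented layout:
    $RPMSBASE/diagnostics/<Date>/customer/<CustomerName>/<ReportType>_<TimeStamp>.csv
    """
    directory = PurePosixPath(rpms_base) / "diagnostics" / date / "customer" / customer
    # The timestamp part is left as a wildcard because its exact value is
    # only known once the file has been written.
    return str(directory / f"{report_type}_*.csv")


# All argument values below are hypothetical examples.
pattern = customer_report_glob(
    "/export/home/crpms", "2002-Jul-28", "customer1", "CustomerCounts"
)
print(pattern)
```

The resulting pattern can be passed to a tool such as Python's glob module on the Cisco RPMS host to list the matching files.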
The following sections describe the report fields.
Customer Counts
The report logs the active call count information of each customer during the sampling interval.
Table 7-1 Customer Count Fields
Field | Definition
Timestamp | Time at which the data was sampled.
Session Limit/Base Limit | Number of guaranteed concurrent sessions allowed for a customer.
Oversubscription Limit | Maximum number of guaranteed sessions allowed after the Session/Base limit is reached for a customer.
Shared Overflow Limit | Maximum number of shared overflow sessions allowed after the base and Oversubscription limits have been reached for a customer.
Current Count | Number of calls currently active for this customer.
Current Limit Deficit | The value that the limit should currently be raised by in order to accept all calls.
Calculating the Customer Limit Deficit
The active call count for an SLA is the total number of active calls that have been admitted by meeting SLA criteria. When the active call count reaches an SLA limit, subsequent call attempts are rejected until one of the active calls disconnects; a new call can then enter the system, which returns the count to the limit.
A "virtual" call count is identical to the active call count until a limit is reached. When a call request is rejected because of an SLA active call limit, the active call count does not change, but the virtual call count is incremented by 1. Essentially, the virtual call count tracks the count as if there were no limits. The difference between the virtual and active call counts is called the "limit deficit."
Periods of limit restriction can last for hours, yet average dial hold times last about 20 minutes. Because of this, the virtual call count also uses a virtual call duration, so that a fictitious additional call count expires after a statistical amount of time. This prevents rejected call counts from inflating forecast virtual calls.
A simple queue-based approach is used to calculate the limit deficit. The queue tracks the call duration of the "virtual" calls. Figure 7-1 illustrates the queue-based algorithm.
Figure 7-1 The queue-based algorithm
Figure 7-1 shows the following:
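• Ts = Sampling interval
• To = Start time of the sampling timer
• Ti = Time at the ith sample, i = 1, 2, 3, ...
• n = Queue size = Virtual call duration / Ts
• Ri = Number of rejected (virtual) calls in the ith sample, i = 1, 2, 3, ...
The limit deficit at any time is the sum of all the elements in the queue. Because the queue holds at most the n most recent samples, the limit deficit at time Ti is:
Limit Deficit(Ti) = R(i-n+1) + ... + R(i-1) + Ri, omitting any term whose index is less than 1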
To better understand this algorithm, use the following example.
Example 7-2 The Customer Limit Deficit Algorithm
Virtual Call Duration = 20 minutes
Sampling Interval, Ts = 2 minutes
Start Time of Sampling Timer, To = 04:00:00
From the values in Example 7-2, n = 20 / 2 = 10. Because n is the queue size, the virtual call duration and Ts must be chosen so that n is a whole number greater than or equal to 1.
Figures 7-2 through 7-4 show the queue contents at various points in time.
Figure 7-2 Queue Contents After 6 Samples
Figure 7-3 Queue Contents After 10 Samples
Figure 7-4 Queue Contents After 11 Samples
Table 7-2 shows the limit deficit for various samples.
Table 7-2 Limit Deficit Table
Sample | Time | Limit Deficit
1 | 04:02:00 | 2
2 | 04:04:00 | 7 (2+5)
3 | 04:06:00 | 7 (2+5+0)
4 | 04:08:00 | 8 (2+5+0+1)
5 | 04:10:00 | 15 (2+5+0+1+7)
6 | 04:12:00 | 26 (2+5+0+1+7+11)
7 | 04:14:00 | 50 (2+5+0+1+7+11+24)
8 | 04:16:00 | 62 (2+5+0+1+7+11+24+12)
9 | 04:18:00 | 66 (2+5+0+1+7+11+24+12+4)
10 | 04:20:00 | 71 (2+5+0+1+7+11+24+12+4+5)
11 | 04:22:00 | 75 (5+0+1+7+11+24+12+4+5+6)
... | ... | ...
Until the 10th sample, the limit deficit is simply the sum of the queue elements. By the 11th sample, however, the queue is full, which indicates that the "virtual" calls at the head of the queue have reached the virtual call duration of 20 minutes.
This is why the queue size is calculated by dividing the virtual call duration by Ts. The first element (2) is therefore popped before the new sample value of 6 (the number of rejections between 04:20:00 and 04:22:00) is pushed into the queue.
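The queue-based calculation can be sketched in a few lines of Python. This is a minimal illustration of the algorithm as described in this section, not the Cisco RPMS implementation; the function and variable names are assumptions. It uses a fixed-length queue so that the oldest sample is dropped automatically once n samples are held, and it reproduces the values in Table 7-2.

```python
from collections import deque


def limit_deficits(rejects_per_sample, virtual_call_duration_min, sampling_interval_min):
    """Return the limit deficit after each sample.

    rejects_per_sample holds R1, R2, ...: the number of rejected
    ("virtual") calls counted in each sampling interval.
    """
    # n = queue size = virtual call duration / Ts (assumed to divide evenly)
    queue_size = virtual_call_duration_min // sampling_interval_min
    # With maxlen set, appending to a full deque pops the oldest element.
    queue = deque(maxlen=queue_size)
    deficits = []
    for rejected in rejects_per_sample:
        queue.append(rejected)
        deficits.append(sum(queue))  # limit deficit = sum of queue elements
    return deficits


# Rejections per sample taken from Table 7-2 (samples 1 through 11).
rejects = [2, 5, 0, 1, 7, 11, 24, 12, 4, 5, 6]
deficits = limit_deficits(rejects, virtual_call_duration_min=20, sampling_interval_min=2)
print(deficits)
```

At the 11th sample the first element (2) leaves the queue as 6 enters, so the deficit moves from 71 to 75 rather than to 77.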
Customer Rejects
The report logs rejection details of each customer during the specified sampling interval.
The report has only Limit Rejections and VPDN Rejections.
Table 7-3 Customer Rejects Fields
Field | Definition
Timestamp | Time at which the data was sampled.
Attempted Call-Accepts/Attempted Calls | Number of attempted calls since last sample.
Limit Rejections | Number of limit rejections since the last sample.
Limit Rejection % | (Limit Rejections/Attempted Calls) x 100.
VPDN Attempted Calls | Number of attempted VPDN sessions since the last sample.
VPDN Rejections | Number of VPDN rejections since the last sample.
VPDN Rejection % | (VPDN Rejections/VPDN Attempted Calls) x 100.
Customer Call Durations
This report logs the duration of calls for each customer during the sampling interval.
Table 7-4 Customer Call Duration Fields
Field | Definition
Timestamp | Time at which the data was sampled.
Closed Calls | Number of calls closed since the last sample.
%(a-b] min | Number of calls closed between "a" (exclusive) and "b" (inclusive) minutes expressed as a percentage of total Closed Calls.
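The %(a-b] columns describe a distribution over half-open duration ranges: a call lasting exactly b minutes is counted in the (a-b] column, while a call lasting exactly a minutes is not. The following Python sketch shows one way such percentages could be computed. The bucket boundaries (0, 1, 5, and 20 minutes), the sample durations, and the function name are assumptions for illustration; the boundaries that Cisco RPMS actually uses are not stated in this section.

```python
from bisect import bisect_left


def duration_percentages(durations_min, edges):
    """Percentage of closed calls falling in each (a, b] bucket.

    edges is an ascending list of boundaries; bucket k covers
    (edges[k], edges[k + 1]]. Durations outside all buckets are
    still counted in the total number of closed calls.
    """
    counts = [0] * (len(edges) - 1)
    for duration in durations_min:
        # bisect_left returns the first index whose edge is >= duration,
        # so the bucket with a < duration <= b is one position earlier.
        k = bisect_left(edges, duration) - 1
        if 0 <= k < len(counts):
            counts[k] += 1
    total = len(durations_min)
    if total == 0:
        return [0.0] * len(counts)
    return [100 * count / total for count in counts]


# Hypothetical durations (minutes) of eight closed calls.
percentages = duration_percentages([0.5, 1, 3, 5, 12, 20, 2, 4], [0, 1, 5, 20])
print(percentages)
```

Here the calls lasting exactly 1, 5, and 20 minutes fall in the buckets that end at those values, which is the "b inclusive" behavior described in the table.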
Gateway-Related Diagnostics Reports
Cisco RPMS logs gateway-related diagnostics information in the following report files:
• UGCallDuration
• UGAccountingDelay
• UGRejects
• UGTerminations
• UGRetries
Cisco RPMS generates these five reports for each gateway configured in the Cisco RPMS. The reports are located in the following directory structure:
$RPMSBASE/diagnostics/<Date>/gateway/<UGIP>/<ReportType>_<TimeStamp>.csv
• Report Type is the name of the report.
• Timestamp is the timestamp for the particular day.
The data in the files is logged at periodic intervals that are configurable in the [Diagnostics] section of rpms.conf. All data is incremental (that is, the change in value since the last sample time).
UGCallDuration
This report logs the duration of the calls for each UG during the sampling interval.
Table 7-5 UG Call Duration Fields
Field | Definition
Timestamp | Time at which the data was sampled.
Closed Calls | Number of calls closed since the last sample.
%[a-b] min | Percentage of closed calls with call duration between a (exclusive) and b (inclusive) minutes.
UGAccountingDelay
The report logs information regarding delayed accounting messages that Cisco RPMS received during the sampling interval.
Table 7-6 UG Accounting Delay Fields
Field | Definition
Timestamp | Time at which the data was sampled.
Accounting Starts | Number of resource allocated messages since the last sample.
%[a-b] sec | Percentage of resource allocated messages with a delay between a (exclusive) and b (inclusive) seconds. The delay for a resource allocated message is the time expired since the corresponding pre-auth message.
UGRejects
The report logs the rejections per UG during the sampling interval.
Table 7-7 UG Reject Fields
Field | Definition
Timestamp | Time at which the data was sampled.
Attempted Calls | Number of attempted calls since the last sample.
Limit Rejections | Number of limit rejections since the last sample.
Limit Rejection % | (Limit Rejections/Attempted Calls) x 100.
VPDN Attempted Calls | Number of attempted VPDN sessions since the last sample.
VPDN Rejections | Number of VPDN rejections since the last sample.
VPDN Rejection % | (VPDN Rejections/VPDN Attempted Calls) x 100.
UGTerminations
The report logs information regarding the UG Terminations during the sampling interval.
Table 7-8 UG Termination Fields
Field | Definition
Timestamp | Time at which the data was sampled.
Attempted Calls | Number of attempted calls since the last sample.
Terminations | Number of calls terminated since the last sample.
Termination % | (Terminations/Attempted Calls) x 100.
<Terminate Cause> Terminations | Number of terminations due to a terminate cause listed below. There is one such column for every terminate cause listed below.
<Terminate Cause> Termination % | (Terminate Cause Terminations/Attempted Calls) x 100.
The following list shows all terminate causes generated by the UG (1-18) and by Cisco RPMS (19-23):
1. User Request
2. Lost Carrier
3. Lost Service
4. Idle Timeout
5. Session Timeout
6. Admin Reset
7. Admin Reboot
8. Port Error
9. NAS Error
10. NAS Request
11. NAS Reboot
12. Port Unneeded
13. Port Preempted
14. Port Suspended
15. Service Unavailable
16. Callback
17. User Error
18. Host Request
19. Another call on the same UG and port
20. Dangling Session
21. UG unreachable
22. No terminate reason from UG
23. Modem Terminate=<reason>
24. Other
UGRetries
The report contains information about UG retries during the sampling interval.
Table 7-9 UGRetries Fields
Field | Definition
Timestamp | Time at which the data was sampled.
Total Requests | Total number of messages, including retries, of the following types: Pre-Auth, Accounting Start, Accounting Update, Accounting Stop, VPDN Session, VPDN Connect, and VPDN Disconnect.
Retries | Number of message retries for the above message types.
Retries % | (Retries/Total Requests) x 100.
System-Related Diagnostics Information
Cisco RPMS logs system-related diagnostics information in the following report files:
• PolicyUsage
• Message Rate
• System Activity
• SystemLatency
• SystemStats
• SystemUGRetries
• CallDurations
• MissingAcctStop
• CallFailureRates
• Threshold Report
Cisco RPMS generates system-related diagnostics reports to help you understand how the system is being utilized. The reports also provide diagnostics information at the highest level. This information helps troubleshoot system issues.
Cisco RPMS logs the system-related diagnostics reports in the following directory structure:
$RPMSBASE/diagnostics/<Date>/system/<ReportType>_<TimeStamp>.csv
• "Report Type" is the name of the report.
• "Timestamp" is the timestamp for the particular day.
The data in the files is logged at periodic intervals configurable in the [Diagnostics] section of rpms.conf. All data is incremental (that is, the change in value since the last sample time).
PolicyUsage
The report logs the usage of the SLAs of each customer during the sampling interval.
Table 7-10 PolicyUsage Fields
Field | Definition
Customer | Name of the customer.
Guaranteed Limit Used % | Percentage of guaranteed concurrent sessions used by the customer.
Guaranteed Limit | Number of guaranteed concurrent sessions allowed for a customer.
Shared Overflow Limit Used % | Percentage of shared overflow sessions used by the customer.
Shared Overflow Limit | Maximum number of shared overflow sessions allowed after the base and Oversubscription limits have been reached for a customer.
Current Count | Number of calls currently active for this customer.
Current Limit Deficit % | Current Limit Deficit expressed as a percentage of the actual limit.
Current Limit Deficit | The value that the limit should currently be raised by in order to accept all calls.
Peak Limit Deficit % | Peak Limit Deficit expressed as a percentage of the actual limit.
Peak Limit Deficit | The largest value that the limit would have needed to be raised by, at any time of the day, in order to accept all calls.

Note
Cisco RPMS allows the SLA count to be configured as No Limit.
2147483647 corresponds to "No Limit" configured for Session Count Limit.
4294967294 corresponds to "No Limit" configured for both Session Count Limit and Oversubscription limit.
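When post-processing this report, the two "No Limit" values are worth treating as sentinels rather than as real limits. The following Python sketch shows one way to do that; the constant and function names are assumptions for illustration. Note that 2147483647 is the largest signed 32-bit integer and that 4294967294 is exactly twice that value, which is consistent with two "No Limit" fields being added together, although this section does not state how the value is derived.

```python
# Sentinel values quoted in the Note above; the names are hypothetical.
NO_LIMIT_SESSION = 2147483647              # "No Limit" for Session Count Limit
NO_LIMIT_SESSION_AND_OVERSUB = 4294967294  # "No Limit" for both limits


def describe_limit(value):
    """Render a reported limit, replacing the sentinel values with text."""
    if value == NO_LIMIT_SESSION:
        return "No Limit (session count)"
    if value == NO_LIMIT_SESSION_AND_OVERSUB:
        return "No Limit (session count and oversubscription)"
    return str(value)


print(NO_LIMIT_SESSION == 2**31 - 1)                         # largest signed 32-bit value
print(NO_LIMIT_SESSION_AND_OVERSUB == 2 * NO_LIMIT_SESSION)  # twice that value
print(describe_limit(1010))
print(describe_limit(2147483647))
```

Filtering these values out before computing usage percentages avoids reporting near-zero utilization for customers that have no configured limit.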
Message Rate
The report shows the different types of messages Cisco RPMS has received during the sampling interval.
Table 7-11 Message Rate Fields
Field | Definition
Timestamp | Time at which the sample was collected.
Table title | Possible values for this field are Total Input, call-reservation-request, vpdn-tunnel-reservation, call-start, call-stop, call-stop-accept, call-stop-reject, and administrative-cdr-log-request.
Total entries | Total number of messages received within the sampling interval.
%[a-b] milliseconds | Percentage of messages of the same type with a delay between a (inclusive) and b (inclusive) milliseconds. The delay for a message is the time elapsed since the previous message of the same type.
Note
For the Total Input row, the interpretation of %[a-b] milliseconds is the percentage of the distribution of the subsequent message delay during the sample interval.
The Call Stop Accept and Call Stop Reject messages are reserved for future use.
System Activity
The report logs the active call count of Cisco RPMS during the sampling interval.
Table 7-12 System Activity Fields
Field | Definition
Timestamp | Time at which the sample was collected.
Call Count | Current active call count at that sampling interval.
SystemLatency
This report logs the latency of each message that Cisco RPMS received during the sampling interval.
Table 7-13 System Latency Fields
Field | Definition
Timestamp | Time at which the sample was collected.
Table title | Possible values for this field are Totals, call-reservation-request+accept-call-reserved, call-reservation-request+accept-call-overflow, call-reservation-request+reject-call, vpdn-tunnel-reservation+accept-tunnel, vpdn-tunnel-reservation+reject-tunnel, vpdn-tunnel-reservation+no-vpdn-info-found, call-start+ack, call-stop+ack, call-stop-accept+ack, and call-stop-reject+ack.
Total entries | Total number of messages processed within the sampling interval.
%[a-b] milliseconds | Percentage of messages of the same type with a delay between a (inclusive) and b (inclusive) milliseconds. The delay is the time taken to process that particular type of message.
Note
For Total Input row, the interpretation of %[a-b] milliseconds is the percentage of the distribution of the subsequent message delay during the sample interval. The delay is the time taken to process the message.
SystemStats
The report logs system-related statistics during the sampling interval.
Table 7-14 SystemStat Fields
Field | Definition
Timestamp | Time at which the data was sampled.
Statistics Name | Name of the statistic.
Today | Number of packets received by Cisco RPMS so far today.
Since Restart | Number of packets received by Cisco RPMS since the last restart.
SystemUGRetries
The report logs the UG Retries information during the sampling interval.
Table 7-15 SystemUGRetries Fields
Field | Definition
Timestamp | Time at which the data was sampled.
UG Name | Name of the UG.
IP | IP address of the UG.
Retries % | (Total Retries/Total Requests) x 100.
Total Retries | Number of message retries for the following message types: Pre-Auth, Accounting Start, Accounting Update, Accounting Stop, VPDN Session, VPDN Connect, and VPDN Disconnect.
Total Requests | Total number of messages, including retries, of the following message types: Pre-Auth, Accounting Start, Accounting Update, Accounting Stop, VPDN Session, VPDN Connect, and VPDN Disconnect.
CallDurations
Cisco RPMS logs call duration information in this report. The report is generated whenever calls are closed by a NAS after a short time. Low call durations can be caused by NAS issues, such as faulty hardware.
Table 7-16 CallDurations Fields
Field | Definition
UG NAME | Name of the UG.
UG IP | IP address of the UG.
Total | Total calls received from the UG.
%[a-b] minutes | Percentage of closed calls with call duration between a (inclusive) and b (inclusive) minutes.
MissingAcctStop
Cisco RPMS generates this report whenever a NAS fails to send an accounting stop message to Cisco RPMS or when the accounting stop message is lost. This leads to a disparity between the active call count reported by Cisco RPMS and the number of real calls, because Cisco RPMS continues to count the calls as active even if they are dropped on the NAS. Cisco RPMS clears these calls only when another call replaces them on the same port or when the Active Call Timeout expires.
Table 7-17 MissingAcctStop Fields
Field | Definition
UGName | Name of the UG.
IP | IP address of the UG.
Missing Acct-Stop% | (Calls w/o Stop/Total Attempted Calls) x 100.
Total Attempted Calls | Total number of calls received from the UG.
Calls w/o Stop | Total number of calls received without an accounting stop message.
CallFailureRates
The report is generated whenever NASs experience hardware or software problems while sending calls to Cisco RPMS. This report shows all NASs that had calls terminated for reasons deemed erroneous. Cisco RPMS considers the following call termination codes erroneous:
Table 7-18 Erroneous Termination Codes
Termination Code | Definition
2 | Lost Carrier
3 | Lost Service
9 | NAS Error
13 | Port Preempted
14 | Port Suspended
15 | Service Unavailable
The following are the fields and definitions for CallFailureRates.
Table 7-19 CallFailureRates Fields
Field | Definition
UG Name | Name of the UG.
UG IP | IP address of the UG.
Call Failure Rate | (Failed Calls/Total Attempted Calls) x 100.
Total Attempted Calls | Total calls received from the UG.
Failed Calls | Number of calls terminated with one of the erroneous termination codes listed in Table 7-18.
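As a worked illustration of the Call Failure Rate field, the following Python sketch counts the calls whose termination code is one of the erroneous codes in Table 7-18 and expresses that count as a percentage of attempted calls. The rate is computed here as failed calls divided by attempted calls, which is the reading consistent with the sample output later in this chapter (10 failed calls out of 100 attempted gives a rate of 10). The function name and the sample data are assumptions for illustration.

```python
# Termination codes that Cisco RPMS treats as erroneous (Table 7-18).
ERRONEOUS_CODES = {2, 3, 9, 13, 14, 15}


def call_failure_rate(termination_codes, total_attempted_calls):
    """Return (failed calls, failure rate in percent) for one UG."""
    failed = sum(1 for code in termination_codes if code in ERRONEOUS_CODES)
    if total_attempted_calls == 0:
        return failed, 0.0  # avoid dividing by zero for an idle UG
    return failed, 100 * failed / total_attempted_calls


# Hypothetical UG: 90 user requests (1), 6 lost carriers (2), 4 NAS errors (9).
codes = [1] * 90 + [2] * 6 + [9] * 4
failed, rate = call_failure_rate(codes, total_attempted_calls=100)
print(failed, rate)
```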
Threshold Report
The report is generated when the resources used by Cisco RPMS reach their threshold limits, that is, when any of the following conditions becomes true:
• The number of Cisco RPMS file descriptors is greater than 1000.
• The CPU idle time percentage of the host is less than 30%.
• The number of lost TCP connections is greater than 0.
• The system free memory percentage is less than 1%.
The report displays the warning message that was raised in the system. It also contains the complete output of the processes running in the system at that instant, along with statistical information on Cisco RPMS-related processes and their resource usage.
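The four conditions listed above are simple comparisons, sketched below in Python. This only restates the documented thresholds; the function name, parameter names, and warning strings are assumptions for illustration and do not reflect the wording of the actual warning messages.

```python
def threshold_warnings(open_file_descriptors, cpu_idle_pct,
                       tcp_connections_lost, free_memory_pct):
    """Return the documented threshold conditions that are currently true."""
    warnings = []
    if open_file_descriptors > 1000:
        warnings.append("file descriptors > 1000")
    if cpu_idle_pct < 30:
        warnings.append("CPU idle < 30%")
    if tcp_connections_lost > 0:
        warnings.append("TCP connections lost > 0")
    if free_memory_pct < 1:
        warnings.append("free memory < 1%")
    return warnings


# Hypothetical readings: too many descriptors and a busy CPU.
print(threshold_warnings(1200, 25.0, 0, 5.0))
# Hypothetical healthy host: no condition is true, so no report is expected.
print(threshold_warnings(500, 80.0, 0, 40.0))
```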
Table 7-20 Threshold Report Fields
Field | Definition
TIME OF DAY | Timestamp at which the report is generated.
usr | Percentage of total CPU used in user mode.
sys | Percentage of total CPU used in system mode.
wio | Percentage of total CPU used for I/O.
idle | Percentage of total CPU unused.
free | Free physical memory in KB.
free% | Percentage of physical memory that is free.
in | Page-in request rate.
out | Page-out request rate.
scan | Paging scan rate.
used | KB of used swap.
free | KB of free swap.
dropped | Number of dropped TCP connections.
npid | Number of Oracle processes.
cpu | Percentage of CPU used.
priv | KB of total memory used in private mode by the processes.
cpu | Percentage of CPU used.
siz | KB of memory used.
res | KB of memory used.
cpu | Percentage of CPU used.
siz | KB of memory used.
res | KB of memory used.
fdd | Number of file descriptors currently open.
cpu | Percentage of CPU used.
siz | KB of memory used.
res | KB of memory used.
Note
Single space is the delimiter for this report.
Provisioning Changes
Cisco RPMS tracks all provisioning changes made to the system database by logging all changes to a "Change Log" file. The file records the time, type and details of every change.
The change log file is generated in the $RPMSBASE/diagnostics/<Date>/ directory. The file is updated whenever there is a change to the system database.
Table 7-21 Provisioning Change Fields
Field | Definition
Time | Time at which the change was made to the database.
Action | Type of change made to the specified entity (for example, CREATE, DELETE, or UPDATE).
Entity | Type of configuration object modified (for example, Customer, DNIS Group, Trunk Group, or NAS List).
Description | Details of the configuration change made to the system.
Overview: Diagnosing End User Issues
You can use the Cisco RPMS diagnostic tools to diagnose common end user problems, such as receiving busy signals, failing to connect, or having calls end quickly after connecting. Possible causes include:
• Oversubscribed policies
• Low call durations
Oversubscribed Policies
End users may encounter busy signals because of oversubscribed policies. Oversubscription is a configurable Cisco RPMS feature; when a customer's limits are exceeded, the system sends busy signals to that customer's end users. To keep busy signals to a minimum, the diagnostic tools allow you, the Cisco RPMS administrator, to monitor customer activity. You can see exactly which customers have exceeded their limits and are receiving busy signals, and which customers are quickly approaching their limits. You can then track and reconfigure extra or overflow ports so that end users do not receive busy signals.
Using the Cisco RPMS Diagnostics Tool to Monitor Customer Activity
To monitor the current status of customer activity, execute the crpms_diag CLI command:
The command's output provides a snapshot of call activity for all customer profiles in the system, similar to this:
Data Updated at: Jul 31, 2002 3:36 PM, PDT
"Customer","Limit Used(%)","Limit","Current Count","Current Limit
Deficit(%)","Current Limit Deficit","Peak Limt Deficit(%)","Peak Limit
Deficit"
customer1,100,1010,1010,20,202,30,303
customer2,0,110,0,0,0,0,0
The customers at or near their limits appear at the top of the list.
Other output provided by the command includes:
• Current Limit Deficit: The amount by which the limit would have needed to be raised to accept all calls during the past 20 minutes.
• Peak Limit Deficit: The amount by which the limit would have needed to be raised to accept all calls at any time of the day.
You can use these numbers to change Customer Profile configurations.
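One way to turn these numbers into a candidate configuration change is to add the Peak Limit Deficit to the current limit, as in the following Python sketch. It parses the sample output shown above with the header joined onto one line and the "Data Updated at" line left out; a real capture would need that first line skipped. The function name and the idea of using limit plus peak deficit as a starting point are assumptions for illustration, not a Cisco recommendation. The header text, including the "Peak Limt" spelling, is copied verbatim from the sample.

```python
import csv
import io

# Sample output from this section, with the wrapped header rejoined.
SAMPLE = (
    '"Customer","Limit Used(%)","Limit","Current Count",'
    '"Current Limit Deficit(%)","Current Limit Deficit",'
    '"Peak Limt Deficit(%)","Peak Limit Deficit"\n'
    "customer1,100,1010,1010,20,202,30,303\n"
    "customer2,0,110,0,0,0,0,0\n"
)


def limits_to_cover_peak(text):
    """Map each customer with a peak deficit to Limit + Peak Limit Deficit."""
    suggestions = {}
    for row in csv.DictReader(io.StringIO(text)):
        peak_deficit = int(row["Peak Limit Deficit"])
        if peak_deficit > 0:
            suggestions[row["Customer"]] = int(row["Limit"]) + peak_deficit
    return suggestions


suggestions = limits_to_cover_peak(SAMPLE)
print(suggestions)
```

For customer1, a limit of 1010 plus a peak deficit of 303 gives 1313 sessions; customer2 has no deficit and is omitted.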
Low Call Durations
Another problem end users may encounter is low call durations; the users might not receive busy signals, but their calls may end quickly or may not connect at all.
This is a UG-related issue, caused by the UG hardware. In a large network, it might be difficult to identify problematic UGs causing low call durations.
Using the Cisco RPMS Diagnostics Tool to Troubleshoot Low Call Durations
To display errors, execute the crpms_diag err CLI command:
The command's output provides a snapshot of errors in the system, similar to this:
2002-Jul-28:Call Durations. Details in
/export/home/crpms/diagnostics/2002-Jul-28/system/CallDurations_2002-J
ul-28_00-00-07.csv
In this example, the "RPMS Client Errors" message displays the UGs sending traffic to Cisco RPMS that are experiencing low call durations. It also displays the file which contains further details about the problematic UGs.
Overview: Identifying Policy Enforcement Issues
This section helps identify policy enforcement issues that may occur with Cisco RPMS. The main issue that may occur is an incorrect call count.
Incorrect Active Call Count
Cisco RPMS may report more active calls in the system than actual calls in the network. The disparity between the active call count reported by Cisco RPMS and the number of real calls is caused by missing or delayed accounting messages. Depending on the Cisco IOS release running, accounting messages may occasionally be delayed or dropped.
Note
For information on the compatible Cisco IOS releases, refer to the Cisco RPMS 2.0.1 Release Notes.
If this occurs, Cisco RPMS marks the calls as active even if they are dropped on the UG. Cisco RPMS only clears the calls when another call replaces the dropped call on the same port.
Using the Cisco RPMS Diagnostics Tool to Determine the Active Call Count
To display errors, execute the crpms_diag err CLI command:
The command's output provides a snapshot of errors in the system, similar to this:
2002-Jul-26:Missing Accounting Stops. Details in
/export/home/crpms/diagnostics/2002-Jul-26/system/MissingAcctStop_2002
-Jul-26_00-01-44.csv
In this example, the "RPMS Client Errors" message details that UGs sending traffic to the Cisco RPMS have not been sending accounting stop messages. It prints the name of the file containing additional information about the problematic UGs.
The file shows data in the following format:
Data Updated at: Jul 27, 2002 12:00 AM, PDT
"NasName","IP","Missing Acct-Stop(%)","Total Attempted Calls","Calls
w/o Stop",
NAS-1,10.10.1.1,100,107,107
The data details the total calls received from the UG, and the percent of calls from that total which did not receive an accounting stop message. You can use these numbers to compare a low traffic UG with a higher traffic UG.
In the example above, the Cisco RPMS received 107 calls from NAS-1, and 100% of those calls are missing the accounting stop message, which means the UG is not sending the required messages to the Cisco RPMS. With this information, you can investigate why the UG is not sending the messages and ensure that it is running a Cisco RPMS-compatible release of the Cisco IOS software.
Overview: Identifying Universal Gateway Issues
Hardware or software problems on the UG can cause service interruptions for the Cisco RPMS. The problems may only affect a few of the UGs, and as such, may be hard to track.
However, to address this issue, Cisco RPMS tracks the call termination codes for the UGs. You can use the crpms_diag CLI command to view the information and to display errors.
Using the Cisco RPMS Diagnostics Tool to Determine Universal Gateway Issues
To display errors, execute the crpms_diag err CLI command:
The command's output provides a snapshot of errors in the system, similar to this:
2002-Jul-28:Call Failure Rates. Details in
/export/home/crpms/diagnostics/2002-Jul-28/system/CallFailureRates_200
2-Jul-28_00-00-07.csv
In this example, the "RPMS Client Errors" message displays information about the UGs sending traffic to Cisco RPMS that might be experiencing hardware or software problems. It also displays the name of the file that contains further details about the problematic UGs.
The file shows data in the following format:
Data Updated at: 16:13:40
"NAS NAME","NAS IP","Call Failure Rate","Total Attempted
Calls","Failed Calls"
NAS-1,10.10.1.1,10,100,10
The data in this example displays all UGs with calls terminated because of errors. In Cisco RPMS, the following call termination codes are considered erroneous.
Table 7-22 Erroneous Call Termination Codes
Termination Code | Definition
2 | Lost carrier.
3 | Lost service.
9 | UG error.
13 | Port preempted.
14 | Port suspended.
15 | Service unavailable.
The sample output in the example displays that the Cisco RPMS received 100 calls from NAS-1 and that 10% of those calls terminated because of errors. There may be a problem, and you should investigate the indicated UG.
Once the UG is identified, look for a breakdown of all terminated calls by termination code for this UG in the <crpms-home>/diagnostics/<date>/nas/<nas-ip>/NasTerminations_<timestamp>.csv file.
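When you review a per-code breakdown, it can help to separate the erroneous codes from the others. The sketch below, in Python, encodes the six codes from Table 7-22 and computes their share of all terminations. The layout of the NasTerminations file is not shown in this section, so the sketch does not read the file. Instead it takes a dictionary of counts per code, and the counts in the usage example are hypothetical.

```python
# Termination codes that Cisco RPMS treats as erroneous (from Table 7-22).
ERRONEOUS_CODES = {
    2: "Lost carrier",
    3: "Lost service",
    9: "UG error",
    13: "Port preempted",
    14: "Port suspended",
    15: "Service unavailable",
}


def erroneous_share(counts_by_code):
    """Return the percentage of terminations that used an erroneous code.

    counts_by_code maps a numeric termination code to a number of calls.
    """
    total = sum(counts_by_code.values())
    if total == 0:
        return 0.0
    bad = sum(n for code, n in counts_by_code.items() if code in ERRONEOUS_CODES)
    return 100.0 * bad / total


if __name__ == "__main__":
    # Hypothetical counts for one UG; code 1 stands for any code not in the table.
    example = {1: 90, 2: 6, 9: 4}
    print(erroneous_share(example))
```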
Overview: Tracking Provisioning Changes
Cisco RPMS tracks all of the provisioning changes you make to the system database. Information about each change, such as its time, type, and details, is logged to a change log file.
The change log file is generated in the <crpms-home>/diagnostics/<date>/ directory. It displays data in the following format:
Time Action Entity Description
2002-Jul-23 4:41 PM CREATE Customer Name=cust1, Description=, Call Treatment=busy, Non Overflow Limit=100, Overflow Limit=10, Non Overflow Threshold=85, Call Reject Threshold=85, Overflow Call Rejection Threshold=10
2002-Jul-23 4:42 PM CREATE Customer-Mapping-Criteria Customer=cust1, Dnis Group=default, Call Type=digital, SS7 Resource=None, Trunk Group=default
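If you need to search or audit the change log, each entry can be split into its time, action, entity, and description fields. The following Python sketch is illustrative only. It assumes that each entry occupies a single line in the file, that the entity name contains no spaces, and that no value in the description contains a comma followed by a space. The parse_change function name is hypothetical.

```python
from datetime import datetime


def parse_change(line):
    """Split one change log entry into time, action, entity, and fields."""
    # Layout assumed from the sample: date, clock, AM/PM, action, entity, rest.
    date, clock, ampm, action, entity, description = line.split(None, 5)
    when = datetime.strptime(f"{date} {clock} {ampm}", "%Y-%b-%d %I:%M %p")
    fields = {}
    for part in description.split(", "):
        # Keys may contain spaces ("Call Treatment"); values may be empty.
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip()
    return {"time": when, "action": action, "entity": entity, "fields": fields}


if __name__ == "__main__":
    entry = (
        "2002-Jul-23 4:42 PM CREATE Customer-Mapping-Criteria "
        "Customer=cust1, Dnis Group=default, Call Type=digital, "
        "SS7 Resource=None, Trunk Group=default"
    )
    print(parse_change(entry))
```

The month abbreviation is parsed with the %b directive, which depends on the locale of the Python process. The sketch assumes an English locale.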
Overview: Viewing the Cisco RPMS History
The CLI command crpms_diag, as described in previous sections, supports a history option. You can use the history option to display information collected over a certain period of time, and to obtain a historical perspective of the Cisco RPMS system.
Using the Cisco RPMS Diagnostics Tool to View the System History
To use the history option, execute the crpms_diag err hist CLI command. For example, you could use the history option to see any UG errors that occurred in the past seven days:
Overview: Analyzing Traffic Patterns with Cisco RPMS
You can use the Cisco RPMS diagnostic tools to analyze traffic patterns and collect information useful in planning for capacity or other issues. The information you can collect includes:
• System statistics
• System-wide traffic
Collecting System Statistics
You can use the CLI command crpms_diag to collect high-level statistics about your Cisco RPMS system. To do so, execute the command as follows:
The command output should look similar to:
Data Updated at: Jul 24, 2002 2:01 PM, PDT
Statistics Name,Today,Since Restart
ResourceAllocated,65422,265832
ResourceFreed,65356,265356
ResourceUpdate,805,265805
The previous example displays information such as the various types of packets received by Cisco RPMS so far that day, and since the last restart occurred. Cisco RPMS also records the information in a file so that you can retrieve statistics for a particular day from the file generated for that day. The file is generated in the following location:
<crpms-home>/diagnostics/<date>/system/SystemStats_<time>.csv
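To compare statistics across days, you can load a SystemStats file into a simple lookup table. The Python sketch below assumes the layout shown in the sample output: one "Data Updated at:" line, a header line, and one row per statistic. The read_system_stats function name is hypothetical. Treating the difference between ResourceAllocated and ResourceFreed as a meaningful figure is an assumption about what these counters represent, not something stated in this guide.

```python
import csv

# Sample text, copied from the SystemStats example above.
SAMPLE = """Data Updated at: Jul 24, 2002 2:01 PM, PDT
Statistics Name,Today,Since Restart
ResourceAllocated,65422,265832
ResourceFreed,65356,265356
ResourceUpdate,805,265805
"""


def read_system_stats(text):
    """Return {statistic name: (today, since_restart)} from report text."""
    body = [
        line for line in text.splitlines()
        if line.strip() and not line.startswith("Data Updated at:")
    ]
    stats = {}
    for rec in csv.DictReader(body):
        stats[rec["Statistics Name"]] = (int(rec["Today"]), int(rec["Since Restart"]))
    return stats


if __name__ == "__main__":
    stats = read_system_stats(SAMPLE)
    allocated_today = stats["ResourceAllocated"][0]
    freed_today = stats["ResourceFreed"][0]
    # Assumption: the gap between the two counters is worth watching over time.
    print(allocated_today - freed_today)
```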
Analyzing System-Wide Traffic
Cisco RPMS periodically records the total number of active calls in the system. The information is stored in the <crpms-home>/diagnostics/<date>/system/SystemActivity_<time>.csv file, which is generated daily.
The entries logged in this file are comma separated, so it is easy to import the file into a charting application such as Microsoft Excel, and to plot the active calls over time to get an idea about overall traffic patterns during the day. You can also analyze patterns over a period longer than a day by concatenating multiple files before charting them.
The data recorded in the SystemActivity files looks similar to this:
Overview: Analyzing Customer-Specific Traffic
With the Cisco RPMS diagnostic tools you can also analyze customer-specific traffic. The information you can collect includes:
• Call count over time
• Call duration distribution over time
Call Count Over Time
By using the Cisco RPMS diagnostic tools, you can profile a customer's traffic pattern over time. Cisco RPMS periodically records the number of active calls for each customer profile in the system. The information is generated daily and stored in the <crpms-home>/diagnostics/<date>/customer/<customer-name>/CustomerCounts_<time>.csv file.
The entries logged in this file are comma separated, so it is easy to import the file into a charting application such as Microsoft Excel, and to plot the active calls over time to get an idea about overall traffic patterns for a customer during the day. You can also analyze patterns over a period longer than a day by concatenating multiple files before charting them.
The data recorded in the CustomerCounts files looks similar to this:
Customer, Cust-1, Counts for Jul-23, 2002
"Timestamp","Session Limit","Oversubscription Limit","Current Count","Current Limit Deficit"
Plotting the Session Limit and the Current Count columns shows the time of day at which a customer profile went into oversubscription mode. The plot also reveals any provisioning changes made to the Session Limit for this profile.
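As an alternative to plotting, you can scan a CustomerCounts file for the intervals in which the current count exceeds the session limit. The Python sketch below is illustrative only. It assumes that oversubscription mode corresponds to Current Count being greater than Session Limit, and that the header is on a single line. The data rows in the sample are hypothetical, because this section shows only the file header.

```python
import csv

# Title and header follow the CustomerCounts example above.
# The three data rows are HYPOTHETICAL and exist only to exercise the code.
SAMPLE = '''Customer, Cust-1, Counts for Jul-23, 2002
"Timestamp","Session Limit","Oversubscription Limit","Current Count","Current Limit Deficit"
16:13:46,100,10,95,0
16:13:56,100,10,104,0
16:14:06,100,10,98,0
'''


def oversubscribed_times(text):
    """Return timestamps at which Current Count exceeds Session Limit."""
    lines = [line for line in text.splitlines() if line.strip()]
    # Skip the title line by locating the header row that starts with Timestamp.
    start = next(
        i for i, line in enumerate(lines)
        if line.lstrip('"').startswith("Timestamp")
    )
    return [
        rec["Timestamp"]
        for rec in csv.DictReader(lines[start:])
        if int(rec["Current Count"]) > int(rec["Session Limit"])
    ]


if __name__ == "__main__":
    print(oversubscribed_times(SAMPLE))
```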
Call Duration Distribution Over Time
Cisco RPMS also periodically records the duration of terminated calls for each customer. The information is sorted into a number of time bins, generated daily, and stored in the <crpms-home>/diagnostics/<date>/customer/<customer-name>/CustomerCallDuration_<time>.csv file.
The entries logged in this file are comma separated, so it is easy to import the file into a charting application such as Microsoft Excel, and to plot the call duration bins over time to get an idea of how long a customer's end users stay connected during the day. You can also analyze patterns over a period longer than a day by concatenating multiple files before charting them.
The data recorded in the CustomerCallDuration files looks similar to:
Customer, Cust-1, Call Duration Distribution for Jul-23, 2002
"Timestamp","Closed Calls",%(0-1]min,%(1-2]min,%(2-4]min,%(4-8]min,%(8-16]min,%(16-32]min,%(32-64]min,%(64-128]min,%128+min
16:13:46,100,0,0,0,80,0,20,0,0,0
16:13:56,10,0,0,0,90,0,0,10,0,0
16:14:06,0,0,0,0,0,0,0,0,0,0
You can plot any or all of the bins. For a full day of data, such a graph might show, for example, that end users for the sample customer profile (Cust-1) stay online longer in the morning and in the late evening.
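Because each row reports percentages for a different number of closed calls, combining rows requires weighting each percentage by the Closed Calls value for that row. The Python sketch below shows one way to do this for the sample data. It assumes the header is on a single line, and the aggregate_bins function name is hypothetical.

```python
import csv

# Sample text, copied from the CustomerCallDuration example above.
SAMPLE = '''Customer, Cust-1, Call Duration Distribution for Jul-23, 2002
"Timestamp","Closed Calls",%(0-1]min,%(1-2]min,%(2-4]min,%(4-8]min,%(8-16]min,%(16-32]min,%(32-64]min,%(64-128]min,%128+min
16:13:46,100,0,0,0,80,0,20,0,0,0
16:13:56,10,0,0,0,90,0,0,10,0,0
16:14:06,0,0,0,0,0,0,0,0,0,0
'''


def aggregate_bins(text):
    """Combine all rows into one call-weighted percentage per duration bin."""
    lines = [line for line in text.splitlines() if line.strip()]
    # Skip the title line by locating the header row that starts with Timestamp.
    start = next(
        i for i, line in enumerate(lines)
        if line.lstrip('"').startswith("Timestamp")
    )
    reader = csv.reader(lines[start:])
    header = next(reader)
    bins = header[2:]  # columns after Timestamp and Closed Calls
    calls_per_bin = [0.0] * len(bins)
    total_calls = 0
    for row in reader:
        closed = int(row[1])
        total_calls += closed
        for i, pct in enumerate(row[2:]):
            # Convert each percentage back into a number of calls.
            calls_per_bin[i] += closed * float(pct) / 100.0
    if total_calls == 0:
        return {name: 0.0 for name in bins}
    return {name: 100.0 * n / total_calls for name, n in zip(bins, calls_per_bin)}


if __name__ == "__main__":
    for name, pct in aggregate_bins(SAMPLE).items():
        print(f"{name}: {pct:.1f}")
```

For the sample, 89 of the 110 closed calls (80 from the first row and 9 from the second) fall in the 4 to 8 minute bin, so that bin accounts for about 80.9% of the combined total. A plain average of the two row percentages, 85%, would give too much weight to the row with only 10 calls.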