Cisco WAN Manager

Statistics Collection Design and Sizing

Document ID: 29006



Contents

Introduction
Prerequisites
      Requirements
      Components Used
      Conventions
Sizing
Dual Collection
Examples of Setup
Scenarios of Collectors Redundancy
Scenarios of Cisco WAN Manager Servers Redundancy
Scenarios of Data Redundancy
Performances
NetPro Discussion Forums - Featured Conversations
Related Information

Introduction

This document describes how to design and size a statistics collection setup. The appropriate design depends on the network that you monitor.

Prerequisites

Requirements

Read the document Statistic Collection Architecture before you begin.

Components Used

The information in this document is based on these software and hardware versions:

  • Cisco WAN Manager 11.0

  • Cisco WAN Manager 10.5

The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.

Conventions

For more information on document conventions, refer to the Cisco Technical Tips Conventions.

Sizing

One statistics collector can sustain collection from about 200 to 250 nodes. A collector can also act as the backup (secondary) collector for a set of nodes whose primary collector is another Statistics Collection Manager (SCM) collector. In this case, assign each collector 150 nodes per node set. In the worst case, when the other collector fails, the collector has to handle both node sets, a maximum of 300 nodes at the same time. Make sure that the MAX_NODES parameter in the scmctrlsvr and scmcollsvr configuration files reflects the correct number of nodes. For more information, refer to the document Statistic Collection Configuration Files.
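The sizing arithmetic above can be sketched in a few lines of Python. This is only an illustration, not a Cisco tool: the function names are hypothetical, the 200-node figure is the conservative end of the 200 to 250 range, and the sketch assumes that redundant collectors are always deployed in pairs that back each other up.

```python
import math

SET_SIZE = 150          # nodes per node set when collectors back each other up (from the text)
STANDALONE_LIMIT = 200  # conservative end of the 200-250 node range (from the text)


def collectors_needed(total_nodes: int, redundant: bool) -> int:
    """Estimate the number of collectors for a network (hypothetical helper)."""
    if total_nodes <= 0:
        return 0
    if redundant:
        # Assumption: collectors work in pairs, and each pair covers
        # two node sets of up to 150 nodes (300 nodes per pair).
        pairs = math.ceil(total_nodes / (2 * SET_SIZE))
        return 2 * pairs
    return math.ceil(total_nodes / STANDALONE_LIMIT)


def worst_case_load(redundant: bool) -> int:
    """Nodes one collector may have to handle at the same time."""
    return 2 * SET_SIZE if redundant else STANDALONE_LIMIT
```

For example, `collectors_needed(600, redundant=True)` returns 4. The value of `worst_case_load(True)`, 300, is the number to compare against MAX_NODES when the collector also serves as a backup.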

There are no recommendations about the number of statistics to collect. The Cisco WAN Manager database size depends on the network size and the number of objects for which you collect statistics. If the database fills up too quickly, adjust the database size. Expect to run some tests on the live network to find suitable values. These parameters influence the speed at which the database grows:

  • Bucket size

  • Collection interval

    Note: Currently, the collection interval on AXSM is hard-coded to 15 minutes.

  • Number of statistics that are enabled

Cisco WAN Manager can purge data that is older than a certain number of hours from the database. By default, the number of hours is 24. Issue the onstat -d command to check the database space utilization.
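To get a feel for how bucket size, the number of enabled statistics, and the purge window interact, a rough estimate can help before you test on the live network. The sketch below rests on a simplifying assumption that is not taken from Cisco documentation: one stored row per object, per enabled statistic, per bucket. The function name and parameters are hypothetical.

```python
def rows_retained(objects: int, stats_per_object: int,
                  bucket_minutes: int, purge_hours: int = 24) -> int:
    """Rough count of rows held in the database once purging reaches steady state.

    Assumption (not from the product documentation): every object stores
    one row per enabled statistic for each bucket inside the purge window.
    """
    if bucket_minutes <= 0:
        raise ValueError("bucket_minutes must be positive")
    buckets_in_window = (purge_hours * 60) // bucket_minutes
    return objects * stats_per_object * buckets_in_window
```

With this model, a change from 15-minute to 5-minute buckets triples the retained rows, and a 48-hour purge window doubles them. Treat the result as an order-of-magnitude guide only, and confirm actual usage with onstat -d.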

The release notes provide the server hardware requirement information for standalone SCMs. The requirements for the Cisco WAN Manager servers depend on the number of connections. The connections are in the user_connection table.

Dual Collection

Dual collection means that two different collectors collect the same statistics file from one switch. In version 10.5.10 patch 2 and later, every type of node can have dual collection. In versions earlier than 10.5.10 patch 2, dual collection is supported only on legacy nodes, such as Cisco IGX/BPX, MGX 8220, and MGX 8850/8250 that run software version 1. To use dual collection, so that two different Cisco WAN Manager servers collect files from the same switch, be sure that:

  • The two servers are part of the same Cisco WAN Manager domain through the Cisco WAN Manager gateway.

  • You have set MAX_PARALLEL_COLLSVRS in scmctrlsvr.conf to 2 on both servers.

Caution: Dual collection doubles the collection traffic and can degrade network performance to a certain extent.
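The second requirement is easy to get wrong on one of the two servers, so a small check can be useful. The sketch below reads the text of each server's scmctrlsvr.conf and verifies the MAX_PARALLEL_COLLSVRS value. The file syntax is an assumption: this document does not describe it, so the parser accepts `NAME VALUE`, `NAME=VALUE`, or `NAME: VALUE` with `#` comments. The helper names are hypothetical.

```python
import re
from typing import Optional, Sequence


def read_setting(conf_text: str, name: str) -> Optional[str]:
    """Return the value of a setting from config text, or None if it is absent.

    Assumed syntax (may differ from the real file): one setting per line,
    NAME VALUE, NAME=VALUE or NAME: VALUE, with '#' starting a comment.
    """
    pattern = re.compile(r"^\s*" + re.escape(name) + r"\s*[=:\s]\s*(\S+)")
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


def dual_collection_ready(conf_texts: Sequence[str]) -> bool:
    """True when every server's config sets MAX_PARALLEL_COLLSVRS to 2."""
    return all(read_setting(text, "MAX_PARALLEL_COLLSVRS") == "2"
               for text in conf_texts)
```

Pass the contents of scmctrlsvr.conf from both servers to `dual_collection_ready`. A result of `False` means that at least one server lacks the setting or has a different value.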

Examples of Setup

Here are some examples of Cisco WAN Manager deployment for statistics collection:

| Servers | Description | Maximum Number of Nodes |
| --- | --- | --- |
| 1 Cisco WAN Manager server | This is the minimum setup. There is no redundancy. The embedded collector serves as the basis of the collection. | 200 nodes |
| 1 Cisco WAN Manager server with 1 SCMSA¹ server | The standalone statistics collector reports to the Cisco WAN Manager server. | 300 nodes with collection redundancy, 400 nodes without collection redundancy |
| 1 Cisco WAN Manager server with 3 SCMSA servers | This setup allows you to increase the number of nodes that are collected with full collection redundancy. | 600 nodes with collection redundancy |
| 2 Cisco WAN Manager servers with 1 SCMSA server each | This setup allows collection and data redundancy. | 300 nodes |

¹ SCMSA = SCM Standalone

[Figure: cwm-stats-design.gif]

When a customer wants statistics collection with full redundancy, the minimum setup to consider requires two main Cisco WAN Manager servers with their embedded collectors and two standalone collectors. This setup can overcome most of the possible failures. The components of this setup are:

  • Master—The primary Cisco WAN Manager server

  • Slave—The secondary Cisco WAN Manager server

  • Collector A—Embedded collector

  • Collector B—Standalone for the Master

  • Collector C and Collector D—Collectors that report to the server Slave

Collector A is the primary collector for a maximum of 150 nodes, which is node set 1. Collector B is the secondary collector for those 150 nodes. You can create a second logical set of nodes with a maximum of 150 nodes. Collector B becomes the primary collector for this new set, and Collector A becomes the secondary collector. If the network grows larger than 300 nodes, you must add a new pair of standalone collectors that report to the Master. There is no software limit on the number of collectors that you can deploy in this setup. Those collectors can only report to the Master.
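The pairing rule in this paragraph can be expressed as a short planning sketch. It is an illustration only: the function name and the returned dictionary layout are hypothetical, and the sketch assumes that collectors are listed in pairs, each pair covering two node sets of up to 150 nodes with the primary and secondary roles swapped on the second set.

```python
from typing import Dict, List

SET_SIZE = 150  # maximum nodes per node set (from the text)


def plan_node_sets(nodes: List[str], collectors: List[str]) -> List[Dict[str, object]]:
    """Split nodes into sets and assign a primary and secondary collector to each."""
    if len(collectors) % 2 != 0:
        raise ValueError("collectors must be listed in pairs")
    node_sets = [nodes[i:i + SET_SIZE] for i in range(0, len(nodes), SET_SIZE)]
    # Each pair of collectors covers two node sets, so sets cannot outnumber collectors.
    if len(node_sets) > len(collectors):
        raise ValueError("not enough collectors for this many nodes")
    plan = []
    for index, node_set in enumerate(node_sets):
        pair_start = (index // 2) * 2
        first, second = collectors[pair_start], collectors[pair_start + 1]
        # The first set of a pair uses (first, second); the second set swaps the roles.
        primary, secondary = (first, second) if index % 2 == 0 else (second, first)
        plan.append({"set": index + 1, "nodes": node_set,
                     "primary": primary, "secondary": secondary})
    return plan
```

With 300 nodes and collectors `["A", "B"]`, the plan contains node set 1 with primary A and secondary B, and node set 2 with primary B and secondary A. A 301st node raises an error, which corresponds to the point at which you must add another pair of standalone collectors.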

The Slave acts as a standby machine in the case of a Master failure. The Slave can also implement dual collection. The standalone collectors that report to the Master can serve as backup for the Slave in the case of a Master failure. This ability reduces the number of machines.

Another scenario is to have two different Cisco WAN Manager servers with different collection responsibilities. For example, server A collects files from BPXs, and server B takes care of feeders.

Scenarios of Collectors Redundancy

If Collector A fails or loses connectivity to the network, Collector B, the secondary, takes over. Here, the redundancy of network access paths becomes crucial. If you think that a two-level collection redundancy is not enough, you can add a tertiary collector.

Scenarios of Cisco WAN Manager Servers Redundancy

What if the Master fails? With the current architecture, the standalone collectors carry on the collection if the Master goes down. However, this situation introduces two problems.

The first problem is that users lose control of the collection. The second problem is the risk of losing statistics files if the parsers run on the main servers. The solution to both problems is to fix the server as soon as possible. Even if the collectors lose contact with the Master, the collectors continue to try to transfer statistics files to the main server via FTP. The collectors retry three times and wait 5 minutes between retries before they give up. So, if the main server does not come back up within 15 minutes, the server loses data files. However, if you have chosen the Save Files option, the files that the server misses remain on the collector server under the purge directory. Version 10.5.10, with the new distributed statsparser feature, eliminates this problem. Refer to the document Statistics Collection Architecture.
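The 15-minute window follows directly from three retries spaced 5 minutes apart. The sketch below models that timeline for a single statistics file. It is a simplified model, not the collector's actual code: it assumes that the first transfer attempt fails when the Master goes down at minute 0 and that the retries occur at minutes 5, 10, and 15. The function name and return strings are hypothetical.

```python
RETRIES = 3        # number of retries (from the text)
WAIT_MINUTES = 5   # wait between retries (from the text)


def transfer_outcome(master_down_minutes: float, save_files: bool) -> str:
    """Outcome for one file when the Master stays down for the given time.

    Assumed timeline: the initial attempt fails at minute 0, and the
    retries happen at minutes 5, 10 and 15.
    """
    for attempt in range(1, RETRIES + 1):
        elapsed = attempt * WAIT_MINUTES
        if master_down_minutes <= elapsed:
            return f"delivered on retry {attempt}"
    # All retries failed: the file survives only if Save Files is enabled.
    return "kept on collector" if save_files else "lost"
```

In this model, an outage of 12 minutes still delivers the file on the third retry, while an outage of 20 minutes loses the file unless the Save Files option keeps a copy on the collector.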

To regain control of the collection when the Master is down, you must switch over to the secondary Cisco WAN Manager server, the Slave. With the gateway feature, the Slave becomes the primary server. However, the Slave does not know which collections the collectors currently run. The only information the Slave has is the list of enabled statistics for every node. Collector A and Collector B only know about the Master; this information is in the process.conf files of the collectors. If you do not have Collector C and Collector D, you can start collection from the Slave on Collector A and Collector B. To do so, shut down these collectors. Then, use the cnfcollsvr script to modify the process.conf files so that the collectors are aware of the Slave. Note that when you restart these collectors, they resume their previous collections and try again to transfer files via FTP to the Master, because that collection information is stored in the control database, scmdb. The only way to stop that collection is to issue a coldstart on the standalone SCMs.

Version 10.5.10 helps the collection switchover to the secondary server with the new scmproxy script. The script allows you to perform the whole process without the need to interact with the GUI. For more information on this script, refer to the document How to Use scmproxy Script.

A second improvement with version 11.0.l distributes the knowledge of what each collector does. The switchover procedure does not require a coldstart of the standalone SCMs; the secondary Cisco WAN Manager server, the Slave, is able to notice that Collector A and Collector B have active collections.

The number of history files to collect can help you avoid the loss of any statistics files after a switchover. In version 9.2, the SCM GUI configuration dictates the number of history files Cisco WAN Manager collects. In version 10.4 and later, you tune this parameter in scmcollsvr.conf with the variable HISTORY_FILES. The recommendation for this value is between 0 and 3.

You must test a Cisco WAN Manager server switchover scenario to ensure the success of a procedure that you have tuned for a particular setup.

Scenarios of Data Redundancy

You achieve data redundancy with the data that you have saved in the Stratacom database and the File Save option. (With version 10.5.10, the data is in the statsdb.) System administrators can implement an Informix DB backup procedure. If only the statistics files are of concern to you, back up these files on a different machine for safety.

You can also achieve data redundancy with dual collection. But this method introduces an overhead of management traffic and an additional cost of servers.

Performances

The Cisco WAN Manager 10.5 and 11.0 architecture supports multiple collectors and statistics parsers. These processes are also multithreaded for better usage of CPU cycles and I/O resources.

In versions earlier than 10.5.10, the bottleneck is the incoming directory. In these versions, you can offload the collection to the remote standalone collector, but the parsing remains on the main Cisco WAN Manager server.

In version 10.5.10 and later, you can also offload the parser to remote standalone machines that run local statsparsers. This solution is the most scalable when you face large networks. In this example, Collector B and Collector D can have their own statsparsers. The only traffic that goes between the Master, the Slave, and the collectors is control data.

Another factor in performance is the connectivity to the nodes. For large networks, use multiple collection paths to the network. Multiple collection paths prevent an overload of the gateway nodes. Out-of-band networks also provide an improvement, if the quality of the management subnet is good.

NetPro Discussion Forums - Featured Conversations

Networking Professionals Connection is a forum for networking professionals to share questions, suggestions, and information about networking solutions, products, and technologies. The featured links are some of the most recent conversations available in this technology.
NetPro Discussion Forums - Featured Conversations for Network Management
Network Infrastructure: Network Management
Virtual Private Networks: Network and Policy Management

Related Information



Updated: Jan 31, 2006