The Unified Intelligence Center cluster is a group of independent nodes, each of which has a
replicated database that is kept in sync with the other nodes in the cluster.
With Unified Intelligence Center Release 8.0.2 (and later), you can distribute the Unified Intelligence Center
cluster over the WAN. When participating in a cluster, either over a LAN or
WAN, the configuration objects created on one node automatically replicate
to other nodes. This chapter describes the sizing considerations for clustering over the WAN.
In general, size the WAN so that it supports the customer-expected
performance characteristics. Consider the following network characteristics:
-
Bandwidth: is a measure of the amount of
data that can be sent per second.
-
Duplex: is a measure of how data can be
transmitted. Full duplex indicates traffic can be transmitted at full bandwidth
speed in both directions. Unified Intelligence Center requires a full duplex WAN connection.
-
Latency: is a measure of the time it takes
for a data packet to propagate from one side of the WAN to the other. For Unified Intelligence Center, this must be 100 ms or less, which leads to a Round Trip Time
(RTT) of 200 ms or less.
-
Jitter: is a measure of how much variance
there is with respect to latency. Cisco expects that network jitter is
minimized and that the latency accounts for all the jitter.
-
Loss: (sometimes referred to as packet loss)
is a measure of how much data can be lost or dropped. This should be minimized
and it is expected that loss is at or near 0%. Anything else leads to
retransmissions and effective loss of bandwidth (as the same data is being
retransmitted).
-
Duplication: is a measure of how much data
can be duplicated. This should be minimized and it is expected that duplication
is at or near 0%. Anything else leads to retransmissions and effective loss
of bandwidth (as the same data is being retransmitted).
-
Corruption: is a measure of how much data
can be corrupted. This should be minimized and it is expected that corruption
is at or near 0%. Anything else leads to retransmissions and effective loss
of bandwidth (as the same data is being retransmitted).
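The requirements in the list above can be sketched as a simple acceptance check. This is a minimal illustration, not a Cisco tool: the `WanLink` fields, the `check_wan_link` function, and the `near_zero_pct` tolerance are all hypothetical names and values chosen for the example. The text only says that loss, duplication, and corruption should be "at or near 0%", so the numeric tolerance is an assumption.

```python
from dataclasses import dataclass
from typing import List

# Limits stated in the text: one-way latency of 100 ms or less,
# which corresponds to a round-trip time (RTT) of 200 ms or less.
MAX_ONE_WAY_LATENCY_MS = 100.0
MAX_RTT_MS = 2 * MAX_ONE_WAY_LATENCY_MS


@dataclass
class WanLink:
    """Measured WAN characteristics (field names are illustrative)."""
    full_duplex: bool
    rtt_ms: float            # measured round-trip time, jitter included
    loss_pct: float          # packet loss
    duplication_pct: float   # duplicated packets
    corruption_pct: float    # corrupted packets


def check_wan_link(link: WanLink, near_zero_pct: float = 0.1) -> List[str]:
    """Return a list of problems; an empty list means the link looks acceptable.

    near_zero_pct is an assumed tolerance for "at or near 0%".
    """
    problems = []
    if not link.full_duplex:
        problems.append("link is not full duplex")
    if link.rtt_ms > MAX_RTT_MS:
        problems.append(f"RTT {link.rtt_ms} ms exceeds {MAX_RTT_MS} ms")
    for name, value in (
        ("loss", link.loss_pct),
        ("duplication", link.duplication_pct),
        ("corruption", link.corruption_pct),
    ):
        if value > near_zero_pct:
            problems.append(f"{name} of {value}% is not near 0%")
    return problems
```

Because the text expects the latency figure to account for all jitter, the RTT passed in should be a worst-case measurement rather than an average.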
As previously stated, each Unified Intelligence Center node is independent of every other node,
and all database updates (that is, configuration data) are replicated to all
other nodes. When the Unified Intelligence Center cluster runs over the WAN,
this replication uses the WAN connection, so replication performance is a function
of the network characteristics. An object created on one node is
immediately available to users on that node, but it may take a few seconds before the
object replicates to other nodes (see the following for time estimates based on
object size). The only objects that are replicated are configuration objects.
Configuration objects include the following:
-
Data Sources: encapsulates configuration data for
each data source object
-
Dashboard: encapsulates configuration data for
each dashboard object
-
Report Definition Filter: encapsulates
configuration data for each report definition filter (Note: Running a report with an
initial or new filter causes replication of the filter object, but refreshing a
report has no impact on replication.)
-
Report Definitions: encapsulates report
definition configuration information
-
Reports: encapsulates a report object, which
points to a report definition and corresponding views
-
Report Types: encapsulates configuration
information for the type of report (grid, chart, gauge)
-
Views: encapsulates column names,
groupings, hidden fields
-
Users: encapsulates a user
-
Categories: encapsulates configuration information
for categorization of reports, report definitions, and dashboards
-
Value Lists: encapsulates configuration information
for the value list
-
Collections: encapsulates configuration
information for a collection that includes its values
-
DataSets (from Scheduled Reports):
encapsulates the data set for a scheduled report. All dataset information (report
data, not configuration data) is replicated to all subscriber nodes
In general, Cisco expects that most users configure the Unified Intelligence Center
system with a set of commonly accessed reports and run these reports
periodically. In addition, Cisco expects that users create new
objects (for example, reports), but less frequently. Any
configuration object that is modified or created must be replicated and
is replicated as quickly as possible. Because users are expected to
modify or create objects less often than they execute or access them,
replication should not require large amounts of WAN bandwidth.
Because every node requires a connection to every other node,
replication is estimated by assuming that each node receives a portion
or channel of the WAN bandwidth, which is a function of the number of nodes in
Site A and the number of nodes in Site B. More specifically, the WAN bandwidth requirements differ depending on how the cluster is
set up.
For more information, see Chapter 4: Bandwidth and Performance Recommendations.
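As a rough illustration of the channel idea, the sketch below divides the WAN bandwidth evenly among the node-to-node connections that cross the WAN. This is an assumption made for the example, not the sizing formula from Chapter 4: it supposes that every node in Site A connects to every node in Site B (giving `nodes_site_a * nodes_site_b` cross-site connections) and that each connection receives an equal share. The function name and parameters are hypothetical.

```python
def per_channel_bandwidth_kbps(wan_kbps: float,
                               nodes_site_a: int,
                               nodes_site_b: int) -> float:
    """Estimate the WAN bandwidth share available to one cross-site connection.

    Assumes an even split across all Site A to Site B node pairs; the real
    allocation depends on how the cluster is set up.
    """
    if nodes_site_a < 1 or nodes_site_b < 1:
        raise ValueError("each site needs at least one node")
    # Every node connects to every other node, so each node in Site A
    # has one connection over the WAN to each node in Site B.
    cross_site_connections = nodes_site_a * nodes_site_b
    return wan_kbps / cross_site_connections
```

For example, under these assumptions a 1600 kbps WAN link shared by two sites of four nodes each yields 100 kbps per cross-site connection.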
Organizing Sites
Because the maximum size of a cluster supported by Unified Intelligence Center is eight nodes, it
is impossible to have a fully redundant clustering solution unless each site is
at most four nodes. As qualified, each Unified Intelligence Center node supports up to 200 users under the
Reporting Standard Profile defined in Chapter 4.
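The limits above can be turned into a small planning helper. This is a sketch under stated assumptions: it takes the eight-node cluster limit, the four-node-per-site guideline, and the 200-users-per-node figure from the text, and it assumes that after the loss of one site only the nodes at the surviving site carry users. The function name and the keys of the returned dictionary are hypothetical.

```python
MAX_CLUSTER_NODES = 8      # maximum cluster size stated in the text
MAX_NODES_PER_SITE = 4     # per-site limit for a fully redundant layout
USERS_PER_NODE = 200       # qualified under the Reporting Standard Profile


def plan_sites(nodes_site_a: int, nodes_site_b: int) -> dict:
    """Summarize user capacity for a two-site cluster layout."""
    if nodes_site_a < 0 or nodes_site_b < 0:
        raise ValueError("node counts cannot be negative")
    total = nodes_site_a + nodes_site_b
    if total > MAX_CLUSTER_NODES:
        raise ValueError(f"a cluster supports at most {MAX_CLUSTER_NODES} nodes")
    return {
        "total_nodes": total,
        "total_users": total * USERS_PER_NODE,
        # Worst case: the larger site is lost and the smaller one remains.
        "worst_case_surviving_users": min(nodes_site_a, nodes_site_b) * USERS_PER_NODE,
        "each_site_within_four_nodes": max(nodes_site_a, nodes_site_b) <= MAX_NODES_PER_SITE,
    }
```

With four nodes at each site, the cluster supports 1600 users in total, and 800 users can still be served if either site is lost.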
In addition, only the Unified Intelligence Center Primary node
can provide the following services:
The Primary node must be located at the
customer's primary site (as these services will be unavailable during
a failover situation). Furthermore, it is crucial that you back up the Primary node periodically in either a LAN or WAN environment.
Failures
Each Unified Intelligence Center node buffers replication data to send to other nodes in the
cluster. When communication is lost with other nodes in the cluster (or a node
fails) then the data is queued until contact with the other node or nodes
is restored. Each node continues to work independently even during a
connectivity failure; it simply does not have access to objects created or
modified on other nodes. Even though the queue is quite large (1600 MB), it is
not unlimited and can therefore fill up during a
prolonged failure. As the buffer fills and starts to reach capacity, an alarm
is sent (CiscoAlarm30) notifying administrators of the potential buffer
exhaustion condition. If connectivity is restored before the buffer is filled,
then it synchronizes at a rate proportional to the amount of data in the
buffer and the connection bandwidth.
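The relationship between queue size, outage length, and catch-up time can be estimated with simple arithmetic. The sketch below is illustrative only. It uses the 1600 MB queue size from the text and assumes decimal units (1 MB = 8000 kilobits), a constant rate of replicated change, and no protocol overhead or new changes arriving while the queue drains. The function names and parameters are hypothetical.

```python
QUEUE_CAPACITY_MB = 1600.0   # replication queue size stated in the text
KILOBITS_PER_MB = 8 * 1000   # assumes decimal megabytes


def seconds_until_queue_full(change_rate_kbps: float,
                             queued_mb: float = 0.0) -> float:
    """Estimate how long an outage can last before the queue fills."""
    if change_rate_kbps <= 0:
        return float("inf")  # nothing is being queued
    remaining_kilobits = (QUEUE_CAPACITY_MB - queued_mb) * KILOBITS_PER_MB
    return remaining_kilobits / change_rate_kbps


def seconds_to_drain(queued_mb: float, channel_kbps: float) -> float:
    """Estimate catch-up time once connectivity is restored."""
    if channel_kbps <= 0:
        raise ValueError("channel bandwidth must be positive")
    return queued_mb * KILOBITS_PER_MB / channel_kbps
```

For instance, at a sustained 128 kbps of replicated changes an empty queue would take 100,000 seconds (about 28 hours) to fill, and a full queue would take 10,000 seconds to drain over a 1280 kbps channel.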
If connectivity is not restored before the buffer fills, then
replication is reset. Resetting replication allows the node to
continue running reports and working independently. If the node is a secondary
node then it requires full synchronization with the primary node (primary
database backup and restore on secondary node) when connectivity with the primary
node is restored. If replication is reset, then anything created or modified on the
secondary node is rolled back to the state of the primary database. If the
primary node fails, you must reinstall it and revert to a saved
backup. In WAN environments it is especially important to back up the Primary node
periodically so that no data is lost. The combination of a prolonged WAN outage, failure of the primary
node, and failure to perform the recommended backups could result in a cluster
that must be reinstalled, with possible data loss.
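The failure outcomes described in this section can be summarized as a decision function, seen from the point of view of a secondary node. This is a reading aid rather than product logic: the `Recovery` enum, its member names, and `recovery_action` are hypothetical, and the sketch covers only the cases the text describes.

```python
from enum import Enum


class Recovery(Enum):
    AUTOMATIC_REPLICATION = "queued data replicates automatically"
    FULL_SYNC_FROM_PRIMARY = "restore a primary database backup on the secondary"
    REINSTALL_PRIMARY_FROM_BACKUP = "reinstall the primary and revert to a saved backup"


def recovery_action(primary_failed: bool, buffer_filled: bool) -> Recovery:
    """Map a failure scenario to the recovery step described in the text."""
    if primary_failed:
        # A failed primary node must be reinstalled from its last backup.
        return Recovery.REINSTALL_PRIMARY_FROM_BACKUP
    if buffer_filled:
        # Replication was reset; changes made on the secondary are rolled
        # back to the state of the primary database.
        return Recovery.FULL_SYNC_FROM_PRIMARY
    # Connectivity returned before the queue filled.
    return Recovery.AUTOMATIC_REPLICATION
```

Only the last case avoids manual intervention, which is why regular backups of the Primary node matter most in WAN deployments.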
If connectivity is restored before the buffer fills, then all data
is automatically replicated. Depending upon the connection between the
nodes and the amount of data that has accumulated, it may take some time for
the nodes to fully synchronize. At any time, administrative users can
view the replication status (including the number of bytes in the replication
queue) using the CLI command:
utils dbreplication runtimestate.
For more information, see the CLI documentation.
Configuring WAN
After you install and configure all systems, restart each
system. This allows clustering to synchronize and to merge on
network partitions. This only needs to be done the first time after configuring a
new node.