Powered by Moogsoft AIOps 7.3
Cisco Crosswork Situation Manager 7.3.0 Supported Environments
Scale Your Cisco Crosswork Situation Manager Implementation
High Availability Configuration Hierarchy
High Availability for Third Party Component Dependencies
HA Control utility command reference
Install Cisco Crosswork Situation Manager
Pre-installation for Cisco Crosswork Situation Manager v7.3.x
System Setup for Cisco Crosswork Situation Manager
Control Cisco Crosswork Situation Manager Processes
Encrypt Database Communications
Configure External Authentication
Configure Historic Data Retention
Change passwords for default users
Upgrade Cisco Crosswork Situation Manager
RPM upgrade to Cisco Crosswork Situation Manager v7.3.x
Tarball upgrade to Cisco Crosswork Situation Manager v7.3.x
RPM - Upgrade database components
RPM - Upgrade data ingestion components
RPM - Migrate from MySQL to Percona
Tarball - Upgrade UI components
Tarball - Upgrade Core components
Tarball - Upgrade database components
Tarball - Upgrade data ingestion components
Tarball - Migrate from MySQL to Percona
Configuration Migration Utility
Finalize and validate the upgrade
Uninstall Cisco Crosswork Situation Manager
Stop Core Cisco Crosswork Situation Manager and Supporting Services
Uninstall Core Cisco Crosswork Situation Manager Packages and Remove Directories and Users
Uninstall Supporting Applications
Uninstall Remaining Packages and Remove Yum Repositories
Monitor and Troubleshoot Cisco Crosswork Situation Manager
Monitor Component CPU and Memory Usage
Monitor Moogfarmd Data Processing Performance
Monitor RabbitMQ Message Bus Performance
Monitor System Performance Metrics
Troubleshoot Installation and Upgrade
Troubleshoot Integrations Controller
Troubleshoot Required Services for a Functional Production System
Troubleshoot Slow Alert/Situation Creation
Obtaining Documentation and Submitting a Service Request
The Implementer Guide contains instructions to help you install and configure Cisco Crosswork Situation Manager.
To install the system and handle common post-installation setup, see Install Cisco Crosswork Situation Manager and System Setup for Cisco Crosswork Situation Manager.
After you have the base system up and running, you can begin to ingest event data from your monitoring sources. Integrations covers most integration topics. You can find details on common configuration tasks for data ingestion under Configure Data Ingestion.
Much of the value of Cisco Crosswork Situation Manager comes from its ability to process your raw event data, deduplicate the events, and transform the data into alerts that comprise Situations. It is critical to configure the system to create meaningful Situations and to present those Situations to the right teams. Understanding your Situation design needs helps you make the right data processing choices.
Based upon your Situation design choices and the type of data available from your monitoring sources, you can follow the Clustering Algorithm Guide to choose the correct clustering algorithms for your system. Then you have several options to Configure Data Processing to achieve your goals. See also the Administrator Guide.
To keep your system running and healthy, see Monitor and Troubleshoot Cisco Crosswork Situation Manager.
The following operating systems, browsers, and third-party software are either supported or required to run Cisco Crosswork Situation Manager.
Any operating systems and browsers not listed in the sections below are not officially recommended or supported.
You can run Cisco Crosswork Situation Manager on the following versions of Red Hat Enterprise Linux®(RHEL) and CentOS Linux:
OS | Versions
Red Hat Enterprise Linux (RHEL) | v7
CentOS Linux | v7
Note: No other Linux distributions are currently supported.
You can use the following browsers for the Cisco Crosswork Situation Manager UI:
Browser | Version
Google Chrome | Latest
Mozilla Firefox | Latest
Apple Safari | Latest
Microsoft Edge | Latest
Microsoft Internet Explorer | v11
The latest default installation of Cisco Crosswork Situation Manager comes with the following third-party applications:
Application | Version
Apache Tomcat® | v9.0.22
Elasticsearch | v6.8.1 (LTS version)
Percona | v5.7.26
Nginx | v1.14.0 or above
RabbitMQ | v3.7.4
Other supported application packages include:
Application | Version
Erlang | v20.1.7
JDK | OpenJDK 11.0.2.7-0.el7_6
Apache Tomcat® Native | v1.2.23 or above
The following table outlines the vendor supported integrations for the current version of Cisco Crosswork Situation Manager alongside the corresponding supported software versions.
Integrations support IPv6 connectivity.
Integration Version | Supported Software / Version
Ansible Tower Integration v1.10 | Ansible Tower v3.0, 3.1
Apache Kafka Integration v1.12 | Apache Kafka v0.9, 1.1, 2.2
AppDynamics Integration v2.2 | AppDynamics v4.0, 4.1
AWS CloudWatch Integration v2.0 | aws-java-sdk v1.11
AWS SNS Integration v1.2 | AWS SNS v2016-06-28
BMC Remedy Integration v1.8 | Remedy v9.1
CA UIM Integration v1.8 | CA Nimsoft UIM v8.4
CA Spectrum Integration v2.2 | CA Spectrum v10.2
Catchpoint Integration v1.0 | Catchpoint v2019
Cherwell Service Management Integration v1.5 | Cherwell v9.3
Datadog Polling Integration v1.3 | Datadog v2018
Datadog Webhook Integration v1.11 | Datadog v5.21
Dynatrace APM Plugin Integration v1.8 | Dynatrace v6.5, 7.0
Dynatrace APM Polling Integration v2.2 | Dynatrace v6.5, 7.0
Dynatrace Notification Integration v1.5 | Dynatrace v6.5
Dynatrace Synthetic Integration v1.12 | Dynatrace Synthetic v2017
Email Integration v2.5 | IMAP, IMAPS, POP3, POP3S
EMC Smarts Integration v1.3 | RabbitMQ v3.7.4 and Smarts v9.5
ExtraHop Integration v1.2 | ExtraHop v2018
FluentD Integration v1.10 | FluentD v0.12
Grafana Integration v1.2 | Grafana v5.2.4
HP NNMi Integration v2.5 | HP NNMi v10.2
HP OMi Plugin Integration v1.8 | HP OMi v10.1
HP OMi Polling Integration v2.5 | HP OMi v10.1
JIRA Service Desk Integration v1.10 | JIRA Service Desk v7.6
JIRA Software Integration v1.10 | JIRA Software v7, JIRA Cloud
JMS Integration v1.11 | ActiveMQ v5.14, JBoss v10, WebLogic v12.0
Microsoft Azure Integration v1.2 | Microsoft Azure Monitor v2018
Microsoft Azure Classic Integration v1.2 | Microsoft Azure Classic v2018
Microsoft SCOM Integration v2.6 | Microsoft SCOM v2012, 2016
Microsoft Teams Integration v1.0 | Microsoft Teams v1.2.00.3961
Nagios Integration v2.10 | Nagios vXI
New Relic Integration v1.10 | New Relic v2016
New Relic Polling Integration v2.0 | New Relic v2.3
New Relic Insights Polling Integration v1.0 | New Relic v2.3
Node.js Integration v1.9 | Node.js v1.6
NodeRED Integration v1.9 | Node RED v0.16, 0.17
OEM Integration v2.3 | Oracle Enterprise Manager v12c, 13c
Office 365 Email Integration v1.0 |
Pingdom Integration v1.9 | Pingdom v2017
Sensu Integration v1.0 | Sensu Core v1.8
ServiceNow Integration v4.3 | ServiceNow vNew York, Madrid, London, Kingston
SevOne Integration v1.5 | SevOne v5.7.2.0
Site24x7 Integration v1.0 | Site24x7 v17.4.3, 17.4.4
Slack Integration v1.7 | Slack v3.1
SolarWinds Integration v3.2 | SolarWinds v11.5, 12.2
Splunk Integration v2.5 | Splunk v6.5, 6.6, 7.0
Splunk Streaming Integration v1.0 | Splunk v7.2, 7.3
Sumo Logic Integration v1.1 | Sumo Logic v2018
VMware vCenter Integration v2.3 | VMware vCenter v6.0, 6.5
VMware vROps Integration v2.3 | VMware vROps v6.6
VMware vSphere Integration v2.4 | VMware vSphere v6.0, 6.5
VMware vRealize Log Insight Integration v2.4 | VMware vRealize Log Insight v4.3
WebSphere MQ Integration v1.12 | WebSphere MQ v8
xMatters Integration v1.6 | xMatters v5.5
Zabbix Integration v1.0 | Zabbix v3.4
Zabbix Polling Integration v3.4 | Zabbix v3.2
Zenoss Integration v2.4 | Zenoss v4.2
The sizing recommendations below are guidelines for small, medium and large Cisco Crosswork Situation Manager systems based on input data rate and volume.
In the context of this guide, Managed Devices (MDs) are all of the components in the network infrastructure that generate and emit events:
Small environment:
Environment | Hardware | File System
1000 to 5000 Managed Devices (MDs), fewer than 20 users, up to 5 integrations, fewer than 20 Alerts per second | 8 cores, 32GB RAM, 2 x 1GB Ethernet, physical or virtual server | 1 TB local or SAN. See Retention Policy.
Medium environment:
Environment | Hardware | File System
5000 to 20,000 MDs, between 20 and 40 users, between 6 and 10 integrations, between 20 and 100 Alerts per second | 16 cores, 64GB RAM, 2 x 1GB Ethernet, physical or virtual server | 1 TB local or SAN. See Retention Policy.
Large environment:
Environment | Hardware | File System
More than 20,000 MDs, more than 40 users, more than 10 integrations, more than 100 Alerts per second | 24+ cores, 128GB RAM, 2 x 1GB Ethernet, physical or virtual server | 1 TB local or SAN. See Retention Policy.
Virtualization Restrictions
Consider the following restrictions for virtual environments:
· Ideally, all Moog servers (guests) should be on the same compute node (host), sharing a hypervisor or virtual machine monitor. This minimizes latency between Moog guests.
· If servers are subject to automated resource balancing (for example, vMotion) and liable to move between compute nodes, then all Moog servers should be moved at the same time. If this is not possible, then Moog servers should be constrained to movements that minimize the resulting network distance.
· If Moog servers are distributed amongst compute nodes, then the network “distance” (logical hops) between the nodes should be minimized.
· Network latency between components may affect Event processing throughput. This is especially true between the core and database servers.
Shared Storage
On any shared compute platform Cisco makes the following recommendations:
· Increase the minimum resource requirements by at least 33% to account for shared resource usage and allocation.
· Storage latency reduces effective throughput at the core processing layer and should be minimized within the available constraints of a SAN.
· Cisco Crosswork Situation Manager should be treated as a highly transactional system and not placed on the same compute node as other highly transactional applications that may cause SAN resource contention.
· SAN port and array port contention should be minimized.
· Storage medium should be as fast as possible to minimize the transaction times to the database.
Retention Policy
You can calculate the amount of disk space in GB required for the database server using the following calculation:
(es x eps x d x 86,400) x 1.2 / 1,000,000
For this calculation: es = average event size in KB, eps = average events per second, d = number of days of retention, and 86,400 is the number of seconds per day.
For the majority of event sources, you can reasonably estimate a 2KB event size, although some sources, such as Microsoft SCOM, have larger than average events. A 2KB base takes account of the other event and alert based storage such as an alert's Situation membership and Situation Room thread sizes.
The average event rate is across all LAMs and integrations.
Note:
If you do not enable the Archiver tool, the historic database will grow indefinitely. See Archive Situations and Alerts for more information.
For example, the following calculation represents a 400 day retention period with an average event size of 2KB at 300 events per second:
(2 x 300 x 400 x 86,400) x 1.2 / 1,000,000 = 24,883.2 GB.
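The retention formula above can be scripted to size storage for your own event rate. The sketch below uses the example figures from this section (2KB events, 300 events per second, 400 days of retention); substitute your own values.

```shell
# Sketch: estimate database disk space in GB from the retention formula.
# es = average event size in KB, eps = average events per second,
# d = days of retention. 86400 is seconds per day; 1.2 adds 20% headroom.
es=2
eps=300
d=400
awk -v es="$es" -v eps="$eps" -v d="$d" \
    'BEGIN { printf "%.1f GB\n", (es * eps * d * 86400) * 1.2 / 1000000 }'
```

With these values the script prints 24883.2 GB, matching the worked example.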
To plan your Cisco Crosswork Situation Manager deployment, it helps to understand the different components of Moogsoft AIOps and the options for distributing them among multiple physical or virtual machines.
A server role within a Cisco Crosswork Situation Manager installation is a functional entity containing components that must be installed on the same machine. You can distribute different roles to different machines.
The following diagram illustrates the typical deployment strategy for the components of Cisco Crosswork Situation Manager in a highly available (HA) configuration:
The architecture is built upon two clusters with software components that serve several roles. See also HA Reference Architecture.
In the case of a single-server installation, you install all the roles on one machine.
The UI role comprises Nginx and Apache Tomcat, represented in the diagram as numbers 1 and 2. The Cisco Crosswork Situation Manager servlet groups run in active / active configuration.
Nginx is the proxy for the web application server and for integrations.
Tomcat is the web application server. It reads and writes to the Message Bus and the database.
Percona XtraDB Cluster serves the database role, represented in the diagram as numbers 3, 4, and 5. The cluster runs in active / active-standby / active-standby mode.
Percona XtraDB Cluster is the system datastore that handles transactional data from other parts of the system: LAMs (integrations), data processing, and the web application server.
HA Proxy handles database query routing and load balancing.
See Database Strategy for more information.
The Core role, represented by numbers 6 and 7 in the diagram, comprises the following:
· Moogfarmd, the Cisco Crosswork Situation Manager data processing component. Moogfarmd consumes messages from the Message Bus and processes event data in a series of servlet-like modules called Moolets. Moogfarmd reads and writes to the database and publishes messages to the bus.
· RabbitMQ, which provides the message queue. It receives published messages from integrations and publishes messages destined for data processing (Moogfarmd) and the web application server.
· Elasticsearch, which provides the UI search capability. It indexes documents from the indexer Moolet in the data processing series and returns search results to Tomcat.
In HA deployments, Moogfarmd automatically runs in active / passive mode.
In concert with the Redundancy role server, RabbitMQ and Elasticsearch run in active / active / active mode.
The redundancy role, represented by number 8 in the diagram, provides the third node required for true HA for RabbitMQ and Elasticsearch.
Link Access Modules (LAMs) make up the data ingestion role represented by numbers 9 and 10 in the diagram. Receiving LAMs listen for events from monitoring sources and Polling LAMs poll monitoring sources for events. Both parse and encode raw events into discrete events, and then write the discrete events to the Message Bus.
In HA deployments, receiving LAMs run in active / active mode, but polling LAMs run in active / passive mode.
The load balancers in front of the UI server role and the data ingestion server role are the customer's responsibility.
Cisco Crosswork Situation Manager supports several options to help you scale your implementation to meet your performance needs. See Monitor and Troubleshoot Cisco Crosswork Situation Manager for how to watch your system for signs that it is time to scale.
For information on the performance tuning capabilities of individual Cisco Crosswork Situation Manager components, see Monitor Component Performance.
Cisco Crosswork Situation Manager currently supports horizontal scaling at the integration (LAM) and visualization (Nginx + Tomcat) layers.
· You can add more LAMs, either on additional servers or on the same server, to achieve higher event rates. In this case, you have the option to configure event sources to send to the parallel LAMs separately or to implement a load balancer in front of the LAMs.
· You can add Nginx/Tomcat UI "stacks" behind a load balancer to increase performance for UI users. Adding UI stacks does not always provide better performance. It can degrade performance by adding more connection pressure to the database.
The following are typical horizontal scaling scenarios:
· You can add an additional LAM to process incoming events if you see that, despite attempts to tune the number of threads for an individual LAM, its event rate hits a plateau. This is a sign that the LAM is the bottleneck, so adding other instances of the LAM behind a load balancer will allow a higher event processing rate.
· You can add an additional UI stack if database pool diagnostics for Tomcat suggest that all or most of the database connections are constantly busy with long running connections, but the database itself is performing fine.
The data processing layer (moogfarmd) is not currently well suited to horizontal scaling. Moolets of the same type cannot currently share processing. Adding more Moolets like the AlertBuilder in an attempt to increase the event processing rate is likely to lead to database problems.
All Cisco Crosswork Situation Manager components ultimately benefit from being run on the best available hardware, but the data processing layer (Moogfarmd) benefits most from this approach. Depending on the number and complexity of Moolets in your configuration, you will see performance benefits in data processing on servers with the fastest CPUs, numerous cores, and a large amount of memory. This enables you to increase the number of threads for Moogfarmd to improve processing speed. You should also locate the database on the most powerful server feasible (clock speed, number of cores, and memory) with the largest, fastest disk.
In some cases you can distribute Cisco Crosswork Situation Manager components among different hosts to gain performance, because doing so reduces resource contention on a single server. The most common distribution is to install the database on a separate server, ideally within the same fast network to minimize the risk of latency. An additional benefit of this move is that it allows you to run a clustered or master/slave database for redundancy.
Another common distribution is to install the UI stack (Nginx and Tomcat) on a separate server within the same fast network.
Some integrations (LAMs) benefit from being closer to the event source, so they are candidates for distribution.
See Server Roles and Distributed HA Installation for more information.
Cisco Crosswork Situation Manager supports high availability (HA) architectures to improve the fault tolerance of Cisco Crosswork Situation Manager. Each component supports a multi-node architecture to enable redundancy, failover, or both, to minimize the risk of data loss, for example in the case of a hardware failure.
This topic covers the architectures you can use to achieve HA with Cisco Crosswork Situation Manager. For an example of how to set up a single site HA system, see Distributed HA Installation. See HA Reference Architecture for a detailed diagram of the components in a single site HA configuration.
Cisco Crosswork Situation Manager supports high availability in distributed architectures where different machines host a subset of the stack. You can run one or more of the server roles on its own machine.
See Server Roles for details of the HA architecture server roles in Cisco Crosswork Situation Manager.
If you run more than one server role on a machine, choose a primary role for the server. The primary role dictates which additional roles are supported on the machine as follows:
Primary Role | Supported Secondary Roles
Core | UI, Data Ingestion, and Database
UI | Data Ingestion
Data Ingestion | UI
Database | Redundancy
Redundancy | Database
See Scale Your Cisco Crosswork Situation Manager Implementation for information on how to increase capacity within the HA architecture.
Contact your Cisco technical representative to discuss scaling your deployment.
See Sizing Recommendations for more information on hardware sizes and capacity.
After you decide on the best HA architecture for your environment, you can plan your implementation.
Cisco Crosswork Situation Manager provides support for automatic failover between the two nodes within an HA pair, for example from one instance of Moogfarmd to another, or from one instance of a LAM to another. However, there is no automatic failover between multiple HA pairs. For example, there is no failover from a primary site to a second site, such as a disaster recovery replica.
Cisco Crosswork Situation Manager does not support automated fail-back for any architecture. For example, consider an HA pair of Moogfarmd instances. When the instance of Moogfarmd in cluster 1 becomes unavailable, the instance in cluster 2 enters an active state. When the instance from cluster 1 recovers and becomes available, the instance in cluster 2 remains active.
Cisco Crosswork Situation Manager deployments use a tiered hierarchy of clusters, groups, and instances to achieve High Availability.
A cluster is a collection of Cisco Crosswork Situation Manager components. To achieve HA, you need at least two clusters that include all the Cisco Crosswork Situation Manager components, plus an additional third machine for the message queue and search components.
A group comprises a single component or two identical components that provide resilience over two or more clusters. Cisco Crosswork Situation Manager automatically controls the active or passive behavior and failover of the instances within a group.
An example of a group is a Socket LAM configured for the same source in two separate clusters. Other groups include the following:
· Servlets for the UI.
· Moogfarmd for data processing.
· Individual LAMs for data ingestion, for example the REST LAM.
An instance is an individual component running within a group. Each instance in a group provides resilience for the other instance. For example, the primary instance of a Socket LAM pairs with a secondary instance in the second cluster to make a group.
The diagram in this topic represents a Cisco Crosswork Situation Manager High Availability deployment to a single site: one datacenter, LAN, or availability zone. To support this architecture, all servers must have sufficient connection speed amongst themselves so that latency between hosts does not exceed 5 ms.
All Cisco Crosswork Situation Manager components have their own HA mechanism that provides failover capabilities, but it is also a best practice to use one or more load balancers. You can use either software or hardware load balancers with the following requirements and recommendations:
· Load balancers must use TCP.
· You must implement health checks using your preferred approach to remove unhealthy servers from the cluster.
· The load balancer should provide load balancing capabilities and a VIP for each server role. For example: one UI VIP per site, one LAM VIP per site.
· Sticky sessions are recommended.
· You can choose your preferred load balancing approach. For example, round robin or least-connection.
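As an illustration of these requirements, a TCP-mode load balancer fragment for the UI role VIP might look like the following HAProxy sketch. All names and addresses here are hypothetical examples, and your load balancer product, health checks, and stickiness mechanism may differ.

```
# Hypothetical haproxy.cfg fragment for a UI role VIP: TCP mode,
# health checks, and source-IP stickiness. Addresses are examples only.
frontend ui_vip
    bind 10.0.0.100:443
    mode tcp
    default_backend ui_nodes

backend ui_nodes
    mode tcp
    balance leastconn                         # or roundrobin, per preference
    stick-table type ip size 200k expire 30m  # sticky sessions by source IP
    stick on src
    server ui1 10.0.0.11:443 check            # health check removes dead nodes
    server ui2 10.0.0.12:443 check
```

A similar fragment would be repeated per server role, for example one LAM VIP per site.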
The Cisco Crosswork Situation Manager UI comprises the following components:
· Nginx: The web server that provides static UI content and acts as a proxy for the application server. For HA deployments, install a minimum of two Nginx instances on separate servers and optionally cluster the Nginx instances.
· Apache Tomcat: The web server that provides servlet and API support. For HA deployments, install a minimum of two Tomcat instances on separate servers and optionally cluster the instances.
The UI components run in active/active configuration, so configure servlet instances to run in separate groups.
Required Ports: 80, 443
Cisco Crosswork Situation Manager uses Percona XtraDB as the system database. HA requires a minimum of three server nodes configured in each cluster with latency between them not exceeding 5 ms.
Required Ports: 3306
Cisco Crosswork Situation Manager uses Elasticsearch to store active alert and Situation data to provide search functionality within the product. For HA deployments, install a cluster of at least three data servers with one active master server.
Required Ports: 9200, 9300
Moogfarmd is the core data processing application that controls all other services in Cisco Crosswork Situation Manager. It manages the clustering algorithms and other applets (Moolets) that run as part of the system. For HA deployments, install a minimum of two Moogfarmd services on separate servers. Moogfarmd can only run as a two-instance group in an active/passive mode.
Required Ports: 5701, 8901 for Hazelcast: the in-memory data grid that provides fault tolerance.
Cisco Crosswork Situation Manager uses RabbitMQ as the system Message Bus. It requires a minimum of three servers for HA. RabbitMQ relies on its native clustering functionality and mirrored queues to handle failover; it does not use the Cisco Crosswork Situation Manager load balancing feature.
Required Ports: 5672, 4369, 15672
Cisco Crosswork Situation Manager uses the following types of Link Access Modules (LAMs) to ingest data:
· Polling LAMs that periodically connect to a data source using an integration API to collect event data.
· Receiving LAMs that provide an endpoint for data sources to post event data.
· For HA deployments:
· Install two instances of each LAM. When both instances are in the same group, they run in active/passive mode.
· For LAMs deployed over an unreliable link such as a WAN, or across data centers, you should deploy a caching LAM strategy that includes a database and message queue on the LAM Servers.
· You can load balance receiving LAMs and configure them as active/active to increase capacity.
The Cisco Crosswork Situation Manager HA architecture provides increased resilience against LAM and server restarts by caching ingested data to the disk. It requires installing a local RabbitMQ cluster which is used by LAMs for publishing.
A remote caching LAM, located next to the Core role, connects to the local RabbitMQ cluster, picks the events from the queue and publishes them to the central RabbitMQ cluster for Moogfarmd to process.
If no caching LAM is available to consume the events from the local RabbitMQ cluster, the data is cached to disk until the server runs out of memory.
This architecture is recommended for hybrid installations, where the core processing is located in the cloud and LAMs are on-premise, or for a full on-premise configuration where LAMs are housed remotely to the core components.
Polling LAMs run in an active / passive mode and must connect to a local database in order to negotiate their state. This requires a local MySQL instance that runs with master / master replication.
If you are setting up the Store and Forward architecture, perform the following steps:
· Set up the LAM 1 and 2 roles (see Install with Caching LAM)
· Set up the Caching LAM 1 and 2 roles (see Caching LAM)
Otherwise, perform the standard installation steps:
· Set up the LAM 1 and 2 roles (see Install without Caching LAM)
You can configure Cisco Crosswork Situation Manager dependencies such as Percona XtraDB Cluster, Elasticsearch, RabbitMQ, and Grafana to work effectively in highly available deployments.
See High Availability for details on high availability deployments of Cisco Crosswork Situation Manager and deployment scenarios.
For information on Percona XtraDB Cluster in Cisco Crosswork Situation Manager, see Database Strategy. For an example configuration, see Set Up the Database for HA. For further information, refer to the Percona XtraDB Cluster documentation.
You can improve the performance and reliability of your Cisco Crosswork Situation Manager deployment by:
· Distributing your RabbitMQ brokers on different hosts.
· Clustering your multiple RabbitMQ brokers.
· Mirroring your message queues across multiple nodes.
See Set Up the Core Role for HA and Set Up the Redundancy Server Role for an example configuration. For more information, see Message System Deployment. Refer to the RabbitMQ documentation on Clustering and Mirrored Queues for more details.
There are different ways to configure Elasticsearch for distributed installations. See Set Up the Core Role for HA and Set Up the Redundancy Server Role for an example configuration.
Refer to the Elasticsearch documentation on Clustering for more details.
To set up Grafana for distributed installations, configure each Grafana instance to connect to a Cisco Crosswork Situation Manager UI load balancer, such as HAProxy, rather than to the Cisco Crosswork Situation Manager UI stack.
Alternatively, you can point it at the Apache Tomcat server or Nginx server. Refer to the Grafana documentation on Setting Up Grafana for High Availability.
The Cisco Crosswork Situation Manager HA Control utility, ha_cntl, is a command line utility to:
· Control instance, process group, or cluster failover. For example, to switch from passive to active mode.
· View the current status of all clusters, process groups, and instances. See High Availability Configuration Hierarchy for more information.
Normally you should configure groups in HA to use automatic failover in production. Use the HA Control utility to check the status of the HA system or to initiate failover in non-production scenarios.
ha_cntl [ --activate cluster[.group[.instance]] | --deactivate cluster[.group[.instance]] | --diagnostics cluster[.group[.instance]] [ --assumeyes ] | --view ] [ --loglevel (INFO|WARN|ALL) ] [ --time_out <seconds> ] | --help
Argument | Input | Description
-a, --activate | String <cluster[.group[.instance_name]]> | Activate all groups within a cluster, a specific group within a cluster, or a single instance.
-d, --deactivate | String <cluster[.group[.instance_name]]> | Deactivate all groups within a cluster, a specific group within a cluster, or a single instance.
-i, --diagnostics | String <arg> | Print additional diagnostics, where available, to the process log file.
-l, --loglevel | String, one of INFO, WARN, or ALL | Log level controlling the amount of information logged by the utility.
-t, --time_out | String <number of seconds> | Amount of time in seconds to wait for the last answer. Defaults to 2.
-v, --view | - | View the current status of all instances, process groups, and clusters.
-y, --assumeyes | - | Answer "yes" to all prompts.
$MOOGSOFT_HOME/bin/ha_cntl -v
Getting system status
Cluster: [SECONDARY] passive
Process Group: [UI] Passive (no leader - all can be active)
Instance: [servlets] Passive
Component: moogpoller - not running
Component: moogsvr - not running
Component: toolrunner - not running
Process Group: [moog_farmd] Passive (only leader should be active)
Instance: FARM Passive Leader
Moolet: AlertBuilder - not running (will run on activation)
Moolet: AlertRulesEngine - not running (will run on activation)
Moolet: Cookbook - not running (will run on activation)
Moolet: Speedbird - not running (will run on activation)
Moolet: TemplateMatcher - not running
Process Group: [rest_lam] Passive (no leader - all can be active)
Instance: REST2 Passive
Process Group: [socket_lam] Passive (only leader should be active)
Instance: SOCK2 Passive Leader
Cluster: [PRIMARY] active
Process Group: [UI] Active (no leader - all can be active)
Instance: [servlets] Active
Component: moogpoller - running
Component: moogsvr - running
Component: toolrunner - running
Process Group: [moog_farmd] Active (only leader should be active)
Instance: FARM Active Leader
Moolet: AlertBuilder - running
Moolet: AlertRulesEngine - running
Moolet: Cookbook - running
Moolet: Default Cookbook - running
Moolet: Speedbird - running
Moolet: TemplateMatcher - not running
Process Group: [rest_lam] Active (no leader - all can be active)
Instance: REST1 Active
Process Group: [socket_lam] Active (only leader should be active)
Instance: SOCK1 Active Leader
Use this guide to learn how to install Cisco Crosswork Situation Manager:
If you are installing another version, see Welcome to the Cisco Docs! for more information. Refer to the following topics to help choose the right environment for your Cisco Crosswork Situation Manager deployment:
· The Cisco Crosswork Situation Manager 7.3.0 Supported Environments topic details supported operating systems and system requirements.
· The Sizing Recommendations topic helps you select hardware that supports your data ingestion and user requirements.
If you are upgrading Cisco Crosswork Situation Manager, see Upgrade Cisco Crosswork Situation Manager.
You have the option to install all Cisco Crosswork Situation Manager packages on a single machine. However, the modular Cisco Crosswork Situation Manager distribution has few dependencies between individual packages, so you have the flexibility to install different components on different machines. See Server Roles for a description of how you can distribute the components across multiple machines.
· For smaller deployments, you can run all the components on a single machine.
— If you have root access to the machine and want to use Yum to install, see v7.3.x - RPM installation.
— If you do not have root access to the machine where you are installing and you want more control over where you install Cisco Crosswork Situation Manager, see v7.3.x - Tarball installation.
· For most production deployments, install different components on different machines to distribute the workload. See High Availability Overview for more information.
Before you start to install Cisco Crosswork Situation Manager v7.3.x, you must perform certain pre-installation tasks.
The instructions to follow depend on your preferred mode of deployment:
· RPM: Use this method if you have root access to your Cisco Crosswork Situation Manager server(s) and you do not want to change the default installation locations.
· Tarball: Use this method if you need to run the process as a non-root user, or you want the ability to deploy to a non-default location and install all components under one directory.
The Tarball installer is hosted on the Cisco "speedy" Yum repository: https://speedy.moogsoft.com/installer/. Contact Cisco Support for access if you do not already have an account.
· Use the Offline RPM instructions if you have root access but your Cisco Crosswork Situation Manager server(s) do not have access to the internet.
For pre-installation instructions, refer to one of the following topics:
· RPM pre-installation for 7.3.x
· Tarball pre-installation for 7.3.x
· Offline RPM pre-installation for 7.3.x
You must perform certain preparatory tasks before you install Cisco Crosswork Situation Manager v7.3.x.
Follow these steps if you have root access to the machine or machines on which you will install Cisco Crosswork Situation Manager, and you can connect to Yum repositories outside your network from those machines.
For Offline RPM pre-installation steps, see v7.3.x - Offline RPM pre-installation steps.
For Tarball pre-installation steps, see v7.3.x - Tarball pre-installation steps.
Before you begin to prepare for the installation, verify the following:
· You have root access to the system where you plan to install Cisco Crosswork Situation Manager.
· You have credentials to connect to the Cisco "speedy" Yum repository.
· You are familiar with the supported versions of third party software, as outlined in Cisco Crosswork Situation Manager 7.3.0 Supported Environments.
Complete the following steps before you perform an RPM installation of Cisco Crosswork Situation Manager v7.3.x:
1. Create the Cisco Crosswork Situation Manager Yum repository as a new file /etc/yum.repos.d/aiops.repo with the following contents. Replace the login and password in the baseurl property with your Cisco "speedy" Yum repository credentials.
[moogsoft-aiops]
name=moogsoft-aiops
baseurl=https://<username>:<password>@speedy.moogsoft.com/repo/aiops/esr
enabled=1
gpgcheck=0
sslverify=0
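Rather than hand-editing credentials into the file, the repo definition can be generated from environment variables; a minimal sketch, writing to /tmp for illustration (on a real system the target is /etc/yum.repos.d/aiops.repo, and SPEEDY_USER / SPEEDY_PASS are placeholders for your "speedy" credentials):

```shell
# Generate the aiops.repo definition from environment variables.
SPEEDY_USER="${SPEEDY_USER:-myuser}"
SPEEDY_PASS="${SPEEDY_PASS:-mypass}"
REPO_FILE=/tmp/aiops.repo   # use /etc/yum.repos.d/aiops.repo on a real system

cat > "$REPO_FILE" << EOF
[moogsoft-aiops]
name=moogsoft-aiops
baseurl=https://${SPEEDY_USER}:${SPEEDY_PASS}@speedy.moogsoft.com/repo/aiops/esr
enabled=1
gpgcheck=0
sslverify=0
EOF
echo "Wrote $REPO_FILE"
```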
2. Optional: GPG key validation of the RPMs
To validate the RPMs before installation, follow these steps:
a. Download the key from this site:
https://keys.openpgp.org/vks/v1/by-fingerprint/2529C94A49E42429EDAAADAEC7A2253BFC50512A
b. Copy the key (an .asc file) to the server on which the RPMs or tarball will be installed.
c. Import the key:
gpg --import 2529C94A49E42429EDAAADAEC7A2253BFC50512A.asc
d. Download all the '7.3.0' RPMs and .sig files from the speedy yum repository using a browser, providing speedy credentials when asked by the browser:
https://<speedyusername>:<speedypassword>@speedy.moogsoft.com/repo/aiops/esr/x86_64
e. Move the RPMs and .sig files into the same folder. For example, /tmp, as used in the example below.
f. Copy the following code into a bash terminal and run it to perform the validation:
while read RPM
do
echo "Current RPM: $RPM"
gpg --verify ${RPM}.sig ${RPM} 2>&1
done < <(find /tmp -name '*.rpm');
g. Confirm that all the commands for each RPM report:
Good signature from "Moogsoft Information Security Team <security@moogsoft.com>"
h. You can now remove the RPMs and .sig files. Yum will download the packages from the online repository for the actual installation.
3. Create an Elasticsearch Yum repository as a new file /etc/yum.repos.d/elasticsearch.repo with the following contents:
[elasticsearch-6.x]
name=Elasticsearch repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
4. Install the RabbitMQ Erlang el7 package. For example:
yum -y install https://github.com/rabbitmq/erlang-rpm/releases/download/v20.1.7/erlang-20.1.7-1.el7.centos.x86_64.rpm
Alternatively you can find the file at https://github.com/rabbitmq/erlang-rpm/releases/tag/v20.1.7.
5. Install the RabbitMQ Yum repository. For example:
curl -s https://packagecloud.io/install/repositories/rabbitmq/rabbitmq-server/script.rpm.sh | sudo bash
Verify that the /etc/yum.repos.d/rabbitmq_rabbitmq-server.repo file has been created.
6. Install the Elasticsearch public key. For example:
rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
7. Create a Bash script named create_nginx_repo.sh with the following contents:
#!/bin/bash
echo '[nginx]' > /etc/yum.repos.d/nginx.repo
echo 'name=nginx repo' >> /etc/yum.repos.d/nginx.repo
echo 'baseurl=http://nginx.org/packages/OS/OSRELEASE/$basearch/' >> /etc/yum.repos.d/nginx.repo
echo 'gpgcheck=0' >> /etc/yum.repos.d/nginx.repo
echo 'enabled=1' >> /etc/yum.repos.d/nginx.repo
OS_VERSION=$(cat /etc/system-release)
case "$OS_VERSION" in
CentOS*release\ 7* )
sed -i -e 's/OS/centos/' -e 's/OSRELEASE/7/' /etc/yum.repos.d/nginx.repo;;
Red\ Hat*release\ 7* )
sed -i -e 's/OS/rhel/' -e 's/OSRELEASE/7/' /etc/yum.repos.d/nginx.repo;;
esac
8. Execute the create_nginx_repo.sh script to create the Nginx Yum repo. For example:
bash create_nginx_repo.sh
9. Refresh the local Yum repo cache and verify that the NSS and OpenSSL packages are up to date on your system. For example:
yum clean all
yum -y update nss openssl
10. Install Java 11:
yum -y install java-11-openjdk-headless-11.0.2.7 java-11-openjdk-11.0.2.7 java-11-openjdk-devel-11.0.2.7
11. Install the Extra Packages for Enterprise Linux (EPEL) Yum repository and enable the optional packages:
yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
Verify that the /etc/yum.repos.d/epel.repo file has been created.
12. If the Operating System is RHEL, enable the 'extras' repos:
subscription-manager repos --enable "rhel-*-optional-rpms" --enable "rhel-*-extras-rpms" --enable "rhel-ha-for-rhel-*-server-rpms"
13. Install Percona on all servers that will house a database node. The script configures multiple nodes to run as a cluster; a single node is also supported. Run this command on an internet-connected host to download the installer. If this host does not have internet access, download the script on a different host and copy it to this host. Substitute your "speedy" Yum repo user credentials:
cat > aiops_repo.sh << _EOF_
#!/bin/bash
clear
echo "Please provide access credentials for the 'speedy' yum repository in order to download the Percona setup script"
echo
read -p "AIOps Repository Username: " AIOPS_USER
export AIOPS_USER
read -p "AIOps Repository Password: " -s AIOPS_PASS
export AIOPS_PASS
curl -L -O https://\$AIOPS_USER:\$AIOPS_PASS@speedy.moogsoft.com/repo/aiops/install_percona_nodes.sh 2>/dev/null
echo
_EOF_
bash aiops_repo.sh;
Now run the script:
bash install_percona_nodes.sh
The script guides you through the installation process. To configure a single database node on the same server as Cisco Crosswork Situation Manager, use these settings:
— Configure Percona as "Primary".
— Do not set the server to "DB only".
— Set the first database node IP address to the server IP address.
— When prompted to enter the IP addresses of the second and third nodes, press Enter to skip these settings.
14. Set SELinux to permissive mode or disable it completely. For example, to set SELinux to permissive mode:
setenforce 0
If you want to disable SELinux at boot time, edit the file /etc/sysconfig/selinux.
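The persistent change is the SELINUX= line in that file; a sketch of the edit, demonstrated on a scratch copy so the live config is untouched (on a real system, run the sed against /etc/sysconfig/selinux as root):

```shell
# Demonstrate the persistent SELinux mode change on a scratch copy.
conf=/tmp/selinux.conf
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$conf"

# Switch to permissive mode at boot; use SELINUX=disabled to disable entirely.
sed -i 's/^SELINUX=.*/SELINUX=permissive/' "$conf"
grep '^SELINUX=' "$conf"   # prints: SELINUX=permissive
```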
After you have completed these steps, proceed with your installation or upgrade. See Upgrade Cisco Crosswork Situation Manager for the instructions relevant to your deployment.
You must perform certain preparatory tasks before you install Cisco Crosswork Situation Manager v7.3.x.
Follow these steps if you have root access to the machine or machines on which you will install or upgrade Cisco Crosswork Situation Manager, but you cannot connect to Yum repositories outside your network from those machines.
For RPM pre-installation steps, see v7.3.x - RPM pre-installation steps.
For Tarball pre-installation steps, see v7.3.x - Tarball pre-installation steps.
Before you begin to prepare for the installation, verify the following:
· You have root access to the system where you plan to install Cisco Crosswork Situation Manager.
· You are familiar with the supported versions of third party software, as outlined in Cisco Crosswork Situation Manager 7.3.0 Supported Environments.
Complete the following steps before you perform an offline RPM installation of Cisco Crosswork Situation Manager v7.3.x:
1. Download the two archives required for the offline installation, using the following links:
— The BASE repository containing the dependent packages to install for RHEL/CentOS 7:
https://speedy.moogsoft.com/offline/aiops/2019-10-03-1570143328-MoogsoftBASE7_offline_repo.tar.gz
— The ESR repository containing the standard RPMs and ancillary packages (Apache Tomcat, RabbitMQ, JRE, etc):
2. Copy the downloaded Tarball files to your offline system.
3. Download the Percona and dependency packages using cURL on an internet-connected host:
curl -L -O http://repo.percona.com/percona/yum/release/7/RPMS/x86_64/Percona-XtraDB-Cluster-shared-57-5.7.26-31.37.1.el7.x86_64.rpm;
curl -L -O http://repo.percona.com/percona/yum/release/7/RPMS/x86_64/Percona-XtraDB-Cluster-client-57-5.7.26-31.37.1.el7.x86_64.rpm;
curl -L -O http://repo.percona.com/percona/yum/release/7/RPMS/x86_64/Percona-XtraDB-Cluster-server-57-5.7.26-31.37.1.el7.x86_64.rpm;
curl -L -O http://repo.percona.com/percona/yum/release/7/RPMS/x86_64/Percona-XtraDB-Cluster-shared-compat-57-5.7.26-31.37.1.el7.x86_64.rpm;
curl -L -O http://repo.percona.com/percona/yum/release/7/RPMS/x86_64/percona-xtrabackup-24-2.4.15-1.el7.x86_64.rpm;
Follow these steps to create local Yum repositories to house the installation packages. If you are running a distributed installation, perform these steps on each machine that will run Cisco Crosswork Situation Manager components.
1. Create two directories to house the repositories. For example:
sudo mkdir -p /media/localRPM/BASE/
sudo mkdir -p /media/localRPM/ESR/
2. Extract the two Tarball files into separate directories. For example:
tar xzf *-MoogsoftBASE7_offline_repo.tar.gz -C /media/localRPM/BASE/
tar xzf *-MoogsoftESR_7.3.0_offline_repo.tar.gz -C /media/localRPM/ESR/
3. Back up the existing /etc/yum.repos.d directory. For example:
mv /etc/yum.repos.d /etc/yum.repos.d-backup
4. Create an empty /etc/yum.repos.d directory. For example:
mkdir /etc/yum.repos.d
5. Create a local.repo file ready to contain the local repository details:
vi /etc/yum.repos.d/local.repo
6. Edit local.repo and configure the baseurl paths for BASE and ESR to point to your directories. For example:
[BASE]
name=MoogCentOS-$releasever - MoogRPM
baseurl=file:///media/localRPM/BASE/RHEL
gpgcheck=0
enabled=1
[ESR]
name=MoogCentOS-$releasever - MoogRPM
baseurl=file:///media/localRPM/ESR/RHEL
gpgcheck=0
enabled=1
7. Clean the Yum cache:
yum clean all
8. Verify that Yum can detect the newly created repositories. For example:
yum info "moogsoft-*"
Available Packages
Arch : x86_64
Version : 7.3.0
Release : XYZ
Size : 76 M
Repo : ESR
Summary : Algorithmic Intelligence for IT Operations
URL : https://www.moogsoft.com
License : Proprietary
Description : Moogsoft AIOps (7.3.0) - Build: XYZ - (Revision: XYZ)
The results should include the following packages:
Name : moogsoft-db
Name : moogsoft-integrations
Name : moogsoft-integrations-ui
Name : moogsoft-mooms
Name : moogsoft-search
Name : moogsoft-server
Name : moogsoft-ui
Name : moogsoft-utils
Name : moogsoft-common
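This package check can be scripted; a minimal sketch that compares the expected package names against captured `yum info` output (the sample file stands in for the live command output and deliberately omits one package so the check reports it):

```shell
# Compare expected moogsoft packages against captured `yum info "moogsoft-*"` output.
# /tmp/yum_info.txt stands in for the live output; moogsoft-ui is omitted on purpose.
cat > /tmp/yum_info.txt << 'EOF'
Name : moogsoft-db
Name : moogsoft-integrations
Name : moogsoft-server
EOF

missing=""
for pkg in moogsoft-db moogsoft-integrations moogsoft-server moogsoft-ui; do
  grep -q "Name : $pkg\$" /tmp/yum_info.txt || missing="$missing $pkg"
done
if [ -n "$missing" ]; then
  echo "Missing packages:$missing"   # prints: Missing packages: moogsoft-ui
else
  echo "All expected packages present"
fi
```

Extend the `for` list with the full set of package names shown above for a complete check.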
9. Install Percona on all servers that will house a database node. The script configures multiple nodes to run as a cluster; a single node is also supported. Run this command on an internet-connected host to download the installer. If this host does not have internet access, download the script on a different host and copy it to this host.
Ensure that the Percona RPMs (detailed above) are in the current directory on each host where Percona will be installed, and that the offline Yum repository has been deployed.
Substitute your "speedy" Yum repo user credentials:
cat > aiops_repo.sh << _EOF_
#!/bin/bash
clear
echo "Please provide access credentials for the 'speedy' yum repository in order to download the Percona setup script"
echo
read -p "AIOps Repository Username: " AIOPS_USER
export AIOPS_USER
read -p "AIOps Repository Password: " -s AIOPS_PASS
export AIOPS_PASS
curl -L -O https://\$AIOPS_USER:\$AIOPS_PASS@speedy.moogsoft.com/repo/aiops/install_percona_nodes.sh 2>/dev/null
echo
_EOF_
bash aiops_repo.sh;
Now run the script:
bash install_percona_nodes.sh;
The script guides you through the installation process. To configure a single database node on the same server as Cisco Crosswork Situation Manager, use these settings:
— Configure Percona as "Primary".
— Do not set the server to "DB only".
— Set the first database node IP address to the server IP address.
— When prompted to enter the IP addresses of the second and third nodes, press Enter to skip these settings.
10. Install Java 11:
yum -y install java-11-openjdk-headless-11.0.2.7 java-11-openjdk-11.0.2.7 java-11-openjdk-devel-11.0.2.7
11. Set SELinux to permissive mode or disable it completely. For example, to set SELinux to permissive mode:
setenforce 0
If you want to disable SELinux at boot time, edit the file /etc/sysconfig/selinux.
12. Optional: GPG key validation of the RPMs
To validate the RPMs before installation, follow these steps:
a. Download the key from this site:
https://keys.openpgp.org/vks/v1/by-fingerprint/2529C94A49E42429EDAAADAEC7A2253BFC50512A
b. Copy the key (an .asc file) to the server on which the RPMs or tarball will be installed.
c. Import the key:
gpg --import 2529C94A49E42429EDAAADAEC7A2253BFC50512A.asc
d. Download all the '7.3.0' RPMs and .sig files from the speedy yum repository using a browser, providing speedy credentials when asked by the browser:
https://<speedyusername>:<speedypassword>@speedy.moogsoft.com/repo/aiops/esr/x86_64
e. Move the RPMs and .sig files into the same folder. For example, /tmp, as used in the example below.
f. Copy the following code into a bash terminal and run it to perform the validation:
while read RPM
do
echo "Current RPM: $RPM"
gpg --verify ${RPM}.sig ${RPM} 2>&1
done < <(find /media/localRPM/ESR/RHEL/ -name '*.rpm');
g. Confirm that all the commands for each RPM report:
Good signature from "Moogsoft Information Security Team <security@moogsoft.com>"
Your local Yum repositories are now ready. Proceed with your offline installation or upgrade. See Upgrade Cisco Crosswork Situation Manager for the instructions relevant to your deployment.
You must perform certain preparatory tasks before you install Cisco Crosswork Situation Manager v7.3.x.
Follow these steps if you do not have root access to the machine or machines on which you will install Cisco Crosswork Situation Manager.
For RPM pre-installation steps, see v7.3.x - RPM pre-installation steps.
For Offline RPM pre-installation steps, see v7.3.x - Offline RPM pre-installation steps.
Before you begin to prepare for the installation, verify the following:
· You have a CentOS 7 / RHEL 7 system on which to install Cisco Crosswork Situation Manager.
· You have removed any existing environment variables such as $MOOGSOFT_HOME from previous installations.
· You have identified the Linux user you will use to perform the installation.
· Optional: Ask an administrator to set the ulimit maximum for open files and max user processes for the installation user. This requires root privileges. For example, on a busy system you could increase both to 65535.
· You have selected a working directory in which to run the installation. The installation directory requires a minimum of 7 GB (more if you store the Percona database in the installation directory) to allow for database and Elasticsearch artifacts and log file growth.
· You have credentials to connect to the Cisco "speedy" Yum repository.
· You are familiar with the supported versions of third party software, as outlined in Cisco Crosswork Situation Manager 7.3.0 Supported Environments.
· Ports 8443 and 8080 are open on your server.
· You are running OpenSSL v1.0.2 or later.
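Some of these prerequisites can be verified up front; a minimal pre-flight sketch (the 7 GB threshold and variable names follow the list above; WORKDIR is a placeholder for your chosen working directory):

```shell
# Pre-flight checks for a tarball install: leftover env var and free disk space.
WORKDIR="${WORKDIR:-$HOME}"

# 1. A leftover $MOOGSOFT_HOME from a previous install should be removed first.
if [ -n "$MOOGSOFT_HOME" ]; then
  echo "WARN: MOOGSOFT_HOME is already set to $MOOGSOFT_HOME"
else
  echo "OK: MOOGSOFT_HOME is not set"
fi

# 2. The working directory needs at least 7 GB free (7*1024*1024 KB).
free_kb=$(df -Pk "$WORKDIR" | awk 'NR==2 {print $4}')
if [ "$free_kb" -ge $((7 * 1024 * 1024)) ]; then
  echo "OK: $WORKDIR has enough free space"
else
  echo "WARN: $WORKDIR has less than 7 GB free"
fi
```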
Before you perform a Tarball installation of Cisco Crosswork Situation Manager v7.3.x, complete the following tasks on the server on which you will install Cisco Crosswork Situation Manager:
1. Install Kernel Asynchronous I/O (AIO) Support for Linux and libgfortran. For example:
mkdir -p ~/install/libraries/ && cd ~/install/libraries/
for PACKAGE in libquadmath-4.8.5-39.el7.x86_64.rpm libgfortran-4.8.5-39.el7.x86_64.rpm; do
curl -L -O http://mirror.centos.org/centos/7/os/x86_64/Packages/$PACKAGE && \
rpm2cpio $PACKAGE | cpio -idmv && \
rm -f $PACKAGE
done
echo "export LD_LIBRARY_PATH=$(pwd)/usr/lib64:\$LD_LIBRARY_PATH" >> ~/.bashrc && \
source ~/.bashrc
cd -
2. Install Percona dependencies. This step requires root privileges:
curl -L -O http://repo.percona.com/percona/yum/release/7/RPMS/x86_64/qpress-11-1.el7.x86_64.rpm;
curl -L -O http://mirrors.vooservers.com/centos/7.6.1810/extras/x86_64/Packages/libev-4.15-7.el7.x86_64.rpm;
curl -L -O http://mirrors.clouvider.net/CentOS/7.6.1810/updates/x86_64/Packages/perl-5.16.3-294.el7_6.x86_64.rpm;
curl -L -O http://mirrors.clouvider.net/CentOS/7.6.1810/updates/x86_64/Packages/perl-Pod-Escapes-1.04-294.el7_6.noarch.rpm;
curl -L -O http://mirrors.clouvider.net/CentOS/7.6.1810/updates/x86_64/Packages/perl-libs-5.16.3-294.el7_6.x86_64.rpm;
curl -L -O http://mirrors.clouvider.net/CentOS/7.6.1810/updates/x86_64/Packages/perl-macros-5.16.3-294.el7_6.x86_64.rpm;
curl -L -O http://centos.serverspace.co.uk/centos/7.6.1810/os/x86_64/Packages/xinetd-2.3.15-13.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Carp-1.26-244.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Compress-Raw-Bzip2-2.061-3.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Compress-Raw-Zlib-2.061-4.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-DBD-MySQL-4.023-6.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-DBI-1.627-4.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Data-Dumper-2.145-3.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Digest-1.17-245.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Digest-MD5-2.52-3.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Encode-2.51-7.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Exporter-5.68-3.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-File-Path-2.09-2.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-File-Temp-0.23.01-3.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Filter-1.49-3.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Getopt-Long-2.40-3.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-HTTP-Tiny-0.033-3.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-IO-Compress-2.061-2.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Net-Daemon-0.48-5.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-PathTools-3.40-5.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-PlRPC-0.2020-14.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Pod-Perldoc-3.20-4.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Pod-Simple-3.28-4.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Pod-Usage-1.63-3.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Scalar-List-Utils-1.27-248.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Socket-2.010-4.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Storable-2.45-3.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Text-ParseWords-3.29-4.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Time-HiRes-1.9725-3.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-Time-Local-1.2300-2.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-constant-1.27-2.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-parent-0.225-244.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-podlators-2.5.1-3.el7.noarch.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-threads-1.87-4.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/perl-threads-shared-1.43-6.el7.x86_64.rpm;
curl -L -O http://mirror.as29550.net/mirror.centos.org/7.6.1810/os/x86_64/Packages/socat-1.7.3.2-2.el7.x86_64.rpm;
curl -L -O http://mirror.sov.uk.goscomb.net/centos/7.6.1810/updates/x86_64/Packages/rsync-3.1.2-6.el7_6.1.x86_64.rpm;
curl -L -O http://mirror.sov.uk.goscomb.net/centos/7.6.1810/os/x86_64/Packages/lsof-4.87-6.el7.x86_64.rpm
yum install *.rpm
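Before running `yum install`, it is worth confirming that every download completed; a sketch that flags zero-byte or truncated `.rpm` files, demonstrated in a scratch directory (run it in your actual download directory):

```shell
# Verify the downloaded dependency RPMs before installing them.
# Scratch directory with one "good" and one failed (zero-byte) download.
mkdir -p /tmp/rpm_check && cd /tmp/rpm_check
echo 'payload' > good-1.el7.x86_64.rpm   # stands in for a complete download
: > empty-1.el7.x86_64.rpm               # zero-byte file simulates a failed download

bad=0
for f in *.rpm; do
  if [ ! -s "$f" ]; then
    echo "Incomplete download: $f"
    bad=1
  fi
done
[ "$bad" -eq 0 ] && echo "All RPMs look complete"
```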
3. Install Percona on all servers that will house a database node. The script configures multiple nodes to run as a cluster. A single node is also supported. Substitute your "speedy" Yum repo user credentials:
cat > aiops_repo.sh << _EOF_
#!/bin/bash
clear
echo "Please provide access credentials for the 'speedy' yum repository in order to run the Percona setup script"
echo
read -p "AIOps Repository Username: " AIOPS_USER
export AIOPS_USER
read -p "AIOps Repository Password: " -s AIOPS_PASS
export AIOPS_PASS
curl -L -O https://\$AIOPS_USER:\$AIOPS_PASS@speedy.moogsoft.com/repo/aiops/install_percona_nodes_tarball.sh 2>/dev/null
echo
_EOF_
bash aiops_repo.sh
Now run the script:
bash install_percona_nodes_tarball.sh
The script guides you through the installation process. To configure a single database node on the same server as Cisco Crosswork Situation Manager, use these settings:
— Configure Percona as "Primary".
— Do not set the server to "DB only".
— Set the first database node IP address to the server IP address.
— When prompted to enter the IP addresses of the second and third nodes, press Enter to skip these settings.
4. Execute the .bashrc file:
source ~/.bashrc
The pre-installation steps are now complete. To continue with the Cisco Crosswork Situation Manager installation, see v7.3.x - Tarball installation.
This topic describes how to install Cisco Crosswork Situation Manager v7.3.x on a single host.
Follow these steps if you have root access to the machine or machines on which you will install Cisco Crosswork Situation Manager, and you can connect to Yum repositories outside your network from those machines.
To install Cisco Crosswork Situation Manager in a highly available distributed environment, see Distributed HA Installation.
For Tarball installation steps, see v7.3.x - Tarball installation.
Before you start to install Cisco Crosswork Situation Manager, complete all steps in one of the following documents:
· v7.3.x - RPM pre-installation steps: If you have root access to the machine or machines on which you will install Cisco Crosswork Situation Manager, and you can connect to Yum repositories outside your network from those machines.
· v7.3.x - Offline RPM pre-installation steps: If you have root access to the machine or machines on which you will install or upgrade Cisco Crosswork Situation Manager, but you cannot connect to Yum repositories outside your network from those machines.
To complete an RPM installation of Cisco Crosswork Situation Manager v7.3.x, perform the following steps:
1. Download and install the Cisco Crosswork Situation Manager RPM packages, using one of the following methods according to your deployment type:
— If you are performing an RPM installation:
yum -y install moogsoft-server-7.3.0 \
moogsoft-db-7.3.0 \
moogsoft-utils-7.3.0 \
moogsoft-search-7.3.0 \
moogsoft-ui-7.3.0 \
moogsoft-common-7.3.0 \
moogsoft-mooms-7.3.0 \
moogsoft-integrations-7.3.0 \
moogsoft-integrations-ui-7.3.0
— If you are performing an offline RPM installation, navigate to the location where you copied the RPM files and install them:
yum install *.rpm
2. Edit your ~/.bashrc file to contain the following lines:
export MOOGSOFT_HOME=/usr/share/moogsoft
export APPSERVER_HOME=/usr/share/apache-tomcat
export JAVA_HOME=/usr/java/latest
export PATH=$PATH:$MOOGSOFT_HOME/bin:$MOOGSOFT_HOME/bin/utils
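Once the exports are in place, a quick sanity check confirms each variable points at a real directory; a sketch, illustrated with scratch directories (on a real install the exports in ~/.bashrc provide the values):

```shell
# Confirm each required environment variable is set and points at a directory.
export MOOGSOFT_HOME=/tmp/check_moog && mkdir -p "$MOOGSOFT_HOME"
export APPSERVER_HOME=/tmp/check_tomcat && mkdir -p "$APPSERVER_HOME"

for var in MOOGSOFT_HOME APPSERVER_HOME; do
  eval "dir=\$$var"
  if [ -d "$dir" ]; then
    echo "OK: $var=$dir"
  else
    echo "WARN: $var does not point at a directory"
  fi
done
```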
When the installation process is complete, initialize Cisco Crosswork Situation Manager as follows:
1. Run the initialization script moog_init (replace <zone_name> with your desired RabbitMQ VHOST):
$MOOGSOFT_HOME/bin/utils/moog_init.sh -qI <zone_name> -u root
The script prompts you to accept the End User License Agreement (EULA) and guides you through the initialization process.
The zone_name sets up a virtual host for the Message Bus. If you have multiple systems sharing the same bus, set a different zone name for each.
2. If you are deploying more than one database, configure HA Proxy to load-balance the database nodes. The following script requires root privileges. Run it on each host that runs Cisco Crosswork Situation Manager components. Provide your "speedy" Yum repo user credentials when prompted:
cat > aiops_repo.sh << _EOF_
#!/bin/bash
clear
echo "Please provide access credentials for the 'speedy' yum repository in order to run the haproxy setup script"
echo
read -p "AIOps Repository Username: " AIOPS_USER
export AIOPS_USER
read -p "AIOps Repository Password: " -s AIOPS_PASS
export AIOPS_PASS
echo
bash <(curl -s -k https://\$AIOPS_USER:\$AIOPS_PASS@speedy.moogsoft.com/repo/aiops/haproxy_installer.sh)
_EOF_
bash aiops_repo.sh
3. Restart Moogfarmd and Apache Tomcat:
service moogfarmd restart
service apache-tomcat restart
The moog_init script provides the ability to run a 'quiet' installation and to automatically accept the terms of the EULA. This means that you can write a bash script to automatically execute both the installation script and the initialization script on the same host. For example:
~/moogsoft-aiops-install-7.3.0.sh \
-d ~/moogsoft &&
~/moogsoft/bin/utils/moog_init.sh \
-qI MoogsoftAIOps -p MySQLpasswd -u root --accept-eula
Run moog_init.sh -h for a description of all options.
To verify that the installation has completed successfully, follow the steps outlined in Validate the installation.
When the installation is complete, it is critical that you change the passwords for the default users created during the installation process. See Change passwords for default users for more information.
This topic describes how to install Cisco Crosswork Situation Manager on a single host using the tarball archives, previously known as the non-root install.
Follow these steps if you do not have root access to the machine or machines on which you will install Cisco Crosswork Situation Manager.
To install Cisco Crosswork Situation Manager in a highly available distributed environment, see Distributed HA Installation.
For RPM installation steps, see v7.3.x - RPM installation.
Before you start to install Cisco Crosswork Situation Manager, complete all steps in the following document:
v7.3.x - Tarball pre-installation steps.
To complete a Tarball installation of Cisco Crosswork Situation Manager v7.3.x, perform the following steps:
1. Download the tarball installer, using one of the following options:
— Download via a web browser from https://speedy.moogsoft.com/installer and use the Yum user credentials provided by Cisco Support.
— Use the following cURL command, substituting your "speedy" Yum repo user credentials:
curl -L -O "https://<username>:<password>@speedy.moogsoft.com/installer/moogsoft-aiops-7.3.0.tgz"
2. Optional: GPG key validation of the Tarball
To validate the Tarball before installation, follow these steps:
a. Download the key from this site:
https://keys.openpgp.org/vks/v1/by-fingerprint/2529C94A49E42429EDAAADAEC7A2253BFC50512A
b. Copy the key (an .asc file) to the server on which the Tarball will be installed.
c. Import the key:
gpg --import 2529C94A49E42429EDAAADAEC7A2253BFC50512A.asc
d. Download the moogsoft-aiops-7.3.0.tgz.sig file from the same 'speedy' path:
curl -L -O "https://<username>:<password>@speedy.moogsoft.com/installer/moogsoft-aiops-7.3.0.tgz.sig"
e. Ensure the .tgz and .sig files are in the same folder, then copy the following command into a bash terminal and run it to perform the validation:
gpg --verify moogsoft-aiops-7.3.0.tgz.sig moogsoft-aiops-7.3.0.tgz
f. Confirm that the report states:
Good signature from "Moogsoft Information Security Team <security@moogsoft.com>"
3. Unzip and untar the Cisco Crosswork Situation Manager distribution archive in your working directory:
tar -xf moogsoft-aiops-7.3.0.tgz
The distribution archive contains the following files:
— A README.txt file
— The installation script: moogsoft-aiops-install-7.3.0.sh
— The distribution archive: moogsoft-aiops-dist-7.3.0.tgz
4. Execute the installation script moogsoft-aiops-install-7.3.0.sh in your working directory to install Cisco Crosswork Situation Manager.
bash moogsoft-aiops-install-7.3.0.sh
The script guides you through the installation process. The installation directory defaults to <working-directory>/Cisco. You can change this if you wish.
5. Set the $MOOGSOFT_HOME environment variable to point to your installation directory, and add $MOOGSOFT_HOME/bin/utils to the path. For example:
echo "export MOOGSOFT_HOME=~/moogsoft" >> ~/.bashrc
echo "export PATH=\$PATH:\$MOOGSOFT_HOME/bin/utils" >> ~/.bashrc && \
source ~/.bashrc
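After sourcing ~/.bashrc you can sanity-check the environment. The following is a quick sketch; the ~/moogsoft default is only the directory used in the example above, so substitute your own install directory.

```shell
# Sketch: confirm MOOGSOFT_HOME is set and the utils directory is on PATH.
# ~/moogsoft is only the example default from the steps above.
export MOOGSOFT_HOME="${MOOGSOFT_HOME:-$HOME/moogsoft}"
export PATH="$PATH:$MOOGSOFT_HOME/bin/utils"
echo "MOOGSOFT_HOME=$MOOGSOFT_HOME"
case ":$PATH:" in
  *":$MOOGSOFT_HOME/bin/utils:"*) echo "utils on PATH" ;;
  *) echo "utils NOT on PATH" >&2 ;;
esac
```

If the second line does not print "utils on PATH", re-check the entries written to ~/.bashrc.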
When the installation process is complete, initialize Cisco Crosswork Situation Manager as follows:
1. Configure the Toolrunner to execute locally by setting "execute_locally: true" in $MOOGSOFT_HOME/config/servlets.conf:
sed -i 's/# execute_locally: false,/,execute_locally: true/1' $MOOGSOFT_HOME/config/servlets.conf
2. Run the initialization script moog_init:
$MOOGSOFT_HOME/bin/utils/moog_init.sh -qI <zone_name> -u root
The script prompts you to accept the End User License Agreement (EULA) and guides you through the initialization process.
The zone_name sets up a virtual host for the Message Bus. If you have multiple systems sharing the same bus, set a different zone name for each.
To set processes (Percona, RabbitMQ, Moogfarmd, etc.) to restart when the system is rebooted, use the -k flag. For example:
$MOOGSOFT_HOME/bin/utils/moog_init.sh -k
For more information see Configure Services to Restart.
3. If you are deploying more than one database, configure HA Proxy to load-balance the database nodes. The following script requires root privileges. Run this script on any host running any Cisco Crosswork Situation Manager components. Provide your "speedy" Yum repo user credentials when prompted:
cat > aiops_repo.sh << _EOF_
#!/bin/bash
clear
echo "Please provide access credentials for the 'speedy' yum repository in order to download the Percona setup script"
echo
read -p "AIOps Repository Username: " AIOPS_USER
export AIOPS_USER
read -p "AIOps Repository Password: " -s AIOPS_PASS
export AIOPS_PASS
curl -L -O https://\$AIOPS_USER:\$AIOPS_PASS@speedy.moogsoft.com/repo/aiops/install_percona_nodes_tarball.sh 2>/dev/null
echo
_EOF_
bash aiops_repo.sh;
Then run the script:
bash install_percona_nodes_tarball.sh
4. Restart Moogfarmd and Apache Tomcat:
$MOOGSOFT_HOME/bin/utils/process_cntl moog_farmd restart
$MOOGSOFT_HOME/bin/utils/process_cntl apache-tomcat restart
The moog_init script provides the ability to run a 'quiet' installation and to automatically accept the terms of the EULA. This means that you can write a bash script to automatically execute both the installation script and the initialization script on the same host. For example:
~/moogsoft-aiops-install-7.3.0.sh \
-d ~/moogsoft &&
~/moogsoft/bin/utils/moog_init.sh \
-qI MoogsoftAIOps -p MySQLpasswd -u root --accept-eula
Run moog_init.sh -h for a description of all options.
To verify that the installation has completed successfully, follow the steps outlined in Validate the installation.
When the installation is complete, it is critical that you change the passwords for the default users created during the installation process. See Change passwords for default users for more information.
The following sections describe the installation steps for a fully distributed system running with HA, as illustrated in the following diagram.
Note that prior to performing a distributed HA installation, you must complete Pre-installation for Cisco Crosswork Situation Manager v7.3.x.
To view a list of connectivity ports for a fully distributed HA architecture see Distributed HA system Firewall.
The installation assumes a HA configuration across 2 clusters called Cluster1 and Cluster2.
Note that both Core instances and polling LAMs are part of the same respective Cisco Crosswork Situation Manager process group since they run in an active / passive configuration with auto-failover enabled.
UI stacks, as well as receiving LAMs, should run as part of two distinct Cisco Crosswork Situation Manager process groups as both instances in the HA pair are active.
1. Set up the Percona XtraDB Cluster. See Set Up the Database for HA for more information.
2. Set up Core 1 and 2 roles. See Set Up the Core Role for HA for more information.
3. Set up UI 1 and 2 roles. See Set Up the User Interface Role for HA for more information.
4. Set up LAM 1 and 2 roles. See Install without Caching LAM for more information.
5. (Optional) set up Caching LAM 1 and 2 roles.
For any other minimally distributed HA setup, follow the same high level installation steps as described above.
The instructions list the steps for a specific role installation. If you need to collocate multiple roles on the same server according to a minimally distributed installation of your choice, you may need to run multiple sets of instructions on the same server for the corresponding collocated roles. There might be an overlap in terms of steps and if this is the case you only need to perform those steps once. For instance, if you collocate Core 1 and UI 1 roles, you only need to configure HA Proxy once.
Connectivity within a fully distributed HA architecture:

Source | Destination | Ports | Bi-directional
UI 1, UI 2 | Core 1, Core 2 | 5672, 9200 | -
UI 1, UI 2 | RedServ | 5672, 9200 | -
UI 1, UI 2 | DB 1, DB 2, DB 3 | 3306, 9198 | -
Core 1 | Core 2 | 5701, 9300, 4369, 5672 | Yes
Core 1, Core 2 | RedServ | 9300, 4369, 5672 | Yes
Core 1, Core 2 | DB 1, DB 2, DB 3 | 3306, 9198 | -
LAM 1, LAM 2 | Core 1, Core 2, RedServ | 5672 | -
LAM 1, LAM 2 | DB 1, DB 2, DB 3 | 3306, 9198 | -
DB 1 | DB 2, DB 3 | 3306, 4567, 4444, 4568 | Yes
If any of the default ports have been changed, substitute your values in the table above. The ports are used as follows:

Port | Purpose
9200 | Inbound Elasticsearch REST API
9300 | Elasticsearch inter-node communication within a cluster
5672 | Access to the mooms bus (RabbitMQ)
15672 | Access to the mooms (RabbitMQ) management console
4369 | Required for the mooms (RabbitMQ) cluster
5701 | Required for the Hazelcast cluster
8091 | Access to Hazelcast cluster info via Hazelcast's REST API
3306 | Standard MySQL port
4567 | Group communication in Percona XtraDB Cluster
4444 | State Snapshot Transfer (SST) in Percona XtraDB Cluster
4568 | Incremental State Transfer (IST) in Percona XtraDB Cluster
9198 | Allows HAProxy to check a node's Percona XtraDB Cluster status via HTTP
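If a host firewall is in use, the relevant ports must be opened on each role. The sketch below generates firewalld commands for the Core-role ports; firewalld itself and the gen_fw_cmds helper are assumptions, so adapt the output to your firewall tooling.

```shell
# Sketch: emit firewalld commands for a set of ports; review and run the
# printed commands as root. gen_fw_cmds is an illustrative helper.
gen_fw_cmds() {
  local p
  for p in "$@"; do
    echo "firewall-cmd --permanent --add-port=${p}/tcp"
  done
  echo "firewall-cmd --reload"
}
# Core role ports from the table above (mooms bus, RabbitMQ cluster,
# Hazelcast, Elasticsearch REST and inter-node):
gen_fw_cmds 5672 4369 5701 9200 9300
```

Call gen_fw_cmds with a different port list for the UI, LAM, or DB roles.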
See Distributed HA Installation for the full installation steps for a fully distributed system running with HA.
The database layer Cisco Crosswork Situation Manager for HA uses the Percona XtraDB Cluster mechanism.
See Database Strategy for more information about the supported database platform.
In our distributed HA installation, the database components are installed on servers 3, 4, and 5:
The roles are installed as follows:
Server 3: DB 1.
Server 4: DB 2.
Server 5: DB 3.
See Distributed HA Installation for a reference diagram and steps to achieve a fully distributed installation.
For a minimally distributed installation follow the instructions below, replacing server 3, server 4 and server 5 with the relevant values for your architecture.
The sections below detail how to build a Percona XtraDB cluster.
On servers 3, 4 and 5, install the following Cisco Crosswork Situation Manager components:
yum -y install moogsoft-utils-7.3* \
moogsoft-db-7.3*
On server 3, run install_percona_nodes.sh to install, configure and start Percona Cluster node 1:
install_percona_nodes.sh -p -d -i <server 3 ip address>,<server 4 ip address>,<server 5 ip address> -u sstuser -w passw0rd
Cisco advises that you provide IP addresses instead of hostnames for servers running the Percona Cluster in order to reduce network latency. The "sstuser" in the command above is the user that the Percona nodes use to communicate with each other. The script performs the following tasks:
· Disables SELinux and sets the vm.swappiness property to 1.
· Installs the Percona Yum repository.
· Installs the Percona compatibility package.
· Installs Percona XtraDB cluster.
· Installs the Extended Internet Service Daemon (xinetd).
· Creates a my.cnf configuration file based on the server's hardware.
· Configures a mysqlchk service on port 9198 and restarts the xinetd service.
· Starts the first Percona node in bootstrap mode.
· Reconfigures my.cnf to ensure the node will restart in non-bootstrap mode.
On server 3, run the following commands to create the Cisco Crosswork Situation Manager databases (moogdb, moog_reference, historic_moogdb, moog_intdb), and populate them with the required schema:
$MOOGSOFT_HOME/bin/utils/moog_init_db.sh -qIu root --accept-eula <<-EOF
EOF
Note:
You do not need to run this command on any of the other nodes. The new schema is replicated automatically around the cluster.
On server 4, run install_percona_nodes.sh for DB 2. The script will perform the same actions, only this time starting the second Percona node to join the first node as a cluster:
install_percona_nodes.sh -d -i <server 3 ip address>,<server 4 ip address>,<server 5 ip address> -u sstuser -w passw0rd
On server 5, run install_percona_nodes.sh as you did for DB 1 and DB 2. The script will perform the same actions, only this time starting the third Percona node to join the first and second nodes as a cluster.
install_percona_nodes.sh -d -i <server 3 ip address>,<server 4 ip address>,<server 5 ip address> -u sstuser -w passw0rd
To verify the replication status of each node, run the following commands from a remote server:
curl http://<server3 ip address/hostname>:9198
curl http://<server4 ip address/hostname>:9198
curl http://<server5 ip address/hostname>:9198
A successful response is shown below:
[root@ldev01]# curl -v http://server3:9198
* About to connect() to ldev03 port 9198 (#0)
* Trying 10.99.1.24...
* Connected to server3 (10.99.1.24) port 9198 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: ldev03:9198
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: text/plain
< Connection: close
< Content-Length: 40
<
Percona XtraDB Cluster Node is synced.
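To script this check across all three nodes, you can wrap the curl call in a small helper. check_status below is an illustrative helper that matches on the response body shown above.

```shell
# Sketch: interpret the mysqlchk response body. check_status is an
# illustrative helper; pipe each node's curl output into it.
check_status() {
  if grep -q "Percona XtraDB Cluster Node is synced"; then
    echo "synced"
  else
    echo "not synced"
  fi
}
# Usage, substituting your DB hostnames/IPs:
#   for h in server3 server4 server5; do
#     printf '%s: %s\n' "$h" "$(curl -s http://$h:9198 | check_status)"
#   done
echo "Percona XtraDB Cluster Node is synced." | check_status   # prints "synced"
```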
This topic details the installation and configuration of HA Proxy on a Cisco Crosswork Situation Manager server, for connection to a remote Percona XtraDB Cluster.
Percona XtraDB Cluster must be run as a 3-node (minimum) cluster distributed across the database roles.
Before you install and configure HA Proxy, configure PerconaXtraDB as described in Set Up the Database for HA, with the components installed as follows:
Server 3: DB 1 (Percona Node 1).
Server 4: DB 2 (Percona Node 2).
Server 5: DB 3 (Percona Node 3).
See Distributed HA Installation for a reference diagram and steps to achieve a fully distributed installation.
For a minimally distributed installation follow the instructions below, replacing server 3, server 4 and server 5 with the relevant values for your architecture.
On the relevant server, run the following command to configure and start HA Proxy:
$MOOGSOFT_HOME/bin/utils/haproxy_installer.sh -l 3307 -c -i <server 3 ip address>:3306,<server 4 ip address>:3306,<server 5 ip address>:3306
Note:
Cisco advises providing IP addresses instead of host names for servers running the Percona Cluster, in order to reduce the network latency.
The script performs the following tasks:
· Installs HA Proxy v1.5.
· Configures HA Proxy to listen on 0.0.0.0:3306 and route connections to one of three MySQL back ends (as specified by the -i properties).
On the required server, run the following command:
$MOOGSOFT_HOME/bin/utils/check_haproxy_connections.sh
A successful response is as follows:
HAProxy Connection Counts
Frontend:
0.0.0.0:3306 : 13
Backend:
mysql_node_1 10.99.1.24:3306 : 13
mysql_node_2 10.99.1.23:3306 : 0
mysql_node_3 10.99.1.18:3306 : 0
Press Ctrl-C to quit
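For reference, the generated configuration is broadly of the following shape. This is an illustrative haproxy.cfg fragment only, not the literal file written by haproxy_installer.sh; the backend names and IPs echo the sample output above.

```
# Illustrative fragment only -- your generated haproxy.cfg may differ.
listen mysql_cluster
    bind 0.0.0.0:3306
    mode tcp
    balance leastconn
    option httpchk                          # mysqlchk health check
    server mysql_node_1 10.99.1.24:3306 check port 9198
    server mysql_node_2 10.99.1.23:3306 check port 9198 backup
    server mysql_node_3 10.99.1.18:3306 check port 9198 backup
```

Marking two nodes as backup keeps writes on a single node at a time, a common pattern with XtraDB Cluster to reduce write conflicts.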
In Cisco Crosswork Situation Manager HA architecture, Core 1 and Core 2 run in an active / passive HA pair.
In our distributed HA installation, the Core components are installed on servers 6, 7, and 8:
Server 6: Core 1 (Moogfarmd), Elastic Node 1, RabbitMQ Node 1.
Server 7: Core 2 (Moogfarmd), Elastic Node 2, RabbitMQ Node 2.
Server 8: Elastic Node 3, RabbitMQ Node 3.
See Distributed HA Installation for a reference diagram and steps to achieve a fully distributed installation.
For a minimally distributed installation follow the instructions below, replacing server 6, server 7 and server 8 with the relevant values for your architecture.
1. Initialize RabbitMQ Cluster Node 1 on the Core Primary Server.
On server 6 initialize RabbitMQ. Set the zone:
moog_init_mooms.sh -pz <ZONE>
2. Initialize, configure and start Elasticsearch Cluster Node 1 on the Core Primary Server.
a. Initialize Elasticsearch on server 6:
moog_init_search.sh
b. Edit the /etc/elasticsearch/elasticsearch.yml file, replacing its contents with the following.
Substitute the values as appropriate, for example <server 6 hostname> :
cluster.name: aiops
node.name: <server 6 hostname>
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: [ "<server 6 hostname>","<server 7 hostname>","<server 8 hostname>" ]
discovery.zen.minimum_master_nodes: 1
gateway.recover_after_nodes: 1
node.master: true
3. Configure system.conf on the Core Primary server.
a. On server 6 edit the $MOOGSOFT_HOME/config/system.conf file and set the mooms.zone and mooms.brokers properties as follows:
"zone" : "<ZONE>",
"brokers" : [ { "host" : "<server 6 hostname>", "port" : 5672 },
{ "host" : "<server 7 hostname>", "port" : 5672 },
{ "host" : "<server 8 hostname>", "port" : 5672 }
],
b. In the same file, edit the search.nodes property as follows:
"nodes" : [ { "host" : "<server 6 hostname>", "port" : 9200 },
{ "host" : "<server 7 hostname>", "port" : 9200 },
{ "host" : "<server 8 hostname>", "port" : 9200 }
]
Note:
The brokers object must list all hostnames that form the RabbitMQ cluster and nodes must list the hostnames that form the Elasticsearch cluster.
c. In the same file, set the ha section as follows:
"ha": { "cluster": "PRIMARY" },
d. Restart Elasticsearch:
systemctl restart elasticsearch
4. Install and configure HA Proxy on the Core Primary server for connection to Percona XtraDB Cluster.
On server 6 install, configure and start HAProxy.
1. Install Cisco Crosswork Situation Manager components on the Core Secondary server.
On server 7 install the following Cisco Crosswork Situation Manager components:
yum -y install moogsoft-common-7.3* \
moogsoft-mooms-7.3* \
moogsoft-search-7.3* \
moogsoft-server-7.3* \
moogsoft-utils-7.3* \
moogsoft-integrations-7.3*
2. Initialize RabbitMQ Cluster Node 2 on the Core Secondary server and create the cluster.
a. On server 7 initialize RabbitMQ. Set a zone name:
moog_init_mooms.sh -pz <ZONE>
b. Copy and replace the /var/lib/rabbitmq/.erlang.cookie file from server 6 to the same location on this server.
c. Restart the rabbitmq-server service:
systemctl restart rabbitmq-server
d. Run the following commands to create the cluster:
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@<server 6 hostname>
rabbitmqctl start_app
e. Apply HA mirrored queues policy:
rabbitmqctl set_policy -p <ZONE> ha-all ".+\.HA" '{"ha-mode":"all"}'
Note:
Replace <ZONE> with the zone name you used earlier.
f. Verify the cluster status and queue policy. For example:
rabbitmqctl cluster_status
Cluster status of node rabbit@ldev02 ...
[{nodes,[{disc,[rabbit@ldev01,rabbit@ldev02]}]},
{running_nodes,[rabbit@ldev01,rabbit@ldev02]},
{cluster_name,<<"rabbit@ldev02">>},
{partitions,[]},
{alarms,[{rabbit@ldev01,[]},{rabbit@ldev02,[]}]}]
[root@ldev02 rabbitmq]# rabbitmqctl -p <ZONE> list_policies
Listing policies for vhost "MOOG" ...
<ZONE> ha-all .+\.HA all {"ha-mode":"all"} 0
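You can script this verification. The check_running helper below is illustrative and simply looks for both node names on the running_nodes line of the output shown above.

```shell
# Sketch: verify both cluster members appear in running_nodes.
# check_running is an illustrative helper, not a RabbitMQ command.
check_running() {
  line=$(grep "running_nodes")   # reads cluster_status output on stdin
  if printf '%s' "$line" | grep -qF "$1" && printf '%s' "$line" | grep -qF "$2"; then
    echo "cluster OK"
  else
    echo "cluster DEGRADED" >&2
  fi
}
# Usage:
#   rabbitmqctl cluster_status | check_running rabbit@<server 6 hostname> rabbit@<server 7 hostname>
```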
3. Initialize, configure and start Elasticsearch Cluster Node 2 on the Core Secondary server.
a. On server 7 initialize Elasticsearch:
moog_init_search.sh
b. Edit the /etc/elasticsearch/elasticsearch.yml file, replacing its contents with the following:
cluster.name: aiops
node.name: <server 7 hostname>
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: [ "<server 6 hostname>","<server 7 hostname>","<server 8 hostname>" ]
discovery.zen.minimum_master_nodes: 1
gateway.recover_after_nodes: 1
node.master: true
4. Configure system.conf on the Core Secondary server.
a. On server 7 edit the $MOOGSOFT_HOME/config/system.conf and set the mooms.zone and mooms.brokers properties as follows:
"zone" : "<ZONE>",
"brokers" : [ { "host" : "<server 6 hostname>", "port" : 5672 },
{ "host" : "<server 7 hostname>", "port" : 5672 },
{ "host" : "<server 8 hostname>", "port" : 5672 }
],
b. In the same file, edit the search.nodes property as follows:
"nodes" : [ { "host" : "<server 6 hostname>", "port" : 9200 },
{ "host" : "<server 7 hostname>", "port" : 9200 },
{ "host" : "<server 8 hostname>", "port" : 9200 }
]
Note:
The brokers object must list all hostnames that form the RabbitMQ cluster and nodes must list the hostnames that form the Elasticsearch cluster.
c. In the same file, set the ha section as follows:
"ha": { "cluster": "SECONDARY" }
d. Restart Elasticsearch:
systemctl restart elasticsearch
5. Install and configure HA Proxy on the Core Secondary server for connection to Percona XtraDB Cluster.
On server 7 install, configure and start HAProxy.
1. Stop the Moogfarmd service on both Primary, server 6 and Secondary, server 7:
systemctl stop moogfarmd
2. On server 6 and server 7 edit the $MOOGSOFT_HOME/config/system.conf file and set the failover.automatic_failover property:
"automatic_failover" : true,
3. Start the Moogfarmd service on both Primary, server 6 and Secondary, server 7:
systemctl start moogfarmd
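Before restarting Moogfarmd you may want to confirm the setting took effect on both servers. The sketch below uses an illustrative check_failover helper.

```shell
# Sketch: report whether automatic_failover is enabled in a system.conf.
# check_failover is an illustrative helper.
check_failover() {
  if grep -Eq '"automatic_failover"[[:space:]]*:[[:space:]]*true' "$1"; then
    echo "enabled"
  else
    echo "disabled"
  fi
}
# Usage on server 6 and server 7:
#   check_failover "$MOOGSOFT_HOME/config/system.conf"
```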
The UI role includes the Nginx and Apache Tomcat components. There are also a number of Cisco Crosswork Situation Manager webapps (servlets) installed and running within Tomcat, responsible for the following processes:
· graze: Graze API
· moogpoller: Dynamic updates to UI
· moogsvr: Services HTTP requests
· situation_similarity: Calculates the situation similarity and pushes to UI
· toolrunner: Services Server Tools
In our distributed HA installation, the UI components are installed on servers 1, 2, 6, 7 and 8.
· Server 1: UI 1
· Server 2: UI 2
· Server 6: Elasticsearch Node 1
· Server 7: Elasticsearch Node 2
· Server 8: Elasticsearch Node 3
· Server 6: RabbitMQ Node 1
· Server 7: RabbitMQ Node 2
· Server 8: RabbitMQ Node 3
See Distributed HA Installation for a reference diagram and steps to achieve a fully distributed installation.
For a minimally distributed installation follow the instructions below, replacing servers 1, 2, 6, 7 and 8 with the relevant values for your architecture.
1. Install Cisco Crosswork Situation Manager components on the UI Primary server.
On server 1 install the following Cisco Crosswork Situation Manager components:
yum -y install moogsoft-common-7.3* \
moogsoft-integrations-ui-7.3* \
moogsoft-ui-7.3* \
moogsoft-utils-7.3*
2. On server 1 install, configure and start HA Proxy.
3. Initialize the UI stack:
moog_init_ui.sh -twfz <ZONE> -c <server 6 hostname>:15672 -m <server 6 hostname>:5672 -s <server 6 hostname>:9200 -n
Note:
<server 6 hostname> is the server where the Core Primary is configured.
4. On server 1 edit the $MOOGSOFT_HOME/config/system.conf file and set the mooms.zone and mooms.brokers properties as follows:
"zone" : "<ZONE>",
"brokers" : [ { "host" : "<server 6 hostname>", "port" : 5672 },
{ "host" : "<server 7 hostname>", "port" : 5672 },
{ "host" : "<server 8 hostname>", "port" : 5672 }
],
5. In the same file, edit the search.nodes property as follows:
"nodes" : [ { "host" : "<server 6 hostname>", "port" : 9200 },
{ "host" : "<server 7 hostname>", "port" : 9200 },
{ "host" : "<server 8 hostname>", "port" : 9200 }
]
Note: The brokers object must list all hostnames that form the RabbitMQ cluster and nodes must list the hostnames that form the Elasticsearch cluster.
6. In the same file, set the ha section as follows:
"ha": { "cluster": "PRIMARY" },
7. Edit servlets.conf.
On server 1, edit the $MOOGSOFT_HOME/config/servlets.conf file and uncomment the ha section at the bottom of the file. Set the content of the ha section as follows (note the leading comma):
,ha :
{
cluster: "PRIMARY",
group: "ui1",
instance: "primary",
start_as_passive: false
}
8. On server 1, restart the Apache Tomcat service:
systemctl restart apache-tomcat
1. Install Cisco Crosswork Situation Manager components on the UI Secondary server.
On server 2 install the following Cisco Crosswork Situation Manager components:
yum -y install moogsoft-common-7.3* \
moogsoft-integrations-ui-7.3* \
moogsoft-ui-7.3* \
moogsoft-utils-7.3*
2. Install and configure HA Proxy on the UI Secondary server for connection to Percona XtraDB Cluster.
On server 2 install, configure and start HAProxy.
3. Initialize the UI stack.
On server 2, run the following command to initialize the UI stack:
moog_init_ui.sh -twfz <ZONE> -c <server 7 hostname>:15672 -m <server 7 hostname>:5672 -s <server 7 hostname>:9200 -n
Note:
<server 7 hostname> is the server where the Core Secondary is set up.
4. Reconfigure system.conf on the UI Secondary server.
On server 2 edit the $MOOGSOFT_HOME/config/system.conf file and set the mooms.zone and mooms.brokers properties as follows:
"zone" : "<ZONE>",
"brokers" : [ { "host" : "<server 7 hostname>", "port" : 5672 },
{ "host" : "<server 6 hostname>", "port" : 5672 },
{ "host" : "<server 8 hostname>", "port" : 5672 }
],
5. In the same file, edit the search.nodes property as follows:
"nodes" : [ { "host" : "<server 7 hostname>", "port" : 9200 },
{ "host" : "<server 6 hostname>", "port" : 9200 },
{ "host" : "<server 8 hostname>", "port" : 9200 }
]
6. In the same file, set the ha section as follows:
"ha": { "cluster": "SECONDARY" },
7. Edit servlets.conf.
On server 2, edit the $MOOGSOFT_HOME/config/servlets.conf file. Uncomment the ha section at the bottom of the file and set the content as follows (note the leading comma):
,ha :
{
cluster: "SECONDARY",
group: "ui2",
instance: "secondary",
start_as_passive: false
}
8. On server 2, restart the Apache Tomcat service:
systemctl restart apache-tomcat
A user session needs to be served from the same UI stack, i.e. users need to stay connected to the same UI server for the duration of their session, or until that UI server becomes unavailable (in which case the load balancer redirects them to the secondary). This is because requests are routed via moogsvr and data is received from moogpoller (web sockets).
Configure the UI load balancer with the following attributes:
· Since both UI stacks are active you can choose to implement the round robin or least connection balancing method.
· Route web traffic only to the Nginx behind which there is an active UI. Base this decision on a moogsvr servlet check via the 'hastatus' Tomcat endpoint, which returns a 204 if the UI stack is up. Note that it does not report on the health of other roles, i.e. Core (Moogfarmd, RabbitMQ and Elasticsearch clusters), Database (Percona Cluster), or LAMs.
· Sticky sessions are preferred. Traffic needs to be routed to the same backend server based on the same MOOGSESS cookie.
You can send the following example cURL command from the command line to check moogsvr servlet status:
curl -k https://server1/moogsvr/hastatus -v
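A load balancer health probe can key off the status code alone. In the sketch below, ui_state is an illustrative helper and server1 is a placeholder hostname.

```shell
# Sketch: classify a UI stack from the hastatus HTTP status code
# (204 = active UI stack, per the description above). ui_state is
# an illustrative helper.
ui_state() {
  if [ "$1" = "204" ]; then echo "UP"; else echo "DOWN"; fi
}
# Usage:
#   code=$(curl -sk -o /dev/null -w '%{http_code}' https://server1/moogsvr/hastatus)
#   ui_state "$code"
echo "hastatus 204 -> $(ui_state 204)"
```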
When you upload files to the Cisco Crosswork Situation Manager UI, it writes them to the disk on the UI Server. For example, Situation Room thread entry attachments and User Profile Pictures.
In an HA configuration running multiple UI roles, the files reside on the disk for the UI server where the user is connected. To ensure attachments are available to users on any UI servers as part of an HA/Load Balancer setup, configure the location of these attachments on a shared disk (NFS) available to all UI servers.
The following example demonstrates a sample configuration.
· The configuration below assumes that /mnt/nfs/shared is the shared location exposed by the NFS server called NFS_Server.
· Ensure that the moogsoft:moogsoft user/group has the same uid/gid across all three servers and has ownership of the shared directory. If not, run the following commands on all three servers to make them the same. You can verify the gid on all three servers with the cat /etc/group | grep moogsoft command:
groupmod -g <gid> moogsoft
usermod -u <uid> moogsoft
Run the following commands, replacing <ServerA> and <ServerE> with the hostnames of the servers:
service apache-tomcat stop
chown -R moogsoft:moogsoft /usr/share/apache-tomcat
chown -R moogsoft:moogsoft /var/run/apache-tomcat
mkdir /shared
chown moogsoft:moogsoft /shared/
chmod 755 /shared/
yum install -y nfs-utils nfs-utils-lib
chkconfig nfs on
service rpcbind start
service nfs start
service apache-tomcat start
Edit /etc/exports and set: /shared <ServerA>(rw,sync,no_subtree_check,insecure) <ServerE>(rw,sync,no_subtree_check,insecure)
Then run: exportfs -a
Run the following commands to configure Server 1:
service apache-tomcat stop
chown -R moogsoft:moogsoft /usr/share/apache-tomcat
chown -R moogsoft:moogsoft /var/run/apache-tomcat
yum install -y nfs-utils nfs-utils-lib
mkdir -p /mnt/nfs/shared
mount NFS_Server:/shared /mnt/nfs/shared
Edit $MOOGSOFT_HOME/config/servlets.conf and set cache_root: "/mnt/nfs/shared",
service apache-tomcat start
Run the following commands to configure Server 2:
service apache-tomcat stop
chown -R moogsoft:moogsoft /usr/share/apache-tomcat
chown -R moogsoft:moogsoft /var/run/apache-tomcat
yum install -y nfs-utils nfs-utils-lib
mkdir -p /mnt/nfs/shared
mount NFS_Server:/shared /mnt/nfs/shared
Edit $MOOGSOFT_HOME/config/servlets.conf and set cache_root: "/mnt/nfs/shared",
service apache-tomcat start
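The mount commands above do not persist across reboots. One way to make the mount persistent, assuming the same NFS_Server placeholder, is an /etc/fstab entry; this sketch only prints the line so you can review it before appending it as root.

```shell
# Sketch: build the /etc/fstab line for a persistent NFS mount.
# NFS_Server and /mnt/nfs/shared match the placeholders above.
fstab_line="NFS_Server:/shared /mnt/nfs/shared nfs defaults 0 0"
echo "$fstab_line"
# Then, as root on both UI servers:
#   echo "$fstab_line" >> /etc/fstab && mount -a
```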
In Cisco Crosswork Situation Manager HA architecture, both RabbitMQ and Elasticsearch run as three-node clusters. The three-node clusters prevent issues with ambiguous data state, such as "split-brain".
RabbitMQ is the Message Bus used by Cisco Crosswork Situation Manager. Elasticsearch delivers the search functionality.
The three nodes are distributed across the two Core roles and the redundancy server.
In our distributed HA installation, the RabbitMQ and Elasticsearch components are installed on servers 6, 7 and 8.
· Server 6: RabbitMQ Node 1 (part of Core 1)
· Server 7: RabbitMQ Node 2 (part of Core 2)
· Server 8: RabbitMQ Node 3 (part of Redundancy Server)
· Server 6: Elasticsearch Node 1 (part of Core 1)
· Server 7: Elasticsearch Node 2 (part of Core 2)
· Server 8: Elasticsearch Node 3 (part of Redundancy Server)
See Distributed HA Installation for a reference diagram and steps to achieve a fully distributed installation.
For a minimally distributed installation follow the same instructions below, replacing server 8 with the relevant value for your architecture.
1. Install the Cisco Crosswork Situation Manager components on the Redundancy server.
On server 8 install the following Cisco Crosswork Situation Manager components:
yum -y install moogsoft-common-7.3* \
moogsoft-mooms-7.3* \
moogsoft-search-7.3* \
moogsoft-utils-7.3*
2. Initialize RabbitMQ Cluster Node 3 on the Redundancy server and join the cluster.
a. On server 8 initialize RabbitMQ. Set a zone name:
moog_init_mooms.sh -pz <ZONE>
b. Copy and replace the /var/lib/rabbitmq/.erlang.cookie file from server 6 to the same location on this server.
c. Restart the rabbitmq-server service:
systemctl restart rabbitmq-server
d. Run the following commands to form the cluster:
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@<server 6 hostname>
rabbitmqctl start_app
e. Apply the HA mirrored queues policy:
rabbitmqctl set_policy -p <ZONE> ha-all ".+\.HA" '{"ha-mode":"all"}'
Note:
Replace <ZONE> with the zone name you used earlier.
f. Verify the cluster status and queue policy. For example:
rabbitmqctl cluster_status
Cluster status of node rabbit@ldev02 ...
[{nodes,[{disc,[rabbit@ldev01,rabbit@ldev02]}]},
{running_nodes,[rabbit@ldev01,rabbit@ldev02]},
{cluster_name,<<"rabbit@ldev02">>},
{partitions,[]},
{alarms,[{rabbit@ldev01,[]},{rabbit@ldev02,[]}]}]
[root@ldev02 rabbitmq]# rabbitmqctl -p MOOG list_policies
Listing policies for vhost "MOOG" ...
MOOG ha-all .+\.HA all {"ha-mode":"all"} 0
3. Initialize, configure and start Elasticsearch Cluster Node 3 on the Redundancy server.
a. On server 8 initialize Elasticsearch:
moog_init_search.sh
b. Edit the /etc/elasticsearch/elasticsearch.yml file, replacing its contents with the following:
cluster.name: aiops
node.name: <server 8 hostname>
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: [ "<server 6 hostname>","<server 7 hostname>","<server 8 hostname>" ]
discovery.zen.minimum_master_nodes: 1
gateway.recover_after_nodes: 1
node.master: true
c. Restart Elasticsearch:
systemctl restart elasticsearch
In HA architecture, LAM 1 and LAM 2 run in an active / passive mode for a HA polling pair, and in active / active mode for a HA receiving pair.
In our distributed HA installation, the LAM components are installed on servers 6, 7, 8, 9 and 10:
· LAM 1: Server 9
· LAM 2: Server 10
· RabbitMQ Node 1: Server 6
· RabbitMQ Node 2: Server 7
· RabbitMQ Node 3: Server 8
See Distributed HA Installation for a reference diagram and steps to achieve a fully distributed installation.
For a minimally distributed installation follow the instructions below, replacing server 6, 7, 8, 9 and 10 with the relevant values for your architecture.
1. Install Cisco Crosswork Situation Manager components on the LAM 1 server.
On server 9 install the following Cisco Crosswork Situation Manager components:
yum -y install moogsoft-common-7.3* \
moogsoft-integrations-7.3* \
moogsoft-utils-7.3*
2. Configure system.conf on the LAM 1 server.
a. On server 9 edit the $MOOGSOFT_HOME/config/system.conf file and set the mooms.zone and mooms.brokers properties as follows:
"zone" : "<ZONE>",
"brokers" : [ { "host" : "<server 6 hostname>", "port" : 5672 },
{ "host" : "<server 7 hostname>", "port" : 5672 },
{ "host" : "<server 8 hostname>", "port" : 5672 }
],
b. In the same file, set the ha section as follows:
"ha": { "cluster": "PRIMARY" },
3. Install and configure HA Proxy on the LAM 1 server for connection to Percona XtraDB Cluster.
On server 9 install, configure and start HAProxy.
1. Install Cisco Crosswork Situation Manager components on the LAM 2 server.
On server 10 install the following Cisco Crosswork Situation Manager components:
yum -y install moogsoft-common-7.3* \
moogsoft-integrations-7.3* \
moogsoft-utils-7.3*
2. Configure system.conf on the LAM 2 server.
a. On server 10 edit the $MOOGSOFT_HOME/config/system.conf file and set the mooms.zone and mooms.brokers properties as follows:
"zone" : "<ZONE>",
"brokers" : [ { "host" : "<server 7 hostname>", "port" : 5672 },
{ "host" : "<server 6 hostname>", "port" : 5672 },
{ "host" : "<server 8 hostname>", "port" : 5672 }
],
b. In the same file, set the ha section as follows:
"ha": { "cluster": "SECONDARY" },
3. Install and configure HA Proxy on the LAM 2 server for connection to Percona XtraDB Cluster.
On server 10 install, configure and start HAProxy.
Follow the instructions in Set Up LAMs for HA.
In HA architecture, LAM 1 and LAM 2 run in an active / passive mode for a HA polling pair, and in active / active mode for a HA receiving pair.
Your backend LAM integrations connect to a local two-node RabbitMQ cluster; receiving LAM pairs additionally connect to a local two-node MySQL cluster. The Caching LAM does not require any outbound connectivity; configuring inbound connectivity allows remote components to connect to the local RabbitMQ cluster to fetch messages from the bus.
In our distributed HA installation, the LAM components are installed on servers 9 and 10:
· LAM 1: Server 9
· LAM 2: Server 10
· Local RabbitMQ Node 1: Server 9
· Local RabbitMQ Node 2: Server 10
· Local MySQL Node 1: Server 9
· Local MySQL Node 2: Server 10
See Distributed HA Installation for a reference diagram and steps to achieve a fully distributed installation.
For a minimally distributed installation follow the instructions below, replacing server 9 and 10 with the relevant values for your architecture.
1. Install Cisco Crosswork Situation Manager components on the LAM 1 server.
On server 9 install the following Cisco Crosswork Situation Manager components:
yum -y install moogsoft-common-7.3* \
moogsoft-db-7.3* \
moogsoft-mooms-7.3* \
moogsoft-integrations-7.3* \
moogsoft-utils-7.3*
2. Initialize the local Cisco Crosswork Situation Manager RabbitMQ cluster node on the LAM 1 server.
On server 9 initialize RabbitMQ:
moog_init_mooms.sh -pz <ZONE>
Note:
For <ZONE>, pick a value that is different from the one you chose for the main RabbitMQ cluster.
3. Initialize the Cisco Crosswork Situation Manager database.
On server 9, run the following commands to create the Cisco Crosswork Situation Manager databases and populate them with the required schema:
moog_init_db.sh -Iu root
4. Configure system.conf on the LAM 1 server.
a. On server 9 edit the $MOOGSOFT_HOME/config/system.conf file and set the mooms.zone and mooms.brokers properties as follows, substituting the values as appropriate, for example <server 9 hostname>:
"zone" : "<ZONE>",
"brokers" : [ { "host" : "<server 9 hostname>", "port" : 5672 },
{ "host" : "<server 10 hostname>", "port" : 5672 }
],
b. In the same file, set the ha section as follows:
"ha": { "cluster": "PRIMARY" },
1. Install Cisco Crosswork Situation Manager components on the LAM 2 server.
On server 10 install the following Cisco Crosswork Situation Manager components:
yum -y install moogsoft-common-7.3* \
moogsoft-db-7.3* \
moogsoft-mooms-7.3* \
moogsoft-integrations-7.3* \
moogsoft-utils-7.3*
2. Initialize the local RabbitMQ cluster node 2 on the LAM 2 server.
a. On server 10 initialize RabbitMQ:
moog_init_mooms.sh -pz <ZONE>
Note:
For <ZONE>, pick a value that is different from the one you chose for the main RabbitMQ cluster.
b. Copy and replace the /var/lib/rabbitmq/.erlang.cookie file from server 9 to the same location on this server.
c. Restart the rabbitmq-server service:
systemctl restart rabbitmq-server
d. Run the following commands to form the cluster:
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@<server 9 hostname>
rabbitmqctl start_app
e. Apply the HA mirrored queues policy:
rabbitmqctl set_policy -p <ZONE> ha-all ".+\.HA" '{"ha-mode":"all"}'
Note:
Replace <ZONE> with the zone name you used earlier.
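As an illustration only (not part of the product), the ha-all policy pattern ".+\.HA" matches queue names that contain ".HA" preceded by at least one character; you can check candidate queue names against the same regular expression with grep:

```shell
# Illustrative only: test queue names against the ha-all policy pattern ".+\.HA"
for q in "alerts.HA" "events.HA" "temp.queue"; do
  if echo "$q" | grep -Eq '.+\.HA'; then
    echo "$q: mirrored"
  else
    echo "$q: not mirrored"
  fi
done
```

Only the first two names match, so only those queues would be mirrored by the policy.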
You can then verify cluster status and zone policy:
[root@ldev02 rabbitmq]# rabbitmqctl cluster_status
Cluster status of node rabbit@ldev02 ...
[{nodes,[{disc,[rabbit@ldev01,rabbit@ldev02]}]},
{running_nodes,[rabbit@ldev01,rabbit@ldev02]},
{cluster_name,<<"rabbit@ldev02">>},
{partitions,[]},
{alarms,[{rabbit@ldev01,[]},{rabbit@ldev02,[]}]}]
[root@ldev02 rabbitmq]# rabbitmqctl -p MOOG list_policies
Listing policies for vhost "MOOG" ...
MOOG ha-all .+\.HA all {"ha-mode":"all"} 0
3. Initialize the Cisco Crosswork Situation Manager database for polling LAMs on the LAM 2 server.
On server 10, run the following commands to create the Cisco Crosswork Situation Manager databases, and populate them with the required schema:
moog_init_db.sh -Iu root
4. Configure system.conf on the LAM 2 server.
On server 10, edit the $MOOGSOFT_HOME/config/system.conf file and set the mooms.zone and mooms.brokers properties as follows. Substitute the values as appropriate, for example <server 9 hostname>:
"zone" : "<ZONE>",
"brokers" : [ { "host" : "<server 9 hostname>", "port" : 5672 },
{ "host" : "<server 10 hostname>", "port" : 5672 }
],
For polling LAMs you must also populate mysql.failover_connections with the following:
"mysql" :
{
"host" : "<server 9 hostname>",
"failover_connections" :
[
{ "host" : "<server 10 hostname>", "port" : 3306 }
]
In the same file, set the ha section as follows:
"ha": { "cluster": "SECONDARY" },
5. Set up Master-Master replication for the database for polling LAMs.
See Set Up LAMs for HA for instructions.
In our HA architecture, Caching LAM 1 and Caching LAM 2 run as an active / passive HA pair. The LAM components are installed on servers 6, 7, 9 and 10:
· Core 1: Server 6
· Core 2: Server 7
· Caching LAM 1: Server 6
· Caching LAM 2: Server 7
· Local RabbitMQ Node 1 on LAM 1: Server 9
· Local RabbitMQ Node 2 on LAM 2: Server 10
1. Make a copy of the Caching LAM configuration file and rename it accordingly. The default config file is $MOOGSOFT_HOME/config/mooms_cache_lam.conf.
2. Edit the config file to populate the brokers and virtual_host properties:
brokers:
[
{ host : "<server 9 hostname>", port : 5672 },
{ host : "<server 10 hostname>", port : 5672 }
],
virtual_host : "<local RabbitMQ ZONE name>",
3. Configure the ha section of the same file according to the type of LAM and its corresponding HA setup.
4. Create the service script pointing to the file.
To configure a new backend LAM integration for HA on LAM 1 and LAM 2:
1. Make a copy of the corresponding LAM configuration file and rename it accordingly.
2. Make a copy of the corresponding LAMbot file and rename it accordingly.
3. If applicable, amend the LAM configuration file to point to the LAMbot file (under the presend section).
4. Create the service script pointing to the configuration file.
5. Configure the ha section of the configuration file according to the type of LAM and its corresponding HA setup.
To configure a polling LAM for HA, you must set the LAMs as active / passive, and therefore in the same Cisco Crosswork Situation Manager process group. If the system detects an issue with the active LAM, the passive instance will automatically take over.
To enable automatic failover:
1. On LAM 1 and LAM 2, edit the $MOOGSOFT_HOME/config/system.conf file and set the automatic_failover property to true:
# Allow a passive process to automatically become active if
# no other active processes are detected in the same process group
"automatic_failover" : true,
2. Restart the polling LAMs to finish enabling automatic failover.
For an HA configuration, the receiving LAMs must always run as active / active, with a load balancer of your choice in front of them and each LAM in a different Cisco Crosswork Situation Manager process group.
There are two methods you can use to implement your load balancer: chained failover or multiplexing (which sends to both active receiving LAMs).
If you choose to implement using multiplexing, ensure the following:
· The duplicate_event_source parameter in the LAM config is set to true. The parameter lets Moogfarmd know to silently drop any event duplicates arriving within a configurable period.
· The configuration files for both active Receiving LAMs, running as an HA pair, are identical, apart from their ha sections. This ensures that Moogfarmd is able to detect the event duplicates correctly.
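As a sketch of the first requirement, the setting in each receiving LAM's config section might look like the following (exact placement depends on the LAM; the surrounding properties are elided):

```
config :
{
    ...
    # Tell Moogfarmd to silently drop duplicate events from the multiplexed pair
    duplicate_event_source : true,
    ...
}
```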
The following example cURL command is a call from the command line to check on the status of the LAM instance:
[root@server1 moogsoft]# curl -X GET "http://server9:8888"
{"success":true,"message":"Instance is active","statusCode":0}
Follow the steps below to validate that the installation was successful.
Elasticsearch requires Java 11. Java 11 is included as a dependency when you install the RPM packages.
If Elasticsearch fails to start due to an incorrect Java/JDK version, follow these steps.
1. Run the following command to configure the system to use the new Java version:
alternatives --config java
This command prompts you to select which 'java' should be in the system PATH. At the prompt, type the number which corresponds with the Java 11 installation. For example, if the prompt includes:
Selection Command
-----------------------------------------------
*+ 1 java-8-openjdk.x86_64 (/usr/lib/jvm/java-8-openjdk-8.1.0.7-0.el7_6.x86_64/bin/java)
+ 2 java-11-openjdk.x86_64 (/usr/lib/jvm/java-11-openjdk-11.0.2.7-0.el7_6.x86_64/bin/java)
Press 2 and hit Enter. To confirm the change has taken effect, run the following command:
java -version
The output should show openjdk version "11.0.2" 2019-01-15 LTS.
2. Restart Elasticsearch:
service elasticsearch restart
Perform the following steps to ensure that Cisco Crosswork Situation Manager v7.3.x has been successfully installed or upgraded:
1. Check that the UI login page displays "Version 7.3.x" at the bottom.
2. Log into the UI and click the Help icon (question mark) > Support Information. Check that the System Information shows version 7.3.x and the correct schema upgrade history if you have performed an upgrade.
Note:
If you have already completed this step previously (as part of this upgrade process) on the current host, you can skip this step.
Run the Install Validator utility to ensure that all Cisco Crosswork Situation Manager files were deployed correctly in $MOOGSOFT_HOME:
$MOOGSOFT_HOME/bin/utils/moog_install_validator.sh
Run this utility to confirm that all Apache Tomcat files were deployed correctly in $MOOGSOFT_HOME:
$MOOGSOFT_HOME/bin/utils/tomcat_install_validator.sh
If there are webapp differences, run the following command to extract the webapps with the correct files:
$MOOGSOFT_HOME/bin/utils/moog_init_ui.sh -w
Note:
If you have already completed this step previously (as part of this upgrade process) on the current host, you can skip this step.
Run the Database Validator utility to validate the database schema:
$MOOGSOFT_HOME/bin/utils/moog_db_validator.sh
Note:
Some schema differences are valid, for example those related to custom_info (new columns added etc).
An additional required schema upgrade step is documented on the Post-upgrade steps page. Until this has been run, you should expect to see the following differences in the output of the Database Validator utility:
Differences found in 'historic_moogdb' tables:
41,49c41,43
< primary key (`alert_id`),
< unique key `idx_signature` (`signature`),
< key `idx_first_event_time` (`first_event_time`),
< key `idx_state_last` (`state`,`last_state_change`),
< key `idx_severity` (`severity`,`state`),
< key `idx_agent` (`agent`(12)),
< key `idx_source` (`source`(12)),
< key `idx_type` (`type`(12)),
< key `idx_manager` (`manager`(12))
---
> primary key (`signature`),
> key `alert_id` (`alert_id`),
> key `first_event_time` (`first_event_time`,`alert_id`)
93,94c87
< key `timestamp` (`timestamp`,`type`),
< key `idx_type_time` (`type`,`timestamp`)
---
> key `timestamp` (`timestamp`,`type`)
241,242c234
< key `sig_id` (`sig_id`,`action_code`,`timestamp`),
< key `idx_action_sig` (`action_code`,`sig_id`)
---
> key `sig_id` (`sig_id`,`action_code`,`timestamp`)
The differences above will not have any functional impact, but you must complete the rest of the upgrade to ensure the system is performant and the schema is ready for future upgrades.
If you have performed an upgrade and you see errors similar to the following:
Differences found in 'moogdb' tables:
57a58
> key 'filter_id' ('filter_id'),
194a196
> key 'enrichment_static_mappings_ibfk_1' ('eid'),
1196a1199
> key 'sig_id' ('sig_id'),
1325a1329
> key 'filter_id' ('filter_id'),
Run the following commands to resolve these index-related problems:
mysql moogdb -u root -e "alter table alert_filters_access drop key filter_id"
mysql moogdb -u root -e "alter table situation_filters_access drop key filter_id"
mysql moogdb -u root -e "alter table enrichment_static_mappings drop key enrichment_static_mappings_ibfk_1"
mysql moogdb -u root -e "alter table sig_stats_cache drop key sig_id"
After you install the Cisco Crosswork Situation Manager packages and the system is running, you can configure additional components to meet your organization's needs:
· Apply Valid SSL Certificates
· Configure External Authentication
· Configure Search and Indexing
· Configure Services to Restart
· Enable Situation Room Plugins
If you need to troubleshoot Cisco Crosswork Situation Manager:
· Monitor and Troubleshoot Moogsoft AIOps
After your system has been running for some time:
· Configure Historic Data Retention
· Archive Situations and Alerts
You can configure the various components of Cisco Crosswork Situation Manager using the system configuration file. These include:
· Message Bus
· Databases
· Search
· Failover
· Process monitoring
· Web host address
· Logging
Edit the configuration file to control the behavior of the different components in your Cisco Crosswork Situation Manager system. You can find the file at $MOOGSOFT_HOME/config/system.conf.
See the System Configuration Reference for a full description of all properties. Some properties in the file are commented out by default. Uncomment properties to configure and enable them.
Message Bus
You can edit your Message Bus and RabbitMQ configuration in the mooms section of the file. It allows you to:
· Configure your Message Bus zones and brokers.
· Control and minimize message loss during a failure.
· Control how senders handle Message Bus failures.
· Control what happens during periods of extended Message Bus unavailability.
· Configure the SSL protocol you want to use.
· Specify the number of connections to use for each message sender pool.
For more information see the Message Bus documentation.
Database
You can edit your database configuration in the mysql section of the file:
1. Configure your host name, database names and database credentials:
— host: Name of your host.
— moogdb_database_name: Name of the Moogdb database.
— referencedb_database_name: Name of the Cisco Crosswork Situation Manager reference database.
— intdb_database_name: Name of the Cisco Crosswork Situation Manager integrations database.
— username: Username for the MySQL user that accesses the database.
— encrypted_password: Encrypted password for the MySQL user.
— password: Password for the MySQL user.
— port: Default port that Cisco Crosswork Situation Manager uses to connect to MySQL.
2. Configure the port, deadlock retry attempts and multi-host connections:
— maxRetries: Maximum number of retries in the event of a MySQL deadlock.
— retryWait: Number of milliseconds to wait between each retry attempt.
— failover_connections: Hosts and ports for the different servers that are connected to the main host.
3. Configure the SSL connections to the MySQL database:
— trustStorePath: Path to location that stores the server certificate.
— trustStoreEncryptedPassword: Path to location that stores your encrypted trustStore password.
— trustStorePassword: Path to location that stores your trustStore password.
Elasticsearch
You can edit your search configuration in the search section of the file:
1. Configure the Elasticsearch connection timeouts:
— connection_timeout: Length of time in milliseconds before the connection times out.
— request_timeout: Length of time in milliseconds before the request times out.
2. Configure the Elasticsearch limit and nodes:
— limit: Maximum number of search results that Elasticsearch returns from a search query.
— nodes: Hosts and ports for the different Elasticsearch servers connected in a cluster.
Failover
You can edit failover configuration in the failover section of the file:
1. Configure persistence in the event of a failover:
a. persist_state: Enable or disable the persistence of the state of all Moolets in the event of a failover.
2. Configure the Hazelcast cluster, which is the Cisco Crosswork Situation Manager implementation of persistence:
— network_port: Port to connect to on each specified host.
— auto_increment: Enable for Hazelcast to attempt to connect to the next available port number if the configured port is unavailable.
— hosts: List of hosts that can participate in the cluster.
— man_center: Configures the cluster information that you can view in the Hazelcast Management Center UI.
— cluster_per_group: Enable the stateful information from each process group to persist in a dedicated Hazelcast cluster.
3. Configure failover options that apply to Moogfarmd and the LAMs:
— keepalive_interval: Time interval in seconds at which processes report their active/passive status and check statuses of other processes.
— margin: Amount of time in seconds after keepalive_interval before Cisco Crosswork Situation Manager considers processes that do not report their status to be dead.
— failover_timeout: Number of seconds to wait for previously active process to become passive during a manual failover.
— automatic_failover: Allow a passive process to automatically become active if no other active processes are detected in the same process group.
— heartbeat_failover_after: Number of consecutive heartbeats that a process fails to send before Moogfarmd considers it inactive.
Process Monitor
You can edit the process monitor configuration in the process_monitor section of the file:
1. Configure the heartbeat interval and delay:
— heartbeat: Interval in milliseconds between heartbeats sent by processes.
— max_heartbeat_delay: Number of milliseconds to wait before declaring heartbeat as missing.
2. Configure the Moogfarmd and which processes you can control from the UI:
— group: Name of the group of processes and subcomponent processes that you want to control from the UI.
— instance: Name of the instance of Cisco Crosswork Situation Manager you want to configure.
— service_name: Name of the service you want to control.
— process_type: Type of process you want to control.
— reserved: Determines if Cisco Crosswork Situation Manager considers the process as critical in process monitoring.
Encryption
You can edit the encryption configuration in the encryption section of the file:
— encryption_key_file: Default location of the encryption key file.
High Availability
You can edit the high availability configuration in the ha section of the file.
— cluster: Default HA cluster name.
Service Port Range
You can edit the port range that Cisco Crosswork Situation Manager services use when they look for open ports.
— port_range_min: Minimum port number in the range.
— port_range_max: Maximum port number in the range.
Example
The following example shows system.conf with the default configuration and all available properties enabled:
{
"mooms": {
"zone": "",
"brokers": [{
"host": "localhost",
"port": 5672
}],
"username": "moogsoft",
"password": "m00gs0ft",
"encrypted_password": "e5uO0LY3HQJZCltG/caUnVbxVN4hImm4gIOpb4rwpF4=",
"threads": 10,
"message_persistence": false,
"message_prefetch": 100,
"max_retries": 100,
"retry_interval": 200,
"cache_on_failure": false,
"cache_ttl": 900,
"connections_per_producer_pool": 2,
"confirmation_timeout": 2000,
"ssl": {
"ssl_protocol": "TLSv1.2",
"server_cert_file": "server.pem",
"client_cert_file": "client.pem",
"client_key_file": "client.key"
}
},
"mysql": {
"host": "localhost",
"moogdb_database_name": "moogdb",
"referencedb_database_name": "moog_reference",
"intdb_database_name": "moog_intdb",
"username": "ermintrude",
"encrypted_password": "vQj7/yom7e5ensSEb10v2Rb/pgkaPK/4OcUlEjYNtQU=",
"password": "m00",
"port": 3306,
"maxRetries": 10,
"retryWait": 50,
"failover_connections": [
{
"host": "193.221.20.24",
"port": 3306
},
{
"host": "143.47.254.88",
"port": 3306
},
{
"host": "234.118.117.132",
"port": 3306
}
],
"ssl": {
"trustStorePath": "etc/truststore",
"trustStoreEncryptedPassword": "vQj7/yom7e5ensSEb10v2Rb/pgkaPK/4OcUlEjYNtQU=",
"trustStorePassword": "moogsoft"
}
},
"search": {
"connection_timeout": 1000,
"request_timeout": 10000,
"limit": 1000,
"nodes": [{
"host": "localhost",
"port": 9200
}]
},
"failover": {
"persist_state": false,
"hazelcast": {
"network_port": 5701,
"auto_increment": true,
"hosts": ["localhost"],
"man_center":
{
"enabled": false,
"host": "localhost",
"port": 8091
},
"cluster_per_group": false
},
"keepalive_interval": 5,
"margin": 10,
"failover_timeout": 10,
"automatic_failover": false,
"heartbeat_failover_after": 2
},
"process_monitor": {
"heartbeat": 10000,
"max_heartbeat_delay": 1000,
"processes": [{
"group": "moog_farmd",
"instance": "",
"service_name": "moogfarmd",
"process_type": "moog_farmd",
"reserved": true,
"subcomponents": [
"AlertBuilder",
"Default Cookbook",
"TeamsMgr",
"Housekeeper",
"AlertRulesEngine",
"SituationMgr",
"Notifier"
]},
{
"group": "servlets",
"instance": "",
"service_name": "apache-tomcat",
"process_type": "servlets",
"reserved": true,
"subcomponents": [
"moogsvr",
"moogpoller",
"toolrunner",
"situation_similarity"
]},
{
"group": "logfile_lam",
"instance": "",
"service_name": "logfilelamd",
"process_type": "LAM",
"reserved": false
},
{
"group": "rest_lam",
"instance": "",
"service_name": "restlamd",
"process_type": "LAM",
"reserved": false
},
{
"group": "socket_lam",
"instance": "",
"service_name": "socketlamd",
"process_type": "LAM",
"reserved": false
},
{
"group": "trapd_lam",
"instance": "",
"service_name": "trapdlamd",
"process_type": "LAM",
"reserved": false
},
{
"group": "rest_client_lam",
"instance": "",
"service_name": "restclientlamd",
"process_type": "LAM",
"reserved": false
}
]
},
"encryption": {
"encryption_key_file": "/location/of/.key"
},
"ha": {
"cluster": "MOO"
},
"port_range_min": 50000,
"port_range_max": 51000
}
Start and Stop Moogfarmd
Restart the Moogfarmd service to activate any changes you make to the system configuration file.
The service name is moogfarmd.
See Control Moogsoft AIOps Processes for further details.
This is a reference for the system configuration file located at $MOOGSOFT_HOME/config/system.conf. It contains the following sections and properties:
The number of connections to use for each message sender pool. For example, if a message sender pool has 20 channels and this property is set to 2, the channels are split across both connections so that each has 10 channels. To configure this property, you must manually add it to the mooms section.
Type |
Integer |
Required |
No |
Default |
2 |
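Because the property is not present by default, you add it to the mooms section of system.conf by hand; a sketch, with the other mooms properties elided:

```
"mooms" :
{
    ...
    "connections_per_producer_pool" : 2,
    ...
}
```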
Name of the zone.
Type |
String |
Required |
No |
Default |
N/A |
Hostname and port number of the RabbitMQ broker.
Type |
Array |
Required |
No |
Default |
"host" : "localhost", "port" : 5672 |
Username of the RabbitMQ user. This needs to match the RabbitMQ broker configuration. If commented out, it uses the default "guest" user.
Type |
String |
Required |
No |
Default |
guest |
Password for the RabbitMQ user. You can use either a password or an encrypted password, but not both.
Type |
String |
Required |
Yes. If you are not using encrypted password. |
Default |
guest |
Encrypted password for the RabbitMQ user. You can use either a password or an encrypted password, but not both. See Moog Encryptor if you want to encrypt your password.
Type |
String |
Required |
Yes. If you are not using password. |
Default |
N/A |
Number of threads a process can create in order to consume the messages from the Message Bus. If not specified, the thread limit = (Number of processors x 2) + 1. Altering this limit affects the performance of Cisco Crosswork Situation Manager processes such as Moogfarmd and Moogpoller.
If your logs indicate an issue in creating threads, Cisco advises that you increase the ulimit, the maximum number of file descriptors each process can use, for the Cisco Crosswork Situation Manager user. You can set this limit in /etc/security/limits.conf.
Type |
Integer |
Required |
No |
Default |
10 |
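For example, raising the open-file limit for the service user in /etc/security/limits.conf might look like the following (the moogsoft user name and the 65536 value are assumptions; use the account your Cisco Crosswork Situation Manager processes run as and a limit appropriate to your system):

```
# /etc/security/limits.conf (sketch)
moogsoft  soft  nofile  65536
moogsoft  hard  nofile  65536
```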
Controls whether RabbitMQ persists important messages. Message queues are durable by default and data is replicated between nodes in High Availability mode. Setting this value to false means that replicated data is not stored to disk.
Type |
Boolean |
Required |
No |
Default |
true |
Controls how many messages a process can take from the Message Bus and store in memory as a buffer for processing. This configuration allows processes to regulate message consumption which can ease backlog and memory consumption issues. The higher the number, the more messages held in the process's memory.
Type |
Integer |
Required |
No |
Default |
0 |
Maximum number of attempts to resend a message that failed to send. Cisco Crosswork Situation Manager only attempts a retry when there is a network outage or if cache_on_failure is enabled.
You can use this in conjunction with the retry_interval property. For example, a combination of 100 maximum retries and 200 milliseconds for retry interval leads to a total of 20 seconds. The combined default value for these properties was chosen to handle the typical time for a broker failover in a clustered environment.
Type |
Integer |
Required |
No |
Default |
100 |
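The 20-second figure quoted above is simply the product of the two defaults, assuming a fixed interval between attempts:

```shell
# Retry window = max_retries x retry_interval (fixed-interval assumption)
max_retries=100
retry_interval_ms=200
echo "$(( max_retries * retry_interval_ms )) ms total"   # prints "20000 ms total"
```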
Length of time in milliseconds to wait between each attempt to resend a message that failed to send.
You can use this in conjunction with the max_retries property. The combined value for these properties was chosen to handle the typical time for broker failover in a clustered environment.
Type |
Integer |
Required |
No |
Default |
200 |
Controls whether Cisco Crosswork Situation Manager caches the message internally and resends it if there is an initial retry failure. The system attempts to resend any cached messages in the order they were cached until the time-to-live value, defined by the cache_ttl property, is reached.
Type |
Boolean |
Required |
No |
Default |
false |
Length of time in seconds that Cisco Crosswork Situation Manager keeps cached messages in the cache list before discarding them. If a message is not successfully resent within this timeframe it is still discarded.
This defaults to 900 seconds (15 minutes). Increasing this value has a direct impact on sender process memory.
Type |
Integer |
Required |
No |
Default |
900 |
Length of time in milliseconds to wait for the Message Bus to confirm that a broker has received a message. Cisco does not advise changing this value.
Type |
Integer |
Required |
No |
Default |
2000 |
SSL protocol you want to use. JRE 8 supports "TLSv1.2", "TLSv1.1", "TLSv1" or "SSLv3".
Type |
String |
Required |
No |
Default |
TLSv1.2 |
Path to the directory that contains the SSL certificates. You can use a relative path based upon the $MOOGSOFT_HOME directory. For example, config indicates $MOOGSOFT_HOME/config.
Type |
String |
Required |
No |
Default |
server.pem |
Enables client authentication if you provide a client certificate and key file.
Type |
String |
Required |
No |
Default |
client.pem |
Enables client authentication if you provide a client key file. The file must be in PKCS#8 format.
Type |
String |
Required |
No |
Default |
client.key |
Host name or server name of the server that is running MySQL.
Type |
String |
Required |
No |
Default |
localhost |
Name of the primary Cisco Crosswork Situation Manager database.
Type |
String |
Required |
No |
Default |
moogdb |
Name of the Cisco Crosswork Situation Manager reference database.
Type |
String |
Required |
No |
Default |
moog_reference |
Name of the integrations database.
Type |
String |
Required |
No |
Default |
moog_intdb |
Username of the MySQL user.
Type |
String |
Required |
No |
Default |
ermintrude |
Password for the MySQL user. You can use either a password or an encrypted password, but not both.
Type |
String |
Required |
Yes, if you are not using encrypted password. |
Default |
m00 |
Encrypted password for the MySQL user. You can use either a password or an encrypted password, but not both. See Moog Encryptor if you want to encrypt your password.
Type |
String |
Required |
Yes, if you are not using password. |
Default |
N/A |
Port that MySQL uses.
Type |
Integer |
Required |
No |
Default |
3306 |
Maximum number of MySQL query retries to attempt in the event of a deadlock.
Type |
Integer |
Required |
No |
Default |
10 |
Length of time in milliseconds to wait between retry attempts.
Type |
Integer |
Required |
No |
Default |
50 |
Hosts and ports for the different servers that are connected to the main host, for example in a master-master or master-slave setup. In the event of connection failover, the connection cannot be read-only (a slave).
Type |
List |
Required |
No |
Default |
N/A |
Path to the directory that contains the trustStore you want to use for SSL connections to your MySQL database. You can use a relative path based upon the $MOOGSOFT_HOME directory. For example, config indicates $MOOGSOFT_HOME/config/truststore.
Type |
String |
Required |
No |
Default |
etc/truststore |
Your encrypted trustStore password. You can use either a trustStore password or an encrypted trustStore password, but not both. See Moog Encryptor if you want to encrypt your password.
Type |
String |
Required |
Yes, if you are not using trustStorePassword. |
Default |
N/A |
Your trustStore password. You can use either a trustStore password or an encrypted trustStore password, but not both.
Type |
String |
Required |
No, if you are not using trustStoreEncryptedPassword. |
Default |
moogsoft |
Length of time in milliseconds before the connection to the Elasticsearch server times out.
Type |
Integer |
Required |
No |
Default |
1000 |
Hosts and ports for the different Elasticsearch servers connected in a cluster.
Type |
Array |
Required |
No |
Default |
"host" : "localhost", "port" : 9200 |
Enable or disable the persistence of the state of all Moolets in the event of a failover.
Type |
Boolean |
Required |
No |
Default |
false |
Port to connect to on each specified host in your Hazelcast cluster.
Type |
Integer |
Required |
No |
Default |
5701 |
Enable for Hazelcast to attempt to connect to the next incremental available port number if the configured port is unavailable.
Type |
Boolean |
Required |
No |
Default |
true |
List of hosts that can participate in the cluster.
Type |
Array |
Required |
No |
Default |
localhost |
Specifies the cluster information that you can view in the Hazelcast Management Center UI.
Type |
List |
Required |
No |
Default |
"enabled" : false, "host" : "localhost", "port" : 8091 |
Enable the stateful information from each process group to persist in a dedicated Hazelcast cluster.
Type |
Boolean |
Required |
No |
Default |
false |
Time interval in seconds at which processes report their active or passive status and check statuses of other processes.
Type |
Integer |
Required |
No |
Default |
5 |
Amount of time in seconds after keepalive_interval before Cisco Crosswork Situation Manager considers processes that do not report their status to be dead.
Type |
Integer |
Required |
No |
Default |
10 |
Amount of time in seconds to wait for previously active process to become passive during manual failover.
Type |
Integer |
Required |
No |
Default |
10 |
Allow a passive process to automatically become active if no other active processes are detected in the same process group.
Type |
Boolean |
Required |
No |
Default |
false |
Interval in milliseconds between heartbeats sent by processes.
Type |
Integer |
Required |
Yes |
Default |
10000 |
Number of milliseconds to wait before declaring heartbeat as missing. Defaults to 10% of the heartbeat.
Type |
Integer |
Required |
No |
Default |
1000 |
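The default of 1000 ms follows directly from the 10% rule and the default heartbeat of 10000 ms:

```shell
# max_heartbeat_delay defaults to 10% of the heartbeat interval
heartbeat_ms=10000
echo "$(( heartbeat_ms / 10 )) ms"   # prints "1000 ms"
```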
Groups of processes that you want to be able to stop, start and restart from Self Monitoring in the Cisco Crosswork Situation Manager UI. For each group you can configure the following options:
Name of the process group that Cisco Crosswork Situation Manager uses when it starts and stops the service.
Type |
String |
Required |
Yes |
Default |
N/A |
Name of the instance for the process.
Type |
String |
Required |
Yes |
Default |
N/A |
Additional identification label that appears in the UI.
Type |
String |
Required |
No |
Default |
N/A |
Name of the process's cluster. This overrides the default cluster for a process. If left empty, Cisco Crosswork Situation Manager uses the process's default cluster.
Type |
String |
Required |
No |
Default |
N/A |
Name of the service script that Cisco Crosswork Situation Manager uses to control the process. If you do not configure a service name, Cisco Crosswork Situation Manager uses the group name, removing underscores and appending a 'd'. For example, "trapd_lam" becomes "trapdlamd".
Type |
String |
Required |
No |
Default |
N/A |
Type of process. If left empty, Cisco Crosswork Situation Manager calculates the type based on the group name.
Type |
String |
Required |
No |
Default |
N/A |
Valid Values |
moog_farmd, servlet, LAM |
Determines if the process produces a warning in the UI when it is not running. Processes that are unreserved do not produce a warning.
Type |
Boolean |
Required |
No |
Default |
true |
Specifies which Moolets are reserved for the Moogfarmd process. If left empty, no Moolets are reserved for the Moogfarmd process.
Type: Array | Required: No | Default: N/A
Default location of the encryption key file.
Type: String | Required: No | Default: /location/of/.key
Default HA cluster name.
Type: String | Required: No | Default: MOO
Minimum port number in the range that the Cisco Crosswork Situation Manager services use when they look for open ports.
Type: String | Required: No | Default: 50000
Maximum port number in the range that the Cisco Crosswork Situation Manager services use when they look for open ports.
Type: String | Required: No | Default: 51000
Integrations and LAMs handle data ingestion from your event sources into Cisco Crosswork Situation Manager.
Many monitoring and ticketing systems can be configured by using an integration in the UI. Go to the Integrations tab to see what is available.
If you want to set properties that are not visible in the integration, or configure for high availability, modify the LAM configuration file instead. For each data source you can configure either the integration or the LAM, not both. A UI integration is independent from a LAM and you cannot edit it outside the UI.
You can find information about specific integrations and LAMs in the Integrations Guide.
Custom_info fields are customizable fields relating to either an Alert or a Situation that can be added to Cisco Crosswork Situation Manager during configuration.
These will be displayed in the UI as columns in the Alerts and Situations Views and can be configured with optional sorting and filtering.
Note:
The custom_info commands can be found in the /usr/share/moogsoft/bin/utils folder.
The following commands can be used to add either Alert or Situation custom_info fields:
Command | Description
moog_add_alert_custom_field | Adds a new Alert custom_info field
moog_add_sitn_custom_field | Adds a new Situation custom_info field
To configure the display name, the field name and indexing, there are a number of options that can be used:
Option | Description
-d, --display_name <arg> | The display name of the field in the UI
-f, --field <arg> | The custom_info field name
-i, --index | Indicates that the field is indexed for filtering and sorting. Note: this cannot be used with display-only fields. If you plan to use this custom_info field in Alert or Situation filters, or to sort by this column, we recommend using the --index option to aid filter loading performance. Too many indexed columns may affect the performance of additions.
-l, --loglevel <arg> | Specify INFO, WARN or ALL to select the amount of debug output
-o, --display_only | Indicates that the field is for display only and cannot be used to filter, sort or search
-s, --size <arg> | The index size (the number of characters). Valid for indexed text fields only. The default is 50.
-t, --type <arg> | The type of field (number or text). The default is number.
The example below shows how to add an alert custom_info text field which is also indexed, and so will be filterable:
[root@moogsoft ~]# moog_add_alert_custom_field -d newfield -f new_field -i -t TEXT
Adding Custom Info Example
The addition of the new custom info field is confirmed with a message similar to the following:
Field newfield was added to UI successfully
Filterable field custom_info.new_field was added successfully
There is a utility that allows you to fill the Alerts or Situations filterable custom info fields using retrospective data:
Command | Description
moog_fill_alert_custom_fields | Fills the filterable Alert custom info fields using retrospective data
moog_fill_sitn_custom_fields | Fills the filterable Situation custom info fields using retrospective data
The amount of time the fill utility goes back and the log level can be configured using the following options:
Option | Description
-b, --back <arg> | Defines how far back the fill utility goes, with 's' for seconds, 'm' for minutes, 'h' for hours, 'd' for days and 'w' for weeks. For example, -b 2w for two weeks. Note: you can leave this empty to fill all data, but this might take some time.
-l, --loglevel <arg> | Specify INFO, WARN or ALL to choose the amount of debug output
Filling Custom_Info Example
The example below shows how to fill Situation custom info fields with retrospective data from the past three days:
[root@centos7 ~]# moog_fill_sitn_custom_fields -b 3d
Filterable custom info data was filled successfully
The following commands can be used to remove previously configured Alert or Situation custom info fields:
Command | Description
moog_remove_alert_custom_field | Removes an Alert custom info field
moog_remove_sitn_custom_field | Removes a Situation custom info field
After entering the command, type -f and enter the custom info field name to select the field you want to remove.
Removing Custom_Info Example
The example below shows how to remove a custom info field called 'new_field'.
[root@moogsoft ~]# moog_remove_alert_custom_field -f new_field
Field custom_info.new_field was removed successfully
If custom info columns are added and existing Alerts or Situations contain values in those columns, you must run a utility for those values to be filterable in the UI. Alerts or Situations that are new or updated after the new column has been added are filterable automatically.
If an alert custom info field has been added, run $MOOGSOFT_HOME/bin/utils/moog_fill_alert_custom_fields.
If a Situation custom info field has been added, run $MOOGSOFT_HOME/bin/utils/moog_fill_sitn_custom_fields.
Cisco Crosswork Situation Manager divides incoming data into tokens (tokenising) and then assembles the tokens into an event. You can control how this tokenising works.
The first two parsing properties are a start and an end character. The square brackets [] are the JSON notation for a list. You can have multiple start and end characters. The system considers an event to be all of the tokens between any start and end character.
start : [],
end : ["\n"],
The above example specifies:
• There is nothing defined in start; however, a carriage return (new line) is defined as the end character
In the example above, the LAM is expecting an entire line to be written followed by a return, and it will process the entire line as one event.
With careful setup, you can accept multi-line events.
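The framing behaviour described above can be sketched as follows (illustrative Python, assuming an empty start list and a newline end character; this is not the actual LAM code):

```python
def frame_events(stream, end_chars=("\n",)):
    """Split a raw character stream into events, one per end character.

    Mimics the LAM's framing when 'start' is empty and 'end' is ["\n"]:
    everything up to each newline is treated as one event.
    """
    events, current = [], []
    for ch in stream:
        if ch in end_chars:
            events.append("".join(current))
            current = []
        else:
            current.append(ch)
    return events

# Two lines, each followed by a newline, become two events.
print(frame_events("host down\nhost up\n"))
```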
Regular expressions can be used to extract relevant data from the input data. Here's an example definition:
parsing:
{
type: "regexp",
regexp:
{
pattern : "(?m)^START: (.*?)$",
capture_group: 1,
tokeniser_type: "delimiters",
delimiters:
{
ignoreQuotes: true,
stripQuotes: true,
ignores: "",
delimiter: ["||","\r"]
}
}
}
Delimiters define how strings are split into tokens for processing. For example, to process a comma-separated file, where a comma separates each value, define the comma as a delimiter.
Tokens are referenced by their position, starting at one (not zero).
For example, for the input string “the,cat,sat,on,the,mat” where the delimiter is a comma, token 1 is “the”, token 2 “cat” and so on.
Combining tokenization and parsing can be complex. For example, if you use a comma delimiter and the token contains a comma, the token is split into two. To avoid this you can quote strings. You can then define whether to strip or ignore quotes.
An example delimiters section in a configuration file is as follows:
delimiters:
{
ignoreQuotes : true,
stripQuotes : false,
ignores : "",
delimiter : [",","\r"]
}
When ignoreQuotes is set to true, all quotes are ignored and inputs are tokenised on the delimiters only.
When ignoreQuotes is false, delimiting does not occur until the matching end quote is found. This allows tokens to include delimiters. For example, given the following input when the delimiter is a comma:
hello world, "goodbye, cruel world".
Found tokens when ignoreQuotes is true: [hello world, goodbye, cruel world] (3).
Found tokens when ignoreQuotes is false: [hello world, "goodbye, cruel world"] (2).
Set stripQuotes to true to remove start and end quotes from tokens. For example, "hello world" results in a single token: [hello world].
Ignores is a list of characters to ignore. Ignored characters are never included in tokens.
Delimiter is the list of valid delimiters used to split strings into tokens.
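The quote handling described above can be sketched as follows (illustrative Python; whether quote characters survive under ignoreQuotes, and the whitespace trimming shown here, are assumptions, not the shipped tokeniser's exact behaviour):

```python
def tokenise(text, delimiter=",", ignore_quotes=True, strip_quotes=False):
    """Split text on a delimiter with quote handling as described above."""
    tokens, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            if ignore_quotes:
                continue                  # quotes carry no meaning: drop them
            in_quotes = not in_quotes     # delimiting pauses inside quotes
            if not strip_quotes:
                current.append(ch)        # keep the quote in the token
        elif ch == delimiter and not in_quotes:
            tokens.append("".join(current).strip())  # trimmed for readability
            current = []
        else:
            current.append(ch)
    tokens.append("".join(current).strip())
    return tokens

line = 'hello world, "goodbye, cruel world"'
print(tokenise(line, ignore_quotes=True))   # three tokens
print(tokenise(line, ignore_quotes=False))  # two tokens, quotes kept
```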
For each event in the file, there is a positioned collection of tokens. Cisco Crosswork Situation Manager enables you to name these positions. If a line has a large number of tokens but you are interested in only five or six of them, instead of remembering that a value is token number 32, you can give token 32 a meaningful name.
variables:
[
{ name: "Identifier", position: 1 },
{ name: "Node", position: 4 },
{ name: "Serial", position: 3 },
{ name: "Manager", position: 6 },
{ name: "AlertGroup", position: 7 },
{ name: "Class", position: 8 },
{ name: "Agent", position: 9 },
{ name: "Severity", position: 5 },
{ name: "Summary", position: 10 },
{ name: "LastOccurrence",position: 1 }
]
The above example specifies:
• Position 1 is assigned to Identifier; position 4 is assigned to Node, and so on
• Positions start at 1 and count up, rather than from 0 as in array-style indexing
This is important because at the bottom of the socket_lam.conf file there is a mapping object that configures how Cisco Crosswork Situation Manager assigns values from the parsed tokens to the attributes of the event that is sent to the Message Bus. For example, mapping contains a value called rules, which is a list of assignments.
mapping:
{
catchAll: "overflow",
rules:
[
{ name: "signature", rule: "$Node:$Serial" },
{ name: "source_id", rule: "$Node" },
{ name: "external_id", rule: "$Serial" },
{ name: "manager", rule: "$Manager" },
{ name: "source", rule: "$Node" },
{ name: "class", rule: "$Class" },
{ name: "agent", rule: "$LamInstanceName" },
{ name: "agent_location", rule: "$Node" },
{ name: "type", rule: "$AlertGroup" },
{ name: "severity", rule: "$Severity", conversion: "sevConverter" },
{ name: "description", rule: "$Summary" },
{ name: "first_occurred", rule: "$LastOccurrence" ,conversion: "stringToInt"},
{ name: "agent_time", rule: "$LastOccurrence",conversion: "stringToInt"}
]
}
In the example above, the first assignment, name: "signature", rule: "$Node:$Serial" (where "$Node:$Serial" is a string using $ substitution syntax), means: for signature, take the tokens called Node and Serial, form a string with the value of Node followed by a colon followed by the value of Serial, and call that signature in the event that is sent to Cisco Crosswork Situation Manager.
You define a number of these rules covering the base attributes of an event. For reference, Cisco Crosswork Situation Manager expects a minimum set of attributes in an event, as shown in this section.
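The substitution performed by such a rule can be sketched as follows (illustrative Python; apply_rule is a hypothetical helper, not part of the product):

```python
import re

def apply_rule(rule, tokens):
    """Replace each $Name in a mapping rule with the named token's value."""
    return re.sub(r"\$(\w+)", lambda m: str(tokens.get(m.group(1), "")), rule)

tokens = {"Node": "acmeSvr01", "Serial": "1234"}
print(apply_rule("$Node:$Serial", tokens))  # the resulting signature value
```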
Using braces within mapping definitions allows you to include URLs and special characters. For example:
mapping:
{
rules:
[
{ name: "type", rule: "${https://url}" },
{ name: "type", rule: "${https://url} customText" },
{ name: "type", rule: "${https://url}${keyA\\b\\c}" }
]
}
Escape backslashes (\\) and note that you cannot embed variables.
Attributes that are never referenced in a rule, for example an "enterprise trap number" that is never mapped into an event attribute, are collected into a JSON object, placed in the variable named by catchAll, and passed as part of the event.
Custom Info Mapping
You can define custom_info mapping in LAM configuration files. This allows you to configure a hierarchical structure. An example mapping configuration is:
mapping:
{
rules:
[
{ name: "custom_info.eventDetails.branch", rule: "$branch" },
{ name: "custom_info.eventDetails.location", rule: "$location" },
{ name: "custom_info.ticketing.id", rule: "$incident_id" }
]
}
This produces the following custom_info structure:
"custom_info": {
"eventDetails": {
"branch":"Kingston",
"location":"KT1 1LF"
},
"ticketing": {
"id":94111
}
}
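How dotted rule names expand into this nested structure can be sketched as follows (illustrative Python; build_custom_info is a hypothetical helper, not the LAM's implementation):

```python
def build_custom_info(rules):
    """Turn dotted mapping names into a nested custom_info dictionary."""
    custom_info = {}
    for name, value in rules:
        parts = name.split(".")[1:]          # drop the leading "custom_info"
        node = custom_info
        for part in parts[:-1]:
            node = node.setdefault(part, {})  # create nested levels as needed
        node[parts[-1]] = value
    return custom_info

rules = [
    ("custom_info.eventDetails.branch", "Kingston"),
    ("custom_info.eventDetails.location", "KT1 1LF"),
    ("custom_info.ticketing.id", 94111),
]
print(build_custom_info(rules))
```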
Polling LAMs with multiple target support
For information on LAMs with multiple target support, see Polling LAMs With Multiple Target Support.
The filter defines whether a LAM uses a LAMbot. A LAMbot moves overflow properties to custom info and performs any actions that are configured in its LAMbot file. The LAMbot processing is defined in the presend property in the filter section of the LAM configuration file.
For example, the SolarWinds LAM configuration file contains this filter section:
filter:
{
modules : ["CommonUtils.js"],
presend : "SolarWindsLam.js"
}
This indicates that SolarWindsLam.js processes the events and then sends them to the Message Bus.
If you don’t want to map overflow properties, you can comment out the presend property to bypass the LAMbot and send events straight to the Message Bus. This speeds up processing if you have a high volume of incoming alerts. Alternatively, you can define a custom stream to receive events. See Alert Builder for details.
See LAMbot Configuration for more information on the presend function.
The optional modules property can be used to provide a list of JavaScript files that are loaded into the context of the LAMbot and executed. It allows LAMs to share modules. For example, you can write a generic Syslog processing module that is used in both the Socket LAM and the Logfile LAM. This reduces the need for duplicated code in each LAMbot.
Conversion rules are used by Cisco Crosswork Situation Manager to convert received data into a usable format, including severity levels and timestamps.
Severity
The following example looks up the value of severity and returns the mapped integer.
conversions:
{
sevConverter:
{
lookup : "severity",
input : "STRING",
output : "INTEGER"
},
},
constants:
{
severity:
{
"CLEAR" : 0,
"INDETERMINATE" : 1,
"WARNING" : 2,
"MINOR" : 3,
"MAJOR" : 4,
"CRITICAL" : 5,
moog_lookup_default : 3
}
}
In the above example:
· conversions receives a text value for severity.
· sevConverter uses a lookup table "severity" to reference a table named severity defined in the constants section.
· The integer value matching the text value is returned.
· moog_lookup_default is used to specify a default value when a received event does not map to a listed value.
For example, the text value "MINOR" is received and the integer value 3 is returned.
If moog_lookup_default is not used and a received event severity does not map to a specifically listed value, the event is not processed.
See Severity Reference for more information about the severity levels in Cisco Crosswork Situation Manager.
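The lookup behaviour, including the moog_lookup_default fallback, can be sketched as follows (illustrative Python, not the shipped converter):

```python
# The severity constants table from the configuration above.
SEVERITY = {
    "CLEAR": 0, "INDETERMINATE": 1, "WARNING": 2,
    "MINOR": 3, "MAJOR": 4, "CRITICAL": 5,
}

def sev_converter(text, lookup=SEVERITY, default=3):
    """Map a severity string to its integer; unknown values fall back to
    the moog_lookup_default value (3, MINOR, in this example)."""
    return lookup.get(text, default)

print(sev_converter("MINOR"))     # 3
print(sev_converter("UNKNOWN"))   # 3, via the default
```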
Time
Time conversion in Cisco Crosswork Situation Manager supports the Java platform standard API specification. See Simple Date Format for more information.
Some Unix time formats are indirectly supported and LAM logging indicates any automatic conversion that occurred at startup.
The only PCRE/Perl modifier that is automatically converted is the lone 'U' (ungreedy) modifier; PCRE's '-U' is not supported. If a pattern contains '-U', remove it manually.
You can specify a time zone configuration so the LAM parses the incoming timestamps with the expected time zone. For example:
conversions:
{
timeUnitConverter:
{
timeUnit : "MILLISECONDS",
input : "STRING",
output : "INTEGER"
},
timeConverter:
{
timeFormat : "%Y-%m-%dT%H:%M:%S",
timeZone : "UTC",
input : "STRING",
output : "INTEGER"
}
}
You can specify the timezone name or abbreviation. See List of TZ Database Time Zones for the full list.
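As an illustration of how a timeFormat/timeZone conversion turns a timestamp string into an integer (a Python sketch; the LAM's own converter is implemented differently):

```python
from datetime import datetime, timezone

def time_converter(value, fmt="%Y-%m-%dT%H:%M:%S", tz=timezone.utc):
    """Parse a timestamp in the configured format and zone to epoch seconds."""
    return int(datetime.strptime(value, fmt).replace(tzinfo=tz).timestamp())

print(time_converter("2021-01-01T00:00:00"))  # 1609459200
```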
All LAMs also have the native ability to consume JSON events. You must define a carriage return as the end character, because the LAM expects a whole JSON object followed by the carriage return.
Under parsing you have:
end: ["\n"],
For the delimiter you have:
delimiter: ["\r"]
JSON is a sequence of attribute/value pairs, and each attribute is used as a name. Under mapping, you must define the attribute builtInMapper: "CJsonDecoder". Prior to the rules being run, it automatically populates variables with all of the values contained in the JSON object.
For example if the JSON object to be parsed was:
{"Node" : "acmeSvr01","Severity":"Major"...}\n
The attributes available to the rules in the mapping section would be $Node="acmeSvr01", $Severity="Major" and so on.
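The decoding step can be sketched as follows (illustrative Python; the real CJsonDecoder exposes the values as $-prefixed variables to the mapping rules):

```python
import json

def decode_json_event(line):
    """Parse one terminated JSON event into attribute/value tokens."""
    return json.loads(line)

tokens = decode_json_event('{"Node": "acmeSvr01", "Severity": "Major"}\n')
print(tokens["Node"], tokens["Severity"])  # acmeSvr01 Major
```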
Polling LAMs that support multiple targets contain a targets property in the configuration file. The following polling LAMs have multiple target support:
· CA Spectrum
· DataDog Client
· Dynatrace APM
· HP NNMi
· HP OMi
· JDBC
· New Relic
· New Relic Insight
· Rest Client
· SevOne
· SolarWinds
· VMware vCenter
· VMware vRealize Log Insight
· VMware vSphere
· Zabbix
· Zenoss
For these LAMs, the event payload includes the target name and target URL. These are written to custom_info.eventDetails.moog_target_name and custom_info.eventDetails.moog_target_url:
var overflow = commonUtils.getOverflow(event);
event.set("overflow", null);
var eventDetails = {
"moog_target_name": overflow.moog_target_name,
"moog_target_url": overflow.moog_target_url
};
event.setCustomInfoValue("eventDetails", eventDetails);
These values are available in the LAMbot functions, and can be enabled or disabled if required.
Severity is a measure of the seriousness of an event and indicates how urgently it requires corrective action.
Cisco Crosswork Situation Manager LAMs and integrations use six industry standard severity levels as follows:
· 0: Clear - One or more events have been reported but then subsequently cleared, either manually or automatically.
· 1: Indeterminate - The severity level could not be determined.
· 2: Warning - A number of faults with the potential to affect services have been detected.
· 3: Minor - A fault that is not affecting services has been detected. Action may be required to prevent it from becoming a more serious issue.
· 4: Major - A fault is affecting services and corrective action is required urgently.
· 5: Critical - A serious fault is affecting services and corrective action is required immediately.
The severity mapping is set in each LAM configuration file:
severity:
{
"CLEAR" : 0,
"INDETERMINATE" : 1,
"WARNING" : 2,
"MINOR" : 3,
"MAJOR" : 4,
"CRITICAL" : 5,
}
The LAM takes the severity string in a received event and translates it into one of the above integer values using the mapping in its configuration file:
sevConverter:
{
lookup : "severity",
input : "STRING",
output : "INTEGER"
},
mapping:
rules:
[
{ name: "severity", rule: "$severity",conversion:"sevConverter"},
]
You can customize the severity section of the LAM configuration file according to the severities used in the system sending events to Cisco Crosswork Situation Manager. In the following example, events sent to the LAM with non-standard severities 'info' and 'Information' are mapped to 'INDETERMINATE' in Cisco Crosswork Situation Manager:
severity:
{
"info" : 1,
"Information" : 1,
"user" : 1,
"warning" : 2,
"Warning" : 2,
"error" : 5,
moog_lookup_default : 1
}
The moog_lookup_default property specifies a default value to use when the severity does not match any of the defined strings. If you do not set a default, events with an unmapped severity are not processed. For more information on mapping see "Conversion Rules" in Data Parsing.
Cisco Crosswork Situation Manager determines a Situation's severity from the member alert with the highest severity level.
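This rule reduces to taking the maximum severity across a Situation's member alerts; as a minimal sketch:

```python
def situation_severity(alert_severities):
    """A Situation takes the severity of its highest-severity member alert."""
    return max(alert_severities)

# A Situation whose members are WARNING (2), MINOR (3) and CRITICAL (5):
print(situation_severity([2, 3, 5]))  # 5 (CRITICAL)
```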
Moogfarmd is the core system application that runs all of the algorithms and automation relevant to Cisco Crosswork Situation Manager. It is responsible for the following:
· Creating alerts.
· Analyzing alerts to determine their significance.
· Clustering alerts into Situations.
· Performing automated response actions such as escalation, routing, notification, and invitations for alerts or Situations.
The topics in this guide help you configure the data processing components of Moogfarmd:
You can run one or many instances of Moogfarmd on your Cisco Crosswork Situation Manager system.
The Cisco Crosswork Situation Manager installation installs Moogfarmd as a service:
/etc/init.d/moogfarmd
A backup Moogfarmd service script is located at $MOOGSOFT_HOME/etc/service-wrappers/moogfarmd.
If you run multiple instances of Moogfarmd on the same host, copy and modify the default Moogfarmd service script for each Moogfarmd running on the host:
1. Copy $MOOGSOFT_HOME/etc/service-wrappers/moogfarmd to /etc/init.d/mymoogfarmd.
2. Edit the following parameters in the /etc/init.d/mymoogfarmd file:
SERVICE_NAME=mymoogfarmd
CONFIG_FILE=$PROCESS_HOME/config/my_moog_farmd.conf
3. You now have a new service to be used to start your own specific Moogfarmd:
service mymoogfarmd start
For information on starting, stopping and configuring Moogfarmd, see the Moogfarmd Reference.
Cisco Crosswork Situation Manager processes alerts using the following backend components. For alert processing capabilities using the Workflow Engine in the Cisco Crosswork Situation Manager UI, see Workflow Engine (/document/preview/110725#UUID3bd5018041a19de941d95733dffc3e37) and its related topics.
These components are responsible for performing analysis, adding information to alerts, and applying noise reduction techniques.
· Events Analyser: A standalone process that analyses tokens in events and assigns each token an entropy value. The Events Analyser can use any text field in an event but, by default, it uses the event's description. This process runs periodically and does not form a part of the alert processing workflow.
· Alert Builder: Processes events from the Message Bus. It:
— Deduplicates events into alerts.
— Calculates the entropy of alerts.
· Enricher: Enriches alerts with additional information.
· Maintenance Window Manager: Marks alerts as 'In maintenance' if they match a scheduled maintenance window filter. You can set up maintenance windows for planned maintenance, such as scheduling a fix or regular maintenance of a system.
· Alert Rules Engine: Allows conditional processing of alerts, such as managing link up/link down processing. Before you configure the Alert Rules Engine, read about the Workflow Engine (/document/preview/110725#UUID3bd5018041a19de941d95733dffc3e37), which is a powerful and flexible tool for data processing available in the Cisco Crosswork Situation Manager UI.
· Empty Moolet: An optional component that enables further processing of alerts or Situations. It usually runs as a standalone process but it can also be embedded in the processing chain. Cisco Crosswork Situation Manager provides an example Empty Moolet in the form of an Alert Manager.
The following diagram shows the alert processing components in a typical implementation of a workflow chain in Cisco Crosswork Situation Manager:
Each component comprises a Moolet supplemented by Moobots.
· Stream and Partition-based Analysis
· Natural Language Processing Analysis
— Language Processing Techniques
The Events Analyser utility is a standalone process. It uses Natural Language Processing (NLP) techniques to analyze inbound event data. The Events Analyser divides text fields within the events into tokens. Based on the frequency of these tokens appearing in other events, it assigns an entropy value to the tokens and to the alerts in Cisco Crosswork Situation Manager. See the Entropy Overview for more information on how Cisco Crosswork Situation Manager evaluates entropy and uses entropy thresholds to reduce the level of 'noise' from incoming event data.
Stream and partition-based analysis
You can configure Cisco Crosswork Situation Manager so that the Events Analyser calculates the entropy values for events from different streams for Cisco Crosswork Situation Manager as a whole, even though those streams have no relationship with each other.
You can also configure the Events Analyser so that it calculates the entropy values for events for different partitions. As an example, you may want to run separate entropy calculations for different regions. In this case, you should specify the alert field that identifies the region in the partition_by field in the Events Analyser configuration file. In this type of configuration, the same token can be given multiple entropy values within the same Moogfarmd deployment based on its frequency in the events within each partition. You can set up different configuration options for the different partitions. For example, in a particular partition, IP addresses may be masked whilst for another partition that may be unnecessary. In general, if a deployment uses the “pre-partition” method in Moogfarmd, that deployment benefits from partition-based entropy calculations.
See Multiple Streams and Partitions for more information on running the Events Analyser with different streams and partitions. See Configure Events Analyser for further information on non-partitioned and partitioned configurations.
Natural language processing analysis
The Events Analyser utility performs a number of linguistic analyses on events. It then uses this linguistic analysis to calculate an entropy value for each token and then for every alert. See the Entropy Overview for more information.
Tokenization of text
The Events Analyser splits a text string at word boundaries, such as spaces or punctuation marks, into blocks. Each block of text is known as a token. For example, the following description has five tokens:
Link down on port 2/32
Token type identification
Commonly used word boundaries are often integral to the meaning of a token, for example, dots in IPV4 addresses. The Events Analyser identifies complete tokens of the following types within the structure of an event:
· IP addresses:
— v4
— v6
· MAC addresses
· OIDs
· Dates: Most standard formats.
· Numbers:
— Integers
— Real numbers
— With and without unit suffixes, for example, 99%, 12kb, 345ms.
· File paths:
— Forward slashes
— Backward slashes
· GUIDs
· Hexadecimal numbers: With the 0x prefix.
· URLs
· Email addresses: Most standard formats.
Identifying token types in arbitrary text is not an exact science and so, occasionally, the algorithms may identify tokens as a certain type which seems incorrect to a human.
After the Events Analyser has identified the token types, it can use them for masking and to identify tokens with high variation in a given alert.
Token masking
Tokens that change between events for the same alert can cause that alert to be assigned an incorrectly high entropy value. The most obvious example involves dates and times. If the description of an event is to be analyzed but each event contains a different timestamp, that timestamp will have a high entropy and skew the entropy for that alert as a whole. For other token types that change frequently, such as URLs or IP addresses, it may be desirable to retain the higher entropy associated with that token type because the changing value is significant.
You can configure the Events Analyser to include or exclude specific token types in the entropy analysis for each event partition.
You should consider masking dates, times and numbers from the entropy calculation.
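Masking before the entropy calculation can be sketched like this (illustrative Python with hypothetical patterns and placeholders; the Events Analyser's own token type detection covers far more types):

```python
import re

# Hypothetical masking patterns for two volatile token types; the real
# Events Analyser detects many more (IPs, MACs, OIDs, GUIDs, URLs, ...).
MASKS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\b"), "<timestamp>"),
    (re.compile(r"\b\d+(?:\.\d+)?%?"), "<number>"),
]

def mask_tokens(text):
    """Replace masked token types with placeholders before entropy analysis."""
    for pattern, placeholder in MASKS:
        text = pattern.sub(placeholder, text)
    return text

print(mask_tokens("QDepth beyond 90% threshold at 2021-01-01T00:00:00"))
```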
Language processing techniques
The Events Analyser uses many standard techniques in language processing:
· Case folding
— Tokens that differ only by case, for example, 'WORD', 'Word' or 'word', are converted to the same case and considered equal.
— Case folding is applied to all token types.
· Stop words
— You can add common or meaningless words, such as 'a', 'be', 'not', to a stop words file so that they are removed from the entropy calculation.
— You can define a universal 'length' parameter so that any word at or below a certain length is treated as a stop word. For example, if set to '2', any words of one or two characters are ignored.
— Stop words are applied to all token types.
· Stemming
— A technique used to reduce a word to its root to remove plurals or different tenses in verbs. Words with the same root are considered equal.
— Note that some words, when stemmed, look unusual. For example, 'priority', 'priorities' and 'prioritize' are stemmed to 'priorit'.
— If stemming is enabled, the stemmed form is stored in the reference database.
— Stemming is only applied to tokens of type 'word', that is, it is not applied to numbers, GUIDs, IP addresses, etc.
Priority words
Priority words are similar in concept to stop words but, rather than removing that word from the analysis as occurs with stop words, a priority word is assigned an entropy value of 1. For example, if ‘reboot’ is defined as a priority word, any tokens containing the word ‘reboot’ are given an entropy value of 1 regardless of how frequently the word appears in events.
Note:
Priority words are analyzed after stop words. If a token satisfies the criteria of a stop word, it is removed from the analysis and so cannot subsequently be considered as a priority word.
The reference database contains the calculated entropies for all tokens regardless of whether they are classed as priority words.
Token variation threshold
Token variation threshold analysis involves the different forms of each field and how the tokens in those different forms vary between events in the same alert. This is most easily explained by an example. Assume that all token masking is off and that an alert consists of the following six events:
QDepth beyond 90% threshold on host = 22222
QDepth beyond 90% threshold on host = 44444
QDepth beyond 90% threshold on host = 44444
QDepth beyond 90% threshold on host = 11111
QDepth beyond 90% threshold on host = 44444
The value for the host changes between events: there are three occurrences of 44444 and one occurrence of each of the other values. Values that appear infrequently can skew the entropy value for the alert. To prevent this skewing, you can apply a threshold. The threshold is a ratio between 0 and 1, where 0 means that a token can appear only once and still contribute to the entropy calculation, while 1 means that the value must be the same in every event before it is considered. If the threshold is set to 0.5, the value 44444 would contribute to the entropy, but the values 11111 and 22222 would not, because only the value 44444 appears in half of the events in the alert.
The Events Analyser performs this analysis for each form of each field within each event of every alert.
This configuration option has no effect unless the Events Analyser uses the EntropyClassic algorithm. The EntropyV2 algorithm is more robust to small variations in wording and in metadata such as IP addresses and timestamps, so it needs no manual tuning parameter.
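The threshold test can be sketched as a per-value frequency ratio (illustrative Python; the six host values, including 33333, are hypothetical example data):

```python
from collections import Counter

def contributing_values(values, threshold=0.5):
    """Keep the values whose frequency ratio across events meets the threshold."""
    counts = Counter(values)
    total = len(values)
    return {v for v, n in counts.items() if n / total >= threshold}

hosts = ["22222", "44444", "44444", "33333", "11111", "44444"]  # six events
print(contributing_values(hosts, threshold=0.5))  # only 44444 appears in half
```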
Entropy is defined as the degree of disorder or randomness in a system. In Cisco Crosswork Situation Manager, entropy is a measure of how unexpected or unpredictable an event or an alert is. According to information theory, the more unpredictable or unexpected an event is, the more information it is deemed to carry. Therefore, entropy is a measure of the amount of information contained in an event.
The Events Analyser utility is a standalone process that assigns an entropy value to an event token based on its uniqueness. The Alert Builder assigns an entropy value to each alert based on the token entropies. The entropy value is a numeric value between 0 and 1 (accurate to 16 decimal places). It provides an indication of how important an alert is. An entropy value of 0 means that the alert is just 'noise' and a value of 1 means that the alert is significant. You can configure the clustering algorithms to ignore common alerts with a low entropy value; this reduces 'noise' in Cisco Crosswork Situation Manager. See the Clustering Algorithm Guide (/document/preview/11776#UUID78f3c171ff4093b4dce3b6750fd89e09) for more information.
How Cisco Crosswork Situation Manager evaluates entropy
The Events Analyser utility analyzes the text attributes of events to assign a semantic entropy value. In the default Cisco Crosswork Situation Manager implementation, the Events Analyser uses the description field, but you can configure it to use other text fields. The Events Analyser splits the text on spaces into tokens. For example, the following description has five tokens:
Link down on port 2/32
The Events Analyser calculates the entropy of each token and stores the token in the Cisco Crosswork Situation Manager reference database with its associated entropy value. Initially, a new token has a value of 1. The Events Analyser reduces this entropy value as more events occur which contain the same token.
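As a rough illustration of this behavior — a token starts at an entropy of 1 and its value falls as the token recurs — the sketch below tokenizes on spaces and applies an invented decay function. The real calculation is internal to the Events Analyser; only the tokenization and the general shape of the decay are taken from the text above:

```python
from collections import Counter
from math import log

def token_entropies(descriptions):
    """Tokenize each description on spaces and score tokens by rarity.

    A token seen once scores 1.0; the score falls as the token recurs.
    The 1/(1 + log(count)) decay is illustrative only.
    """
    counts = Counter(tok for d in descriptions for tok in d.split())
    return {tok: 1.0 / (1.0 + log(c)) for tok, c in counts.items()}

events = [
    "Link down on port 2/32",
    "Link down on port 2/31",
    "Power supply failure",
]
scores = token_entropies(events)
# "Link" appears twice, so it scores lower than "Power", seen only once.
```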
You can configure the Events Analyser to mask volatile token types, such as dates, times, numbers, URLs or IP addresses, so that they are not included in the tokens. See the Events Analyser for further details of the analysis it performs.
The Alert Builder uses the entropy value of the tokens within an alert to calculate the entropy of that alert.
The Events Analyser uses the EntropyV2 calculation method in the default Cisco Crosswork Situation Manager implementation. The EntropyV2 method calculates entropy values in real-time based on any tokens it has encountered before. The Alert Builder assigns the entropy of an alert based on the entropy value of the tokens within the alert rather than the entire database. Tokens within an alert which occur frequently contribute negatively to the entropy of an alert, indicating that the alert may not be as significant as an alert with tokens that are seen less frequently. This is in contrast to the EntropyClassic algorithm where the entropy of each alert takes into consideration the significance of tokens in the entire database.
Note:
Cisco recommends the EntropyV2 algorithm because it produces better alert entropy values than the EntropyClassic algorithm.
If the Alert Builder receives an event with a token that it has encountered before, from a previous run of the Events Analyser, it sets the alert entropy to match the value saved in the reference database. If the Alert Builder receives an event with a token that it has not encountered before, it calculates the entropy value in real-time and applies this value to the alert. The Alert Builder also saves the entropy value in the reference database for future retrieval.
The Events Analyser stores data in memory while it calculates entropy values. Run the Events Analyser frequently to ensure that it does not run out of memory. See Run Events Analyser for more information on running the Events Analyser.
Set an entropy threshold
You can set an entropy threshold in each Sigaliser so that only alerts with a higher entropy value are included in Situations. To decide on the value of your entropy threshold, consider the distribution of entropy values in the alerts. A typical entropy value distribution is shown in the following diagram:
Cisco recommends that you set your entropy threshold to a value on the downward slope of the peak to exclude the majority of alerts. In this example, the entropy threshold is set at 0.21. This reduces the level of ‘noise’ so that you are only clustering the important alerts, with an entropy value greater than the threshold, into Situations.
You can define entropy thresholds in the clustering algorithms to exclude alerts which have an entropy value that is lower than the threshold. This prevents Cisco Crosswork Situation Manager from including unimportant 'noisy' alerts in Situations. See the Clustering Algorithm Guide for more information.
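In a clustering algorithm, applying such a threshold amounts to a simple filter. The alert records below are hypothetical; only the threshold value 0.21 comes from the example above:

```python
alerts = [
    {"alert_id": 1, "entropy": 0.05},  # routine noise
    {"alert_id": 2, "entropy": 0.21},  # exactly at the threshold
    {"alert_id": 3, "entropy": 0.87},  # rare, information-rich
]

ENTROPY_THRESHOLD = 0.21

# Only alerts with an entropy strictly above the threshold are
# considered for clustering into Situations.
clusterable = [a for a in alerts if a["entropy"] > ENTROPY_THRESHOLD]
```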
Vertex Entropy
Vertex Entropy uses a different form of entropy, topological entropy, to establish how critical the nodes are in your network topology. You can use Vertex Entropy calculations within Cookbook to create Situations which cluster alerts from important nodes. See Vertex Entropy for more information.
You can configure the Events Analyser to analyze all the event data received by Cisco Crosswork Situation Manager together or to analyze event data by partitions or streams. See the Events Analyser for more information on these options.
· Example of partitioned data
· Disabling entropy calculations
Edit the configuration file at $MOOGSOFT_HOME/config/events_analyser.conf to control the behavior of the Events Analyser.
See the Events Analyser Reference for a full description of all properties. Some properties in the file are commented out by default. Uncomment the properties to enable them.
To configure the Events Analyser:
1. The default configuration uses the EntropyV2 calculation method. Cisco recommends EntropyV2 because it has improved modeling of alert probabilities; however, you can change the setting to use the EntropyClassic calculation method. Entropy data for the two calculation methods is not compatible: if you switch between them, you must execute a full priming run of the Events Analyser after changing the setting so that all entropy data matches the same configuration. See Run Events Analyser for further details on executing a full priming run of the Events Analyser.
2. Use the default values for the priming_source_data.
3. Configure whether or not the Events Analyser partitions the entropy data. See the Example of non-partitioned data and the Example of partitioned data for further details.
4. Configure the "default" Events Analyser behavior. See the Example of non-partitioned data for further details.
5. If using partitioned data, configure the Events Analyser for any partitions that you want to behave differently. If you do not add a separate configuration for a partition, the Events Analyser uses the "default" configuration for that partition. The Events Analyser also uses the "default" configuration for any properties that are not defined in a partition configuration. See the Example of partitioned data for further details.
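The fallback behavior in step 5 — partition settings overriding the "default" block property by property — can be pictured as a merge with defaults. The helper below is a hypothetical sketch, not part of the product, and performs only a shallow merge:

```python
def effective_config(default_cfg, partition_overrides, partition):
    """Return the default settings with any per-partition overrides applied."""
    merged = dict(default_cfg)
    merged.update(partition_overrides.get(partition, {}))
    return merged

default_cfg = {"fields": ["description"], "stop_words": True, "stemming": False}
overrides = {
    "san_francisco": {"fields": ["description", "agent", "source"],
                      "stop_words": False},
}

sf = effective_config(default_cfg, overrides, "san_francisco")
la = effective_config(default_cfg, overrides, "los_angeles")  # no overrides
```

Note that overrides of nested blocks such as mask apply per key, so a full implementation would merge recursively rather than shallowly.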
Example of non-partitioned data
The default configuration file at $MOOGSOFT_HOME/config/events_analyser.conf contains a non-partitioned configuration similar to the example below. The "partition_by" property is set to null to indicate that the entropy data is not partitioned. The "default" settings apply to all entropy calculations. See the Events Analyser Reference for further information on these properties.
{
"entropy_calc": "EntropyV2",
"priming_source_data" :
{
"alerts_table" : "alerts",
"events_table" : "events",
"snapshots_table" : "snapshots",
"timestamp_column" : "last_event_time"
},
"partition_by" : null,
"default" :
{
"fields" :
[
"description"
],
"mask" :
{
"ip_address" : false,
"mac_address" : false,
"oid" : false,
"date_time" : true,
"number" : true,
"path" : false,
"guid" : false,
"hex" : false,
"url" : false,
"email" : false,
"word" : false,
"stop_word" : false
},
"casefold" : true,
"stop_words" : true,
"stop_word_length" : 0,
"stop_word_file" : "stopwords",
"priority_words" : false,
"priority_word_file" : "prioritywords",
"stemming" : false,
"stemming_language" : "english"
}
}
Example of partitioned data
The example below shows additional configuration of the Events Analyser for two partitions "san_francisco" and "new_york". These settings override the "default" configuration in the example of non-partitioned data above.
In this example, the source field is used to partition the entropy data:
"partition_by" : "source",
The configuration for the "san_francisco" partition uses the description, agent and source fields for calculating entropy values and does not use stop words. The "new_york" partition uses different masking properties from the "default" configuration: date_time is not masked, but ip_address, email, and url are masked. This partition also uses stemming for calculating entropy values; since no language is specified, the default of English is used. Any other properties that are not configured in these partitions use the properties in the "default" configuration.
If there are any other partitions, for example, "los_angeles", that do not have any properties specified in the configuration file, they will use the "default" configuration.
See the Events Analyser Reference for further information on these properties.
, "partition_overrides" :
{
"san_francisco" :
{
"fields" :
[
"description", "agent", "source"
],
"stop_words" : false
},
"new_york" :
{
"mask" :
{
"date_time" : false,
"ip_address" : true,
"email" : true,
"url" : true
},
"stemming" : true
}
}
Disabling entropy calculations
Cisco recommends that you configure the clustering algorithms to use entropy thresholds so that they exclude 'noisy' alerts which contain low levels of important information. This allows operators to concentrate on Situations containing important alerts. See the Clustering Algorithm Guide for more information. However, if you do not intend to use entropy calculations, you should:
· Set the 'entropy_calc' property to 'EntropyClassic'.
· Set the 'properties_from_db' property to 'false' for all running Alert Builder Moolets.
The Events Analyser analyzes the tokens within alerts and calculates their entropy values. It updates the alerts with the calculated entropy values and also updates the reference database with all the tokens and their associated entropy values.
· Run Events Analyser manually
· Multiple streams and partitions
Command line options
The events_analyser command line executable accepts the following options:
Option | Input | Description
--config <arg> | String: <file path/name> | Name and path of the configuration file specific to running the Events Analyser. The default is events_analyser.conf. Example: --config=$MOOGSOFT_HOME/etc/events_analyser.conf
-l, --loglevel <arg> | One of: ALL, INFO, WARN, NONE | Specifies the amount of logging information. Defaults to WARN, which is the recommended level in all production implementations.
--incremental | - | Analyzes only new event data received since the last time the Events Analyser was run.
--readage <arg> | Number, followed by one of: s (seconds), m (minutes), h (hours), d (days), w (weeks) | Amount of data to analyze, in seconds, minutes, hours, days or weeks. Example: --readage 2w
--keepage <arg> | Number, followed by one of: s (seconds), m (minutes), h (hours), d (days), w (weeks) | Amount of data to keep, in seconds, minutes, hours, days or weeks. Example: --keepage 30d
--stream <arg> | String: <alert stream name> | Stream name to be given to the current analysis. Example: --stream "PRIMARY"
--partition <arg> | String: <partition value> | Name of the partition to be analyzed. It must be a valid value of the partition_by field. Example: --partition "SanFrancisco"
Run Events Analyser
Cisco recommends that you run the Events Analyser regularly as follows:
· Daily: analyzes the last two weeks of data.
· Hourly: runs in incremental mode, which analyzes all new event data since the last time the Events Analyser was run.
These default settings are specified in moog_init_server.sh.
You can also run the Events Analyser manually on an ad hoc basis.
Daily run
To initiate a daily run, that is, to calculate entropy values for the last two weeks of event data, run the Events Analyser with the following command line option:
./events_analyser --readage 2w
In this case, the Events Analyser:
· Uses the default configuration file $MOOGSOFT_HOME/etc/events_analyser.conf.
· Analyzes all data received in the last two weeks, based on the timestamp_column property in the events_analyser.conf file.
· Adds all analyzed data to the reference database for the default stream.
· Leaves any data for other, named streams unchanged.
Hourly run
The Events Analyser utility provides the ability for incremental priming. When the Events Analyser utility is run repeatedly with the --incremental option, each subsequent run of the utility analyzes the event data starting from the last analyzed event. For example, if the first run analyzes data up to event ID 666, the next incremental run analyzes data from 667 to, say, 999; the third incremental run reads data from event ID 1000 onwards, and so on.
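This incremental behavior can be pictured as a stored high-water mark: each run analyzes only events beyond the last analyzed event ID, then advances the mark. The sketch below is illustrative; the persistence details and event structure are invented:

```python
def incremental_run(events, last_seen_id):
    """Analyze only events with an ID greater than the stored mark,
    then return the new events and the advanced mark."""
    new_events = [e for e in events if e["event_id"] > last_seen_id]
    new_mark = max((e["event_id"] for e in new_events), default=last_seen_id)
    return new_events, new_mark

events = [{"event_id": i} for i in range(1, 1001)]
first, mark = incremental_run(events, 0)      # analyzes events 1..1000
second, mark = incremental_run(events, mark)  # nothing new to analyze yet
```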
To initiate an hourly run, that is, to calculate entropy values for all data since the last analyzed event, run the Events Analyser with the following command line option:
./events_analyser --incremental
In this case, the Events Analyser:
· Uses the default configuration file $MOOGSOFT_HOME/etc/events_analyser.conf.
· Analyzes all data since the last incremental run, based on the timestamp_column property in the events_analyser.conf file.
· Adds all analyzed data to the reference database for the default stream.
· Leaves any data for other, named, event streams unchanged.
Run Events Analyser manually
To run the Events Analyser manually, run it without any command line options. This analyzes all new event data received in the last two weeks or since the last analysis, whichever is most recent.
./events_analyser
In this case, the Events Analyser:
· Uses the default configuration file $MOOGSOFT_HOME/etc/events_analyser.conf.
· Analyzes all event data received in the last two weeks or since the last time the Events Analyser was run, whichever is most recent, based on the timestamp_column property in the events_analyser.conf file.
· Adds all analyzed data to the reference database for the default stream.
· Leaves any data for other, named, event streams unchanged.
To analyze event data over a longer period, include the --readage option. In this example, the --readage option is set to 13 weeks:
./events_analyser --readage 13w
In this case, the Events Analyser:
· Uses the default configuration file $MOOGSOFT_HOME/etc/events_analyser.conf.
· Analyzes all event data received in the last 13 weeks.
· Adds all analyzed data to the reference database for the default stream.
· Leaves any data for other, named, event streams unchanged.
Note:
If you use a large value in the --readage option, you may find that the Events Analyser fails to complete the analysis. If this occurs, rerun it using a shorter period of time.
Multiple streams and partitions
You can run the Events Analyser for specific streams or partitions. In this example, the --stream option is specified to add the analyzed data to the "SECONDARY" event stream. The --readage option restricts the data analyzed to the last eight weeks of event data.
./events_analyser --stream "SECONDARY" --readage 8w
In this case, the Events Analyser:
· Uses the default config file $MOOGSOFT_HOME/etc/events_analyser.conf.
· Analyzes all event data received in the last eight weeks, based on the timestamp_column property in the events_analyser.conf file.
· Adds all analyzed data to the reference database for the “SECONDARY” event stream.
· Leaves data for all other, named, event streams unchanged.
You can use the --partition option to limit the data that is analyzed to a specified partition. In this example, the --readage option restricts the data analyzed to the last four weeks of event data:
./events_analyser --stream "SECONDARY" --partition "SanFrancisco" --readage 4w
In this case, the Events Analyser:
· Uses the default config file $MOOGSOFT_HOME/etc/events_analyser.conf.
· Analyzes all event data received in the last four weeks for the “SanFrancisco” partition only.
· Adds all analyzed data to the reference database for the “SanFrancisco” partition in the “SECONDARY” event stream.
· Leaves data for all other event streams and partitions unchanged.
Note:
Cisco recommends that you always use the --readage option when analyzing streams or partitions to ensure that the Events Analyser processes the required amount of data. If the --readage option is not specified, the Events Analyser only analyzes new event data received in the last two weeks or since the last analysis, whichever is the most recent, regardless of whether this was for a different stream or partition.
Usage examples
There are many combinations of command line options. Some common usage scenarios include:
Command Line Options | Typical Use Case
<none> | Incremental events analysis. Uses the default configuration for all new event data in the last two weeks or since the last analysis, whichever is most recent. Updates the reference database with the new data for the default stream.
--readage 4w | Events analysis run nightly. Uses the default configuration for the last four weeks of event data. Updates the reference database with the new data for the default stream.
--incremental | Incremental events analysis run hourly. Uses the default configuration for all new event data since the last run. Updates the reference database with the new data for the default stream.
--incremental --keepage 2w | Incremental events analysis run hourly. Uses the default configuration for all new event data received since the last run. Removes all data from the reference database for the default stream that is more than two weeks old.
--stream "PRIMARY" --partition "London" --readage 13w | Analyzes only those events in the "London" partition. The data is written to the "PRIMARY" event stream. Data for all other streams remains unchanged, as does data for all other partitions in the "PRIMARY" stream.
This is a reference for the Events Analyser utility. The Events Analyser configuration properties are found in $MOOGSOFT_HOME/config/events_analyser.conf.
entropy_calc
Entropy calculation method. Cisco recommends using the EntropyV2 calculation method for more accurate entropy values.
Type: String
Required: Yes
One of: EntropyV2, EntropyClassic
Default: "EntropyV2"
priming_source_data
Source data to use when priming the entropy value database table, that is, running the Events Analyser to calculate entropy values. By default, the priming source data is taken from tables in the main database schema called moogdb. timestamp_column is a column in the snapshots_table.
Type: String
Required: Yes
Default:
{
"alerts_table" : "alerts",
"events_table" : "events",
"snapshots_table" : "snapshots",
"timestamp_column" : "last_event_time"
}
partition_by
Identifies the property in each event that is used to partition events so that they are grouped separately by the Sigalisers. If partitioning is enabled, the following properties can be configured independently for each partition. See Configure Events Analyser for further details on partitions and configuration examples.
Type: String
Required: Yes
Default: null
Example: "partition_by" : "source"
fields
Properties in each event that contribute to the entropy value calculation.
Type: List of strings
Required: Yes
Default: "description"
mask
Token types to be included or excluded from entropy calculations. If a token type is set to false, the entropy calculation includes it. If it is set to true, the entropy calculation excludes the token type. Masking token types, such as dates or numbers, ensures that tokens are not given a higher entropy value than they should have because of unique numbers or dates.
Type: Boolean
Required: No
Default:
{
"ip_address" : false,
"mac_address" : false,
"oid" : false,
"date_time" : true,
"number" : true,
"path" : false,
"guid" : false,
"hex" : false,
"url" : false,
"email" : false,
"word" : false,
"stop_word" : false
}
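Masking a token type effectively replaces matching tokens with a placeholder before entropy is calculated, so a unique number or date cannot inflate a token's entropy. A rough sketch of number and date masking follows; the patterns and placeholder format are simplified inventions, not the product's matchers:

```python
import re

# Simplified patterns; the Events Analyser's own matchers are more thorough.
MASKS = {
    "number": re.compile(r"^\d+$"),
    "date_time": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}

def mask_tokens(tokens, mask_config):
    """Replace tokens of masked types with a placeholder."""
    out = []
    for tok in tokens:
        for name, pattern in MASKS.items():
            if mask_config.get(name) and pattern.match(tok):
                tok = "<%s>" % name
                break
        out.append(tok)
    return out

tokens = "Job 4471 failed on 2023-01-15".split()
masked = mask_tokens(tokens, {"number": True, "date_time": True})
# The job ID and the date collapse to placeholders, so every such
# event contributes the same tokens to the entropy calculation.
```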
casefold
Whether tokens that differ only by case should be considered the same in entropy calculations.
Type: Boolean
Required: Yes
Default: true
stop_words
Whether specific tokens should be ignored in entropy calculations. Stop words are small common words such as 'about', 'at' or 'the'.
Type: Boolean
Required: Yes
Default: true
stop_word_length
Any token of this length or shorter is considered a stop word and is excluded from entropy calculations. The default of 0 means that no words are considered as stop words.
Type: Number
Required: Yes
Default: 0
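The length cutoff is a simple test. The sketch below uses an illustrative setting of 2 rather than the default of 0; the function is invented for the example:

```python
def is_length_stop_word(token, stop_word_length=0):
    """Tokens of this length or shorter are treated as stop words.

    The default of 0 means that no token is excluded by length.
    """
    return len(token) <= stop_word_length

tokens = ["on", "port", "up", "failure"]
# With stop_word_length set to 2, "on" and "up" are excluded.
kept = [t for t in tokens if not is_length_stop_word(t, 2)]
```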
stop_word_file
Path (optional) and name of the file containing a list of stop words to be excluded from entropy calculations. If you provide a file name only, the Events Analyser assumes the path $MOOGSOFT_HOME/config/. The Events Analyser uses the full path if you provide it. The default Cisco Crosswork Situation Manager implementation provides a file named stopwords in $MOOGSOFT_HOME/config/, which contains a list of common stop words.
Type: String
Required: Yes
Default: "stopwords"
priority_words
Whether priority words are included in entropy calculations. Alerts containing priority words are automatically given a maximum entropy value of 1.
Type: Boolean
Required: Yes
Default: false
priority_word_file
Path (optional) and name of the file containing a list of priority words. If you provide a file name only, the Events Analyser assumes the path $MOOGSOFT_HOME/config/. The Events Analyser uses the full path if you provide it. The file prioritywords in $MOOGSOFT_HOME/config/ is empty in the default Cisco Crosswork Situation Manager implementation.
Type: String
Required: Yes
Default: "prioritywords"
stemming
Whether words with the same word stem are considered the same word in entropy calculations. For example, 'fail', 'failed' and 'failing' would all be considered the same word.
Type: Boolean
Required: Yes
Default: false
stemming_language
Language used in the events.
Type: String
Required: Yes
Default: "english"
The Alert Builder Moolet assembles alerts from incoming events, sent by the LAMs across the Message Bus. These alerts are visible through the Alert View in the User Interface (UI). The Alert Builder Moolet is also responsible for:
· Updating all the necessary data structures.
· Ensuring that copies of the old alert state are stored in the snapshot table in MoogDb, that relevant events are created, and that the old alert record is updated to reflect the new events arriving into Cisco Crosswork Situation Manager.
Configure Alert Builder
Edit the configuration file at $MOOGSOFT_HOME/config/moolets/alert_builder.conf.
See Alert Builder Reference for a full description of all properties. Some properties in the file are commented out by default.
Example Configuration
The following example demonstrates a simple Alert Builder configuration:
{
name : "AlertBuilder",
classname : "CAlertBuilder",
run_on_startup : true,
moobot : "AlertBuilder.js",
event_streams : [ "AppA" ],
threads : 4,
metric_path_moolet : true,
events_analyser_config : "events_analyser.conf",
priming_stream_name : null,
priming_stream_from_topic : false
}
Alert Builder Moobot
The Moobot, AlertBuilder.js, is associated with the Alert Builder Moolet. It undertakes most of the activity of the Alert Builder. When the Alert Builder Moolet processes an event, it calls the JavaScript function, newEvent:
events.onEvent("newEvent", constants.eventType("Event")).listen();
The function newEvent contains a call to create an alert. The newly created alert is broadcast on the Message Bus.
See Moobot Modules for further information about Moobots.
This is a reference for the Alert Builder Moolet.
You can change the behavior of the Alert Builder by editing the configuration properties in the $MOOGSOFT_HOME/config/moolets/alert_builder.conf configuration file. It contains the following properties:
name
Name of the Alert Builder Moolet. Do not change.
Type: String
Required: Yes
Default: "AlertBuilder"
classname
Moolet class name. Do not change.
Type: String
Required: Yes
Default: "CAlertBuilder"
run_on_startup
Determines whether the Alert Builder runs when Cisco Crosswork Situation Manager starts. By default, it is set to true, so that when Moogfarmd starts, it automatically creates an instance of the Alert Builder. In this case you can stop it using farmd_ctrl.
Type: Boolean
Required: Yes
Default: true
moobot
Specifies a JavaScript file found in $MOOGSOFT_HOME/moobots, which defines the Alert Builder Moobot, which creates alerts.
Type: String
Required: Yes
Default: AlertBuilder.js
metric_path_moolet
Determines whether or not Cisco Crosswork Situation Manager includes the Alert Builder in the Event Processing metric for Self Monitoring.
Type: Boolean
Required: Yes
Default: true
event_streams
A list of event streams, which the Alert Builder Moolet processes in this instance of Moogfarmd. The LAMs can be configured to send events on different streams. Moogfarmd, as specified in the Alert Builder configuration, then decides whether or not to process them. If Cisco Crosswork Situation Manager runs multiple Moogfarmds, you can have different event streams being processed by different Alert Builder Moolets.
You can comment out event_streams or provide an empty list; the Alert Builder then processes every event published on the default /Events topic on the Message Bus.
You configure the Alert Builder Moolet by giving it a list of strings, for example, [ "AppA", "AppB" ]. The Alert Builder then listens for events published on /Events/AppA and /Events/AppB, and processes that data. Importantly, in this example, events published to /Events or any other stream are ignored. You can have Moogfarmds that process completely separate event streams, or multiple Moogfarmds that process some different event streams and some common event streams. You would do this when some of the alerts are common to all the applications being processed, but some are specific to a given application. In this way, you can cluster alerts separately for each application by configuring the Sigalisers to process only alerts from a specific upstream Alert Builder Moolet.
For example, if you have two separate applications that share the same network infrastructure: in Moogfarmd 1, you can have as the event streams, application A and networks, and, in Moogfarmd 2, you can have application B and networks. With this configuration, you can detect alerts and then create Situations that are relevant for just application A and similarly just for application B; however, if there is common networking infrastructure and problems occur with network failures across applications A and B, the Alert Builder can cluster these into Situations.
Type: List of strings
Required: No
Default: [ "AppA" ]
threads
Specifies the number of threads in the Alert Builder. Choose a value to match the event rate experienced by your system that allows time for alert creation.
Type: Number
Required: Yes
Default: 4
events_analyser_config
Allows you to specify a different Events Analyser configuration, for tokenizing and analysis rules, for each Alert Builder Moolet. If no configuration file is specified, the system default events_analyser.conf is used.
Type: String
Required: No
Default: "events_analyser.conf"
priming_stream_name
Stream name under which the Events Analyser runs in order to calculate token and alert entropies. If set to null, all alerts from all streams are included in the entropy calculations.
Type: String
Required: Yes
Default: null
priming_stream_from_topic
If set to true, Moogfarmd extracts the priming stream name from the event's stream. If set to false, Moogfarmd uses the stream configured in priming_stream_name.
Type: Boolean
Required: Yes
Default: false
Alert and Event Field Reference
This is a reference guide for alert and event fields, input types, field descriptions and output examples.
Field | Type | Description | Example Output
active_situations | Array | Situation IDs of any Situations the alert is associated with. | 1, 6, 8
agent_host | Text | Host machine or physical location of the agent that created the event. | OEM Monitor 1
agent_name | Text | Name of the agent that created the event. | NAGIOS SOCKET
agent_location | Text | Host machine or physical location of the agent that created the event. | London Data Centre (51.4167,-0.2833)
agent_time | Integer | Timestamp of when the event occurred, in epoch time. Use $moog_now in the mapping to set agent time to the time the event arrived at Cisco Crosswork Situation Manager. | 1516183437
alert_id | Integer | Internal identifier generated by Cisco Crosswork Situation Manager. | 101
class | Text | Level of classification for an event. This follows the hierarchy: class, then type. | CISCO-IF-Extension-MIB
count | Integer | Number of events in the alert. | 2
custom_info | Text | Custom information added as a JSON encoded string. | custom_info.myNodeList=[ "node1", "node2", "node3" ]
description | Text | Text description of the alert. | Network Interface (ifIndex = 512479388) Up (ifEntry.52683483)
entropy | Decimal | Measure of uncertainty of an outcome, between 0 and 1 (0 meaning very certain; 1 meaning very uncertain). | 0.4
external_id | Integer | Unique identifier from the event source. | 7622183
first_event_time | Integer | Earliest event time for the alert, calculated from the agent_time of the events that constitute the alert. | 14:08:14 16/01/2018
host | Text | Name of the source machine that generated the event. | OEM Server 2
internal_last_event_time | Integer | Time that the latest event for the alert was received by the Moog server. | 10:24:03 19/01/2018
last_change | Integer | Time that the alert was last updated in the Cisco Crosswork Situation Manager UI. | 12:38:06 19/01/2018
last_event_time | Integer | Latest event time for the alert, calculated from the agent_time of the events that constitute the alert. | 10:24:03 19/01/2018
manager | Text | General identifier of the event generator or intermediary. | NAGIOS, SCOM
owned_by | Text | Alert owner's username. | John Smith
severity | Integer | Severity level of the alert, between 0 and 5. | 4
significance | Integer | Relative significance of an alert, calculated from its entropy. See the Glossary for more information. | 3
situations | Array | Any Situations the alert is associated with, including those that have been resolved or closed. | 24, 01
source | Text | Name of the source machine that generated the event. If there is no source machine or application, the source is the name of the instance (database name, cluster node, container name). | A hostname or fully qualified domain name (FQDN).
source_id | Text | Identifier for the source machine that generated the event. | 5dc68d65-532c-4918-be12-21e1cbcf7af2
status | Text | Status of the alert. | Assigned
type | Text | Level of classification for an event. This follows the hierarchy: class, then type. | CISCO-IF-Extension-MIB Notification
Event and Alert Field Best Practice
This best practice aims to provide consistency and reuse of configurations, including the mapping from a source to an inbound event. The fields exposed in the alert/event are:
|
Field |
Required |
Data Type |
Size |
Description |
Example |
Comment |
1 |
signature |
Yes |
VARBINARY(binary) |
767 |
This is a special attribute used to determine when Cisco Crosswork Situation Manager deduplicates events into Alerts. It can be any combination of one or more of the attributes listed below To be constructed as a subset of events from a source, also see existing guidance Constructed fields should be separated by “::” avoiding any possible issues with concatenation providing misleading results. e.g. NodeA event id 12 would concatenate as NodeA12, which would be the same as NodeA1 event 2. NodeA::12 and NodeA1::2 would therefore differentiate Signatures do not need to be human readable, so clarity isn’t a concern. If length is becoming an issue - remove whitespace or other extraneous characters (via a lambot) |
host1::nagios::cpu |
|
2 |
alert_id |
Yes |
BIGINT(binary) |
20 |
An auto-assigned incremental number. Internally generated DO NOT CHANGE |
|
|
3 |
source_id |
Yes |
TEXT(utf8) |
65535 |
Source and source_id refer to the generating source of the event, primarily the host environment. The source should be a unique human-readable name (FQDN, hostname, etc.) and the source_id should be an identifier for the source machine (IP, MAC, CI number, etc.). If the event has no machine identification, such as application or other software-generated events, then the source should be a unique identifier for the instance (database name, cluster node, container name, etc.) and source_id any other unique identifier that is available (container UUID, cluster node UUID, etc.). This attribute can also hold any additional identification attribute of the CI. |
192.168.1.107 |
|
4 |
external_id |
No |
TEXT(utf8) |
65535 |
Any unique identifier provided in the source event (event ID, incident ID, etc.). This is typically set to the CI's ID in the CMDB or, where the event is emitted from an underlying element management system, may hold the unique source event identifier. |
12345 |
Returns Null if blank |
5 |
manager |
No |
TEXT(utf8) |
65535 |
A general identifier of the event generator or intermediary (NAGIOS, SCOM, etc.). In hub-and-spoke or relay architectures, this is typically the name of the agent manager that pre-aggregates events before sending them to Cisco Crosswork Situation Manager. For example, there may be a BMC Patrol manager that manages all San Francisco data center alerts. This field is also sometimes used simply to track the name of the Cisco Crosswork Situation Manager LAM that received the alerts in multi-LAM deployments. |
Nagios |
Returns Null if blank |
6 |
source |
Yes |
TEXT(utf8) |
65535 |
Source and source_id refer to the generating source of the event, primarily the host environment. The source should be a unique human-readable name (FQDN, hostname, etc.) and the source_id should be an identifier for the source machine (IP, MAC, CI number, etc.). If the event has no machine identification, such as application or other software-generated events, then the source should be a unique identifier for the instance (database name, cluster node, container name, etc.) and source_id any other unique identifier that is available (container UUID, cluster node UUID, etc.). |
host1 |
|
7 |
class |
Yes |
TEXT(utf8) |
65535 |
Class and Type are generic classifications for the event in a hierarchy that allows you to maintain a simple event ontology: class, then type. (Disk space: free space, Memory: max used...total available, etc.) |
cpu |
|
8 |
agent |
Yes |
TEXT(utf8) |
65535 |
The specific agent that created the event (SCOM REST, NAGIOS SOCKET, SNMP TRAP NATIVE, etc.). This is typically the name of the agent that facilitates the event from the CI, e.g. "nagios-agent-london-7". A simple way to provide this is to set the agent name in the lam.conf and map $LamInstanceName to agent; this is the default: { name: "agent", rule: "$LamInstanceName" } |
Linux |
|
9 |
agent_location |
Yes |
TEXT(utf8) |
65535 |
This is typically the geographic location of the agent and/or CI, such as "London". It should be used consistently across all sources: either the host machine that the agent executes from (BEM Server 1, OEM Monitor cluster, etc.) or the physical location where the agent executes (NYC Data Centre, Stuttgart Main Station, (51.407139, -0.307321), etc.). |
New York, NY |
|
10 |
agent_time |
Yes |
|
|
This is the timestamp, in epoch seconds, when the event occurred. It should be set consistently across all event sources to provide a common time reference, with timezones nullified so that all events are presented in the same time context. If an event source does not provide a suitable time in the payload, use the ingestion time at the LAM. Note: polled event sources (rest_client_lam, SCOM, Netcool) may skew the event time in line with the poll cycle. If an event is generated in a different timezone and is manipulated into the Cisco Crosswork Situation Manager server time, add the origin time to the custom_info for the event (e.g. custom_info.originalEventTime); this can be operationally useful. agent_time must be in epoch seconds, so convert as necessary; miscalculated event times cause unpredictable results across the system. Also see the 4.1.2 release note [MOOG-2278] - Enhanced Alert Times. If agent_time is not defined, set it to the current epoch time using a JavaScript expression such as: Math.round(Date.now() / 1000); |
|
|
11 |
type |
Yes |
TEXT(utf8) |
65535 |
Class and Type are generic classifications for the event in a hierarchy that allows you to maintain a simple event ontology: class, then type. (Disk space: free space, Memory: max used...total available, etc.) |
DOWN |
|
12 |
severity |
Yes |
INT(binary) |
11 |
Standard 0-5, but be mindful of the significance across all event sources where possible: a low-value event source could produce critical events that, in the wider context, would be considered minor. Use the built-in "sevMapper" in the Cisco Crosswork Situation Manager LAM config file to map your incoming severity values to a number between 0 and 5: 0 = Clear, 1 = Indeterminate, 2 = Warning, 3 = Minor, 4 = Major, 5 = Critical |
5 |
0 clear - 5 critical |
13 |
significance |
No |
INT(binary) |
11 |
This value is calculated by Cisco Crosswork Situation Manager Events Analyser. Internally generated DO NOT CHANGE |
|
|
14 |
count |
No |
INT(binary) |
11 |
The reference count of deduplicated Events for each Alert. Internally generated DO NOT CHANGE |
|
|
15 |
description |
Yes |
TEXT(utf8) |
65535 |
The main text payload of the event. Add as much textual detail as possible: a human operator will read this detail, and the entropy calculation works best with detailed narratives. |
CPU Threshold exceeded: 99% |
|
16 |
first_event_time |
No |
BIGINT(binary) |
20 |
If you set agent_time in the LAM/LAMbot to the actual epoch seconds timestamp of each event, Cisco Crosswork Situation Manager will automatically keep track of the first and last occurrence of multiple instances of the same event. Internally generated DO NOT CHANGE |
|
|
17 |
last_event_time |
No |
BIGINT(binary) |
20 |
|
|
|
18 |
int_last_event_time |
No |
BIGINT(binary) |
20 |
Internally generated DO NOT CHANGE |
1411134582 |
From agent_time |
19 |
last_state_change |
No |
BIGINT(binary) |
20 |
Internally generated DO NOT CHANGE |
|
|
20 |
state |
No |
INT(binary) |
11 |
1 = Opened, 2 = Unassigned, 3 = Assigned, 4 = Acknowledged, 5 = Unacknowledged, 6 = Active, 7 = Dormant, 8 = Resolved, 9 = Closed, 10 = SLA Exceeded. Internally generated DO NOT CHANGE |
|
|
21 |
owner |
No |
INT(binary) |
11 |
Set when an operator right-clicks on an alert in the Cisco Crosswork Situation Manager UI and assigns ownership. Internally generated DO NOT CHANGE |
|
|
22 |
entropy |
No |
DOUBLE(binary) |
22 |
Internally generated DO NOT CHANGE |
|
|
23 |
custom_info |
No |
TEXT(utf8) |
65535 |
custom_info is a special field that provides the mechanism for extending the Cisco Crosswork Situation Manager alert schema. It is a JSON-encoded string that should contain key-value pairs for each data element not supplied in the initial event or added via alert enrichment. Be consistent with key names so they can be used in Sigalisers and filters. Consider using a LamBot module that sets a base set of custom_info across all LAMs; this provides a single point of administration for the customer. Take care when setting custom_info in a LAM to ensure it does not overwrite downstream additions (e.g. enrichment via a Moobot) when the event is deduplicated. You can store simple or arbitrarily complex hierarchical JSON attributes in this field. They are serialized in the standard JSON.parse/stringify manner, and the Cisco Crosswork Situation Manager UI displays JSON hierarchies of any complexity in a tree-view format. |
|
Returns Null if blank |
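The signature guidance above can be sketched as a small LamBot-style helper. The function name is illustrative, not part of the product; the point is that the "::" separator keeps concatenated fields unambiguous:

```javascript
// Hypothetical helper illustrating the "::" separator guidance for signatures.
// Joining identifying fields with "::" keeps concatenation unambiguous:
// NodeA + event 12 and NodeA1 + event 2 no longer collide as "NodeA12".
function buildSignature() {
    return Array.prototype.slice.call(arguments).join("::");
}

buildSignature("host1", "nagios", "cpu");  // "host1::nagios::cpu"
buildSignature("NodeA", "12");             // "NodeA::12"
buildSignature("NodeA1", "2");             // "NodeA1::2"
```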
The Maintenance Window Manager Moolet compares alerts against active maintenance windows. If the alerts match an active Maintenance Schedule filter, then they are not forwarded onto the next part of the chain. This prevents a Sigaliser Moolet clustering these alerts into Situations.
Configure Maintenance Window Manager
Edit the configuration file at $MOOGSOFT_HOME/config/moolets/maintenance_window_manager.conf.
Refer to Maintenance Window Manager Reference to see all available properties.
Example configuration
The following example demonstrates a simple Maintenance Window Manager configuration:
{
name : "MaintenanceWindowManager",
classname : "CMaintenance",
run_on_startup : true,
metric_path_moolet : true,
process_output_of : "AlertBuilder",
maintenance_status_field : "maintenance_status",
maintenance_status_label : "In maintenance",
update_captured_alerts : true
}
Maintenance windows
You can use the Maintenance Schedule functionality to schedule outages during which you do not want new Situations to be created from affected alerts. You can configure the Maintenance Window Manager Moolet to ensure that alerts are not passed along to Sigalisers and clustered into Situations during that time period. You can set up maintenance windows using:
· UI: See Schedule Maintenance Downtime for more information on how to set up maintenance windows.
· Graze API: See the Graze API documentation.
Updating captured alerts
In addition to implementing the maintenance windows, the Maintenance Window Manager Moolet updates the following custom_info fields in each alert affected by a maintenance window. Because the Maintenance Window Manager uses these custom_info fields within the alerts, Moobots must not overwrite these custom_info fields or completely empty the custom_info object within alerts.
Field |
Description |
custom_info.maintenance_status |
Configurable text label. Set to "In maintenance" by default. |
custom_info.maintenance_id |
Numerical ID of the maintenance window that captured the alert. |
custom_info.maintenance_name |
Name of the maintenance window that captured the alert. |
custom_info.forward_Alerts |
Whether the alert is forwarded to clustering algorithms or not. Set to false by default. |
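Because Moobots must preserve these fields, an enrichment Moobot can merge new keys into custom_info rather than replacing the object. A minimal sketch (the function name and the shape of the update object are assumptions for illustration):

```javascript
// Hypothetical Moobot sketch: merge new enrichment into custom_info without
// clobbering the maintenance fields the Maintenance Window Manager relies on.
function mergeCustomInfo(existing, updates) {
    var merged = existing || {};
    for (var key in updates) {
        // Never overwrite the maintenance_* fields or forward_Alerts.
        if (/^maintenance_/.test(key) || key === "forward_Alerts") {
            continue;
        }
        merged[key] = updates[key];
    }
    return merged;
}
```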
This is a reference for the Maintenance Window Manager Moolet.
Cisco recommends you do not change any properties that are not in this reference guide.
You can change the behavior of the Maintenance Window Manager by editing the configuration properties in the $MOOGSOFT_HOME/config/moolets/maintenance_window_manager.conf configuration file. It contains the following properties:
name
Name of the Maintenance Window Manager Moolet. Do not change.
Type: String
Required: Yes
Default: "MaintenanceWindowManager"
classname
Moolet class name. Do not change.
Type: String
Required: Yes
Default: "CMaintenance"
run_on_startup
Determines whether the Maintenance Window Manager runs when Cisco Crosswork Situation Manager starts. By default, it is set to true, so that when Moogfarmd starts, it automatically creates an instance of the Maintenance Window Manager.
Type: Boolean
Required: Yes
Default: true
metric_path_moolet
Determines whether or not Cisco Crosswork Situation Manager factors the Maintenance Window Manager into the Event Processing metric for Self Monitoring.
Type: Boolean
Required: Yes
Default: true
process_output_of
Defines the input source for the Maintenance Window Manager. This determines the Maintenance Window Manager's place in the alert processing workflow.
Type: List
Required: Yes
One of: AlertBuilder, AlertRulesEngine, Enricher
Default: "AlertBuilder"
maintenance_status_field
Name of the custom_info field or key used to indicate the alert's maintenance status.
Type: String
Required: Yes
Default: "maintenance_status"
maintenance_status_label
Value of the custom_info.maintenance_status field used to indicate that an alert is in maintenance.
Type: String
Required: Yes
Default: "In maintenance"
update_captured_alerts
If enabled, ensures the maintenance status of an alert is set to null once the maintenance window that captured it has expired. If disabled, the maintenance status field of a captured alert retains the text value set in the maintenance_status_label property unless the alert reoccurs, at which point all custom_info maintenance fields are set to null.
Type: Boolean
Required: Yes
Default: true
You can add a column to the alert view that displays the 'Maintenance Status' of each alert. The text visible in this column is controlled by the maintenance_status_label in the MaintenanceWindowManager Moolet configuration in $MOOGSOFT_HOME/config/moolets/maintenance_window_manager.conf.
For the feature to function, you must place the Maintenance Window Manager Moolet before a Sigalising Moolet in the forwarding chain. It is also appropriate to place it before the Alert Rules Engine in the processing chain.
The Alert Rules Engine uses business logic to process alerts based on certain conditions.
Note:
Cisco recommends using the Workflow Engine (/document/preview/110725#UUID3bd5018041a19de941d95733dffc3e37) to enable custom logic and data processing for events, alerts and Situations. Consider carefully whether you can implement your logic with the Workflow Engine before you implement and configure the Alert Rules Engine.
The conditions that the Alert Rules Engine works with generally involve a time-based analysis so that it can process an event in the context of events that happen later. You can define rules in the Alert Rules Engine to hold alerts for a period of time, identify missing alerts or change the state of alerts. For example, common uses of the Alert Rules Engine include:
· Link Up-Link Down: Delays an alert to see if a link recovers.
· Heartbeat Monitor: Detects any missing network health signals.
· Closing Events: Closes events of a particular type or severity.
· Merging: Merges the state of two distinct alerts.
Configure Alert Rules Engine
Edit the configuration file at $MOOGSOFT_HOME/config/moolets/alert_rules_engine.conf.
Refer to Alert Rules Engine Reference to see all available properties.
Example Configuration
The following example demonstrates a simple Alert Rules Engine configuration:
{
name : "AlertRulesEngine",
classname : "CAlertRulesEngine",
run_on_startup : false,
metric_path_moolet : true,
moobot : "AlertRulesEngine.js",
process_output_of : "MaintenanceWindowManager"
}
Define Action States and Transitions
The Alert Rules Engine uses Action States and transitions and their properties, to process alerts through business logic defined in the AlertRulesEngine.js Moobot. After you have configured the Alert Rules Engine, set up Action States and transitions in the Cisco Crosswork Situation Manager UI under Settings > Automation:
· Action States: Determine the length of time Cisco Crosswork Situation Manager retains alerts before forwarding them to a Sigaliser or closing them.
· Transitions: Define the set of conditions an alert must meet before it moves from one state to another in the Alert Rules Engine. Higher-priority transitions take precedence over those with lower priorities.
See Action States and Transitions for further information on how to define them and the properties available.
The initial state for all alerts is the 'Ground' state. After an alert enters 'Ground' state, the Alert Rules Engine transitions it to another state or forwards it to a Sigaliser. If the Action State has a 'Remember Alerts For' set to a positive number, the Alert Rules Engine retains an alert in that state for this period of time.
If you enable 'Cascade on Expiry' and nothing happens to an alert within that period, the Alert Rules Engine returns it to 'Ground' state before forwarding it to a Sigaliser. This is because the 'Ground' state has “Forward Alerts" enabled. If an alert does not match any transitions, the Alert Rules Engine does not return it to 'Ground' state and it is closed.
Note:
Action States are not enabled until you have defined a transition.
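The state flow described above can be sketched as a priority-ordered transition check. This is a hypothetical model, not the product API; the transition objects and the matches function are assumptions for illustration:

```javascript
// Hypothetical sketch of transition evaluation: transitions are checked in
// priority order; the first one whose start state and filter match moves the
// alert to its end state. An alert left in "Ground" is forwarded to a Sigaliser.
function nextState(alert, currentState, transitions) {
    var ordered = transitions.slice().sort(function (a, b) {
        return b.priority - a.priority;   // higher priority wins
    });
    for (var i = 0; i < ordered.length; i++) {
        var t = ordered[i];
        if (t.start === currentState && t.matches(alert)) {
            return t.end;
        }
    }
    return currentState;                  // no match: alert stays where it is
}
```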
Alert Rules Engine Examples
The Alert Rules Engine can be set up to process Link Up-Link Down events. It can also be set up to act as a Heartbeat Monitor.
This example demonstrates how to configure the Alert Rules Engine so that when a Link Down alert arrives at Moogfarmd, the Alert Rules Engine holds it for a period of time to provide an opportunity for the Link Up alert to arrive. If nothing arrives, the Alert Rules Engine forwards it to a Sigaliser.
If the Link Up alert arrives, the Alert Rules Engine closes and discards both alerts without sending anything to the Sigaliser. This ensures that neither the Link Down nor the Link Up alert appear in Situations.
To try out this example, set up the following:
1. Create three Action States: 'Ground' (default), 'Link Up' and 'Link Down'.
2. Create two transitions: 'Link Down Transition' and 'Link Up Transition'.
In this scenario, if a 'Link Down' alert arrives at the Alert Rules Engine and no 'Link Up' alert arrives within 120 seconds, the 'Link Down' alert returns to the 'Ground' state and the Alert Rules Engine passes it to a Sigaliser.
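The hold-and-cancel behaviour can be sketched as follows. The function is a hypothetical illustration of the decision, with times in epoch seconds:

```javascript
// Hypothetical sketch of the Link Down hold window: a Link Down alert is held
// for holdSeconds; a matching Link Up inside the window discards both alerts,
// and an unmatched Link Down is released to the Sigaliser.
function resolveLinkDown(downTime, upTime, holdSeconds) {
    if (upTime !== null && upTime - downTime <= holdSeconds) {
        return "discard both";      // link recovered in time
    }
    return "forward to Sigaliser";  // hold expired, treat as a real outage
}
```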
You can configure the Alert Rules Engine Moolet in Cisco Crosswork Situation Manager to detect missing heartbeat events from monitoring tools such as CA Spectrum and Microsoft SCOM. Both of these tools send regular heartbeats to indicate normal operation.
After you configure the Alert Rules Engine, Cisco Crosswork Situation Manager creates a Situation when an event source does not send a heartbeat within a given time period. The Alert Rules Engine holds each heartbeat alert for a period of time; subsequent alerts from the same heartbeat source reset the timer. If the timer expires, a heartbeat has been missed and the alert is forwarded to a Sigaliser (clustering algorithm).
Before You Begin
Before you set up the heartbeat monitor in Alert Rules Engine, ensure you have met the following requirements:
· You have an understanding of the Alert Rules Engine, Action States and transitions. See the Alert Rules Engine Moolet, Action States and Transitions for further details.
· You can identify heartbeat alerts in the integration by description, class or another configurable field. These must be specific, regular events that arrive at consistent intervals to indicate normal operation. If these are not available, the Heartbeat Monitor will not work.
· You have edited the alerts so they contain the same attribute, via the integration source or through enrichment. In the example below, 'type' is 'heartbeat' in the Alert Rules Engine trigger filter and 'class' is 'heartbeat' in the Cookbook Recipe trigger filter.
Create a Heartbeat Monitor
To create a heartbeat monitor in Alert Rules Engine, follow these steps:
1. Edit $MOOGSOFT_HOME/bots/moobots/AlertRulesEngine.js and add the heartBeatSeverity exit action. This function changes the alert severity to critical and ensures alerts that are closed are not forwarded to the Cookbook. See Status ID Reference for a list of status IDs.
function heartBeatSeverity(alert,associated) {
var currentAlert = moogdb.getAlert(alert.value("alert_id"));
if ( currentAlert && currentAlert.value("state") !== 9 ) {
alert.set("severity",5);
var alertDescr = currentAlert.value("description");
// Update the description to "MISSED", a successful heartbeat will reset this.
if ( !/^MISSED/i.test(alertDescr) ) {
alert.set("description", "MISSED: " + alertDescr);
}
moogdb.updateAlert(alert);
currentAlert.forward("HeartbeatCookBook");
}
}
2. Navigate to Settings > Action States in the Cisco Crosswork Situation Manager UI. Create a new Action State called "Heartbeat" as follows:
Setting Name |
Input |
Value |
Name |
String |
Heartbeat |
Remember alerts for |
Integer (seconds) |
30 * |
Cascade on expiry |
Boolean |
True |
Exit Action |
String |
heartBeatSeverity |
Warning
* The Remember alerts for setting is the timer. Set this to two or three times your heartbeat interval time.
3. Go to Settings > Transitions in the Cisco Crosswork Situation Manager UI. Set up a transition to move your heartbeat alerts to the 'Heartbeat' State. Configure the settings as follows:
Setting Name |
Value |
Name |
Heartbeat |
Priority |
10 |
Active |
True |
Trigger Filter |
(type = "heartbeat") AND ((((agent = "SPECTRUM") OR (manager= "SCOM")) OR (agent = "MONITOR1")) OR (agent = "MONITOR2")) |
Start State |
Ground |
End State |
Heartbeat |
Edit the 'Trigger Filter' to meet your requirements. In this example, the transition is triggered by alerts with the type of 'heartbeat' and that come from either 'SPECTRUM' or 'SCOM' or 'MONITOR1' or 'MONITOR2':
4. Ensure Alert Rules Engine is enabled. To do this, edit the $MOOGSOFT_HOME/config/moolets/alert_rules_engine.conf file and set run_on_startup to true.
5. Create a heartbeat.conf configuration file in $MOOGSOFT_HOME/config/moolets to add a Heartbeat Cookbook for heartbeat alerts. This Cookbook only processes these alerts:
{
# Moolet
name:"HeartbeatCookBook",
classname:"CCookbook",
run_on_startup:true,
metric_path_moolet : true,
moobot:"Cookbook.js",
process_output_of:"[]",
# Algorithm
membership_limit:5,
scale_by_severity:false,
entropy_threshold:0.0,
single_recipe_matching:false,
recipes:[
# Any heartbeat class for the same agent.
{
chef:"CValueRecipe",
name:"ScomHeartbeatErrors",
description:"SCOM Heartbeat: Missing heartbeat",
recipe_alert_threshold:0,
exclusion:"state = 9",
trigger:"class = 'heartbeat' AND agent = 'SCOM'",
rate:0,
# Given in events per minute
min_sample_size:5,
max_sample_size:10,
matcher:{
components:[
{
name:"agent",
similarity:1.0
}
]
}
},
{
chef:"CValueRecipe",
name:"ScomHeartbeatChange",
description:"SCOM Heartbeat: Cluster host change",
recipe_alert_threshold:0,
exclusion:"state = 9",
trigger:"class = 'heartbeatRoleChange' AND agent = 'SCOM'",
rate:0,
# Given in events per minute
min_sample_size:5,
max_sample_size:10,
matcher:{
components:[
{
name:"agent",
similarity:1.0
}
]
}
}
],
cook_for:20000
}
Save heartbeat.conf.
6. Edit the Moogfarmd configuration file $MOOGSOFT_HOME/config/moog_farmd.conf to add a new merge group that references the HeartBeatCookbook Moolet. Configure this merge group to have an alert_threshold of 1 to allow a single alert to create a Situation (by default, a minimum of 2 alerts are required to create a Situation):
merge_groups:
[
{
name: "Heartbeat",
moolets: ["HeartbeatCookBook"],
alert_threshold : 1,
sig_similarity_limit : 1
}
],
7. Include the Moolet configuration by adding the following in $MOOGSOFT_HOME/config/moog_farmd.conf:
{
include : "heartbeat.conf"
},
Save the changes to moog_farmd.conf.
8. Restart Moogfarmd:
service moogfarmd restart
After the heartbeat monitor configuration is complete, heartbeat alerts should start to arrive in Cisco Crosswork Situation Manager.
Heartbeat Monitor Process
The process flow for a heartbeat alert is as follows:
· Heartbeat alert arrives at the Alert Rules Engine.
· The alert is transitioned from 'Ground' to 'Heartbeat' action state and starts the timer.
· The alert sits in the 'Heartbeat' state waiting for the timer to expire.
· Any subsequent heartbeat alert resets the timer.
· If the timer expires, the exit action changes the alert severity to '5' (critical) and cascades it to the 'Ground' state.
· Any subsequent heartbeat updates the severity to '0' (clear) and restarts the timer.
· You could also add an entry action to close any missed heartbeat situations the event is part of.
This example also updates the alerts with the times of the missing heartbeats for an easy audit trail.
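The timer-reset behaviour above can be sketched with per-source timers keyed by agent. This is a hypothetical model for illustration, not the Moolet implementation:

```javascript
// Hypothetical sketch of the heartbeat timer: each heartbeat records its
// source's last-seen time (epoch seconds); a source is "missed" once no
// heartbeat has arrived within the window (e.g. 2-3x the heartbeat interval).
function missedSources(lastSeen, now, windowSeconds) {
    var missed = [];
    for (var source in lastSeen) {
        if (now - lastSeen[source] > windowSeconds) {
            missed.push(source);    // the exit action would set severity 5 here
        }
    }
    return missed;
}
```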
The status of alerts and Situations is determined by their status ID. These statuses are used within the Heartbeat Monitor.
The different status_id values are as follows:
Status ID |
Name |
1 |
Opened |
2 |
Unassigned |
3 |
Assigned |
4 |
Acknowledged |
5 |
Unacknowledged |
6 |
Active |
7 |
Dormant |
8 |
Resolved |
9 |
Closed |
10 |
SLA Exceeded |
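These IDs can be represented as a simple lookup, for example to express the state !== 9 (Closed) check used in the heartBeatSeverity exit action. The map below just restates the table; the helper name is illustrative:

```javascript
// Status ID lookup matching the table above.
var STATUS_NAMES = {
    1: "Opened", 2: "Unassigned", 3: "Assigned", 4: "Acknowledged",
    5: "Unacknowledged", 6: "Active", 7: "Dormant", 8: "Resolved",
    9: "Closed", 10: "SLA Exceeded"
};

// e.g. skip Closed alerts, as the heartBeatSeverity exit action does.
function isClosed(statusId) {
    return STATUS_NAMES[statusId] === "Closed";
}
```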
This is a reference for the Alert Rules Engine Moolet.
Cisco recommends you do not change any properties that are not in this reference guide.
You can change the behavior of the Alert Rules Engine by editing the configuration properties in the $MOOGSOFT_HOME/config/moolets/alert_rules_engine.conf configuration file. It contains the following properties:
name
Name of the Alert Rules Engine Moolet. Do not change.
Type: String
Required: Yes
Default: "AlertRulesEngine"
classname
Moolet class name. Do not change.
Type: String
Required: Yes
Default: "CAlertRulesEngine"
run_on_startup
Determines whether the Alert Rules Engine runs when Cisco Crosswork Situation Manager starts. By default, it is set to false, so it does not start when Moogfarmd starts. You can change this property to true so that, when Moogfarmd starts, it automatically creates an instance of the Alert Rules Engine.
Type: Boolean
Required: Yes
Default: false
metric_path_moolet
Determines whether or not Cisco Crosswork Situation Manager includes the Alert Rules Engine in the Event Processing metric for Self Monitoring.
Type: Boolean
Required: Yes
Default: true
moobot
Specifies a JavaScript file found in $MOOGSOFT_HOME/moobots, which defines the Alert Rules Engine Moobot. The default, AlertRulesEngine.js, provides the standard modules. You can customize it to meet your needs.
Type: String
Required: Yes
Default: "AlertRulesEngine.js"
mooms_event_handler
Determines whether or not the Alert Rules Engine listens for messages on the message bus. If set to true, the Alert Rules Engine processes messages on the Alerts topic on the message bus. This property should not be included in the configuration file, or should be commented out, if the process_output_of property is defined.
Type: Boolean
Required: No
Default: false
process_output_of
Defines the input source for the Alert Rules Engine. This determines the Alert Rules Engine's place in the alert processing workflow. If this property is defined, the mooms_event_handler property should be omitted or commented out in the configuration file.
Type: List
Required: No
One of: AlertBuilder, MaintenanceWindowManager, Enricher
Default: "MaintenanceWindowManager"
The Empty Moolet enables Cisco Crosswork Situation Manager integrators to intercept and handle Message Bus events without impacting the existing alert flow logic and processing. This provides a mechanism for you to implement your own alert processing rules. The Empty Moolet can also be used for general augmentation of alert and Situation details, for example, enrichment.
An Empty Moolet can be passed an alert or a Situation by one of the following mechanisms:
· Process output of: The Empty Moolet exists in the alert processing chain.
· Event handler: The Empty Moolet listens for specific message types on the bus.
· Direct forwarding: The Empty Moolet is handed an object by another Moolet, for example, Moolet A forwards an alert to Moolet B.
A single Empty Moolet uses one or more of these mechanisms.
Configure Empty Moolet
The Empty Moolet takes messages off the Message Bus according to message type and passes them to a Moolet. The configuration includes the message types to register interest for and the name of the Moolet to pass them to. For example, to integrate with an incident management system such as ACMEIncidentManager, the Empty Moolet must:
· Listen to NewThreadEntry events (the Message Bus topic /sig/thread/entry/new) and SigStatus events (the Message Bus topic /sigs/status).
· Interrogate the events to select only those in which the incident management system has registered an interest via the Graze API addSigCorrelationInfo request.
· Filter out those events which were originated by the incident management system via the Graze API to avoid sending duplicate information.
· Extract relevant information from the event including the incident management system entity reference.
· Send the information to the incident management system via the REST.V2 Moobot module, which supports sending simple RESTful POST requests using basic HTTP authentication.
The following example demonstrates an Empty Moolet configuration for this scenario:
{
name : "ACMEIncidentManager",
classname : "CEmptyMoolet",
run_on_startup : true,
moobot : "ACMEIntegration.js",
event_handlers : [
"NewThreadEntry",
"SigStatus"
]
}
This example shows one way of integrating Cisco Crosswork Situation Manager with another system. Each integration is dependent upon the individual use cases and systems being integrated.
See Alert Manager for a further example of an Empty Moolet configuration.
Note:
Not all event handlers are required for every integration. Only specify required handlers.
Customize Empty Moolet
To invoke custom JavaScript for a particular set of actions related to Situations, you can leverage the Empty Moolet to listen for these actions and use the data within the Situations involved. For example, when a Situation is closed you may want to notify an external entity via the REST.V2 module.
Edit the configuration file moog_farmd.conf to associate the CustomTaskRunner Moobot with the Empty Moolet, and listen for SigAction events:
{
name : "CustomTaskRunner",
classname : "CEmptyMoolet",
run_on_startup : true,
metric_path_moolet : false,
moobot : "CustomTaskRunner.js",
event_handlers : [
"SigAction"
]
}
This is an example of Moobot code that runs a function when a supported Situation action occurs in Cisco Crosswork Situation Manager:
CustomTaskRunner.js
var events = MooBot.loadModule('Events');
var logger = MooBot.loadModule('Logger');
var constants = MooBot.loadModule('Constants');
logger.debug("Empty Moolet Started.");
var urlToolName = "<your Situation URL tool name>"; // must match the name of the Situation URL tool
/**
* ### situationAction
*
* Listen for specific "sigAction"
*
* @param {object} situation - A situation object from Events
*/
function situationAction(situation) {
logger.warning("Checking Action event...");
var sitn_id = situation.value("situation_id");
var action = situation.payload().valueOf("action");
if (action !== null) {
var details = situation.getActionDetails();
// The name of the URL Tool has to match to trigger action
if (action == "Ran Tool") {
if (details.tool == urlToolName) {
runFunction(sitn_id);
}
}
}
}
/**
* ### runFunction
*
* Run some function
*
* @param {number} sitn_id - The Situation Id
*/
function runFunction(sitn_id) {
logger.info('Run some function for Situation Id ' + sitn_id);
}
//
// Listen for SigAction event to see if certain URL tool has been run
//
events.onEvent("situationAction",constants.eventType("SigAction")).listen();
The urlToolName must match the name of the Situation URL tool. The Situation ID is available in the event payload, because the tool is run in the context of a particular Situation.
The Alert Manager uses the Empty Moolet to enable Cisco Crosswork Situation Manager administrators or implementers to incorporate additional alert processing not handled by the Alert Builder, Maintenance Window Manager or Alert Rules Engine. You can use the Alert Manager in standalone mode or as part of the alert processing workflow.
Configure Alert Manager
Edit the configuration file at $MOOGSOFT_HOME/config/moolets/alert_manager.conf.
See Empty Moolet Reference (/document/preview/11749#UUIDb629fde3285184a3f889b16159903705) for a full description of all properties. Some properties in the file are commented out by default.
You can use the following mechanisms to determine Alert Manager behavior:
· If standalone_moolet = true: The Alert Manager picks up alerts, specified by the event_handlers property, on the Message Bus and processes them.
· If you set process_output_of to Maintenance Window Manager or Alert Rules Engine: The Alert Manager uses the output of that component.
Example Configuration
The default configuration file contains an example implementation of the Empty Moolet functionality in the form of the Alert Manager Moolet. For example:
{
    name : "AlertMgr",
    classname : "CEmptyMoolet",
    run_on_startup : false,
    metric_path_moolet : false,
    moobot : "AlertMgr.js",
    standalone_moolet : true,
    # Listens for alert events (on the /alerts topics)
    event_handlers : [
        "AlertClose",
        "AlertUpdate",
        "Alert"
    ]
}
Alert Manager Moobot
Cisco provides a Moobot for the Alert Manager Moolet named AlertMgr.js. An example use case for this Moolet is to enable a specific action on different alert types: for example, updating a Situation's services when an updated alert contains certain attributes.
Empty Moolet
For further information on customizing Cisco Crosswork Situation Manager using the Empty Moolet, see Empty Moolet.
The Empty Moolet enables Cisco Crosswork Situation Manager integrators to intercept and handle Message Bus events without impacting the existing alert flow logic and processing. This provides a mechanism for you to implement your own alert processing rules. You can also use the Empty Moolet to provide general augmentation of alert and Situation details, for example, enrichment.
An Empty Moolet can be passed an alert or a Situation by one of the following mechanisms:
· Process output of: The Empty Moolet exists in the alert processing chain.
· Event handler: The Empty Moolet listens for specific message types on the bus.
· Direct forwarding: The Empty Moolet is handed an object by another Moolet, for example, Moolet A forwards an alert to Moolet B.
A single Empty Moolet uses one or more of these mechanisms.
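As a sketch, a single Empty Moolet definition can combine the first two mechanisms. The Moolet and Moobot names below are illustrative, and the property names are the ones used in the examples elsewhere in this document:
{
    name : "MyEmptyMoolet",
    classname : "CEmptyMoolet",
    run_on_startup : true,
    moobot : "MyEmptyMoolet.js",
    # Chained after the Alert Builder, and also listening on the bus
    process_output_of : "AlertBuilder",
    event_handlers : [
        "AlertClose"
    ]
}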
Configure Empty Moolet
The Empty Moolet takes messages off the Message Bus according to message type and passes them to a Moolet. The configuration includes the message types to register interest in and the name of the Moolet to pass them to. For example, to integrate with an incident management system such as ACMEIncidentManager, the Empty Moolet must:
· Listen to NewThreadEntry events (the /sig/thread/entry/new topic on the Message Bus) and SigStatus events (the /sigs/status topic).
· Interrogate the events to select only those in which the incident management system has registered an interest via the Graze API addSigCorrelationInfo request.
· Filter out those events which were originated by the incident management system via the Graze API to avoid sending duplicate information.
· Extract relevant information from the event including the incident management system entity reference.
· Send the information to the incident management system via the REST.V2 Moobot module, which supports sending simple RESTful POST requests using basic HTTP authentication.
The following example demonstrates an Empty Moolet configuration for this scenario:
{
    name : "ACMEIncidentManager",
    classname : "CEmptyMoolet",
    run_on_startup : true,
    moobot : "ACMEIntegration.js",
    event_handlers : [
        "NewThreadEntry",
        "SigStatus"
    ]
}
This example shows one way of integrating Cisco Crosswork Situation Manager with another system. Each integration is dependent upon the individual use cases and systems being integrated.
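The selection and filtering steps in this scenario can be sketched as a pure JavaScript function, outside any Moobot, so the logic is easy to follow and test. The field names (correlationServices, source) and the "ACME" identifier are illustrative assumptions, not the actual event payload schema:

```javascript
// Decide whether an event should be forwarded to the incident manager.
// Forward only if ACME registered interest (via addSigCorrelationInfo),
// and the event did not originate from ACME itself, to avoid loops.
function shouldForward(event) {
  var interested = (event.correlationServices || []).indexOf("ACME") !== -1;
  var fromAcme = event.source === "ACME";
  return interested && !fromAcme;
}

shouldForward({ correlationServices: ["ACME"], source: "operator" }); // true
shouldForward({ correlationServices: ["ACME"], source: "ACME" });     // false
shouldForward({ correlationServices: [], source: "operator" });       // false
```

In a real Moobot the result of this check would gate the REST.V2 call to the incident management system.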
See Alert Manager for a further example of an Empty Moolet configuration.
Note:
Not all event handlers are required for every integration. Only specify required handlers.
Customize Empty Moolet
To invoke custom JavaScript for a particular set of actions related to Situations, you can use the Empty Moolet to listen for these actions and use the data within the Situations involved. For example, when a Situation is closed you may want to notify an external entity via the REST.V2 module.
Edit the configuration file moog_farmd.conf to associate the CustomTaskRunner Moobot with the Empty Moolet, and listen for SigAction events:
{
    name : "CustomTaskRunner",
    classname : "CEmptyMoolet",
    run_on_startup : true,
    metric_path_moolet : false,
    moobot : "CustomTaskRunner.js",
    event_handlers : [
        "SigAction"
    ]
}
This is an example of Moobot code that runs a function when a supported Situation action occurs in Cisco Crosswork Situation Manager:
CustomTaskRunner.js
var events = MooBot.loadModule('Events');
var logger = MooBot.loadModule('Logger');
var constants = MooBot.loadModule('Constants');

// Must match the name of the Situation URL tool; "My URL Tool" is an illustrative name
var urlToolName = "My URL Tool";

logger.debug("Empty Moolet Started.");

/**
 * ### situationAction
 *
 * Listen for specific "sigAction"
 *
 * @param {object} situation - A situation object from Events
 */
function situationAction(situation) {
    logger.warning("Checking Action event...");
    var sitn_id = situation.value("situation_id");
    var action = situation.payload().valueOf("action");
    if (action !== null) {
        var details = situation.getActionDetails();
        // The name of the URL Tool has to match to trigger the action
        if (action == "Ran Tool") {
            if (details.tool == urlToolName) {
                runFunction(sitn_id);
            }
        }
    }
}

/**
 * ### runFunction
 *
 * Run some function
 *
 * @param {number} sitn_id - The Situation Id
 */
function runFunction(sitn_id) {
    logger.info('Run some function for Situation Id ' + sitn_id);
}

//
// Listen for SigAction events to see if a certain URL tool has been run
//
events.onEvent("situationAction", constants.eventType("SigAction")).listen();
The urlToolName must match the name of the Situation URL tool. The Situation ID is available in the event payload, because the tool is run in the context of a particular Situation.
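The matching logic in the example above can be factored into a pure function and exercised outside a Moobot; the tool name here is illustrative:

```javascript
// Must match the name of the Situation URL tool; illustrative value.
var urlToolName = "My URL Tool";

// Returns true only when the action is "Ran Tool" and the tool name matches.
function shouldRun(action, details) {
  return action === "Ran Tool" && details.tool === urlToolName;
}

shouldRun("Ran Tool", { tool: "My URL Tool" }); // true
shouldRun("Ran Tool", { tool: "Other Tool" });  // false
```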
Situations in Cisco Crosswork Situation Manager are built from data ingested from your monitoring systems. You may have use cases for your Situations that require more information than is contained in the raw data. If this is the case, you can use a process called enrichment to add supplemental data to alerts or Situations. Enrichment can:
· Improve accuracy for clustering alerts into Situations.
· Improve readability of alerts for operators.
· Aid operators in investigating Situations.
· Provide critical reporting data.
The first step is to identify whether your existing data is sufficient. If it is lacking, identify the type of enrichment data that meets your requirements and the data source that can provide it. You can then choose the most effective and efficient method of enrichment for your specific needs.
The need to enrich depends on whether the data from your data source or monitoring system fulfils your requirements. Examine the use cases for your data to identify any omissions.
For example, an organization sets up Cisco Crosswork Situation Manager to ingest the following event data:
"node_name" : "U0039-router01"
"description" : "Router down"
The data must fulfil these use cases:
1. Operators need the site name to understand where they need to take action to fix the problem.
2. Management needs the region for reporting requirements.
For this company, node names follow the convention <site>-<component>, so "U0039" identifies the site. There is no need to enrich for the first use case.
The site name is not enough to determine the region, and the event data does not include region data. To satisfy the second use case, the company needs to enrich the event data.
The purpose of the enrichment indicates whether to enrich at alert or Situation level. Enrichment is expensive in terms of processing time and resource use. Inefficient enrichment can slow the processing of alerts, so it is important to enrich at the appropriate level.
Enrichment data can be broadly categorized to fulfil one of the following purposes:
· Operational: Functionally modifies behavior within Cisco Crosswork Situation Manager to drive processes such as clustering. Ideally performed on alert creation.
· Informational: Assists a consumer (operator or external system) to differentiate between Situations. Typically performed at Situation level. Includes updates to Situation description, services and processes.
· Diagnostic: Assists operators to investigate Situations and can be performed at either alert or Situation level. Examples include updates to alert and Situation custom_info and updates to Situation discussion threads.
The region data in our example is informational.
If the required data exists externally, identify its type:
· Static: Data that changes infrequently, for example a country code lookup to a country name.
· Dynamic: Data that may be subject to change, for example a database query to match a hostname to a service.
In our example, the company database stores the site number and relates it to the site address and region. The data is static:
site | address | city | state | region
U0039 | 1265 Battery St | San Francisco | CA | US-WEST
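The lookup this company needs can be sketched as a small static map from site to region, in plain JavaScript rather than Moobot code. The function name and map contents are illustrative, taken from the example row above:

```javascript
// Static site-to-region lookup built from the company database extract.
var siteToRegion = {
  "U0039": "US-WEST"
};

// Node names follow <site>-<component>, so the site is the first segment.
function regionForNode(nodeName) {
  var site = nodeName.split("-")[0];
  return siteToRegion[site] || null;
}

regionForNode("U0039-router01"); // "US-WEST"
```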
Dynamic enrichment on every de-duplication has a greater performance impact than enrichment on alert creation. If the enrichment data is unlikely to change during the lifetime of an alert, enrich once on alert creation. See Enrich on Alert Creation for more details.
You can enrich from a static file in a LAMbot. All other enrichment is performed in a Moobot.
Some enrichment methods are available in the UI:
· UI Enrichment (/document/preview/24302#UUIDac981c90c4fe0f1501ee8f29b1fc0d28) using a static data file.
· Link Definitions (see Create Link Definitions).
Other methods are manually configured or accessed via the command line. The most common are:
· The REST.V2 module (/document/preview/11822#UUIDfa98e34fe3aa2a239dd4c23084dc9bff) to retrieve data through HTTP.
· The ExternalDb module (/document/preview/11813#UUID20e2a0d16a5dbf0903916ee39987fb94) to retrieve data from supported SQL databases.
· The Graze API (/document/preview/11801#UUIDbbd065f08f0e5c57e94aa0e16ab2524b) to update Situations and alerts statically.
· The Situation Manager Labeler (/document/preview/11765#UUIDc4af40424b534331ee84a3540f81f44e) to update Situations and alerts dynamically.
In our example, depending on the database specification, the company might use JDBC to add the region data into alert custom_info and the Situation Manager Labeler to add the region data to Situations.
If your enrichment data is unlikely to change during the lifetime of an alert, enrich once on alert creation.
To enrich on alert creation:
· Create a custom alert enricher Moolet.
· Configure your alert enricher to use caching.
· Configure the Alert Builder to send data to your custom Moolet on alert creation.
· Define your custom Moolet in Moogfarmd.
See Enrichment for more information on enrichment methods and processes.
Create an Alert Enricher Moobot
Create an Alert Enricher Moobot to obtain enrichment data from your external source, for example via JDBC.
Use Caching
The Bot utility is included with the Situation Manager Labeler, available from Cisco Support. See Install the Situation Manager Labeler (/document/preview/11766#UUID0c0fc0065899be370748b7c208a496ad) for more information.
You can configure your Alert Enricher Moobot to use the caching facilities in the Bot utility. This is optional but good practice if the data is relatively static. It reduces the time required to repeatedly process data from a third party system. For example:
// Caching is optional; set USE_CACHE to true to enable it.
var USE_CACHE = false;
var CMDB_CACHE_RETENTION = 3600;

// cmdb_cache_exists, host and ci_enrichment are assumed to have been
// populated earlier in the Moobot (for example, from a cache lookup
// and a CMDB query).
if (USE_CACHE && cmdb_cache_exists && cmdb_cache_exists.enrichment)
{
    customInfo.enrichment = cmdb_cache_exists.enrichment;
}
else
{
    botUtil.addObject(customInfo, "enrichment", ci_enrichment, false);
    var cmdb_cache = {};
    cmdb_cache.enrichment = customInfo.enrichment;
    botUtil.setCacheValue(botModules.constants, "CMDB" + host, cmdb_cache, CMDB_CACHE_RETENTION);
}
Configure the Alert Builder
In the following example the Alert Builder sends newly created alerts to the Alert Enricher Moolet and updated alerts to the Maintenance Window Manager:
if (alert)
{
    var alertAction = alert.payload().getAction() === "Alert Created" ? "create" : "update";
    if (alertAction === "create") {
        logger.info("createAlert: Created Alert Id: " + alert.value("alert_id"));
        alert.forward("AlertEnricher");
    }
    else {
        logger.info("createAlert: Updated Alert Id: " + alert.value("alert_id"));
        alert.forward("MaintenanceWindowManager");
    }
}
Configure Moogfarmd
Define the Alert Enricher Moolet in Moogfarmd. For example:
{
    name : "AlertEnricher",
    classname : "CEmptyMoolet",
    run_on_startup : true,
    persist_state : true,
    metric_path_moolet : false,
    moobot : "AlertEnricher.js",
    standalone_moolet : true,
    threads : 5
}
Enable the Enricher Moolet to update alerts with enrichment data. See UI Enrichment for further information.
Configure Enricher
You can define the behavior of the Enricher Moolet by editing the $MOOGSOFT_HOME/config/moolets/enricher.conf configuration file.
Enricher Parameters
The parameters that relate to the Enricher Moolet are as follows:
run_on_startup
Determines whether Enricher runs when Cisco Crosswork Situation Manager starts. If enabled, Enricher updates alerts with enrichment data from the moment the system starts, without you having to configure or start it manually.
Type: Boolean
Default: false
metric_path_moolet
Determines whether or not Enricher is included in the Event Processing metric for Self Monitoring.
Type: Boolean
Default: false
description
Describes the Moolet.
Type: String
Default: Alert Enrichment
The default Enricher parameters are as follows:
{
    name : "Enricher",
    classname : "com.moogsoft.farmd.moolet.enricher.CEnricherMgr",
    run_on_startup : true,
    persist_state : false,
    metric_path_moolet : true,
    process_output_of : "AlertBuilder",
    description : "Alert Enrichment"
}
Note:
name and classname are hardcoded and should not be changed.
Output Parameters
These parameters control the output processed by the Moolet:
process_output_of
Defines the source of the alerts that Enricher processes. By default, the Moolet connects directly to the Alert Builder.
Type: List
One of: AlertBuilder, AlertRulesEngine
Default: AlertBuilder
Moogfarmd controls all other services in Cisco Crosswork Situation Manager and manages which algorithms and Moolets are running.
We advise that you start Moogfarmd as a service. A service script is provided out of the box for the default Moogfarmd configuration and is located here:
/etc/init.d/moogfarmd
A backup Moogfarmd service script is located at $MOOGSOFT_HOME/etc/service-wrappers/moogfarmd.
If using multiple instances of Moogfarmd on the same host, we advise that you copy and modify the default Moogfarmd service script for each Moogfarmd running on the host.
Moogfarmd is a command line executable that can be run as a service daemon.
To execute the daemon and view available arguments run:
moog_farmd --help
By default, you do not need either --config or --instance. If you run the system without either of these, the Moogfarmd instance loads the default Moogfarmd configuration file and responds to farmd_ctrl with no instance specified. See High Availability Overview for more information on High Availability.
The moog_farmd command line executable accepts the following arguments:
Option | Input | Description
--clear_state | - | Clears any persisted state information associated with Moogfarmd on startup.
--cluster <arg> | String: <cluster name> | Name of the High Availability (HA) cluster. Overrides the value in the configuration file.
--config <arg> | String: <file path/name> | Name and path of the configuration file specific to the running Moogfarmd instance.
--group <arg> | String: <group name> | Name of the HA group. Overrides the value in the configuration file.
-h, --help | - | Displays all command line options.
--instance <arg> | String: <instance name> | Names the Moogfarmd instance. You can refer to this name in the farmd_ctrl utility, which allows you to start, restart and reload the various Moolets.
--logconsole | - | Instructs Moogfarmd to write logs to the console only.
--logfilename <arg> | String: <file path/name> | Name and path of the Moogfarmd log file.
-l, --loglevel <arg> | INFO|WARN|ALL | Specifies the debug level. Defaults to WARN, which is the recommended level in all production implementations.
--mode <arg> | String: active|passive | Starts the process in passive or active mode. The default is active.
--service_instance <arg> | String: <service suffix> | Suffix for the service name.
-v, --version | - | Displays the Moogfarmd version number.
You can control Moogfarmd behavior through the following files:
· system.conf: the general Cisco Crosswork Situation Manager system configuration file is located in $MOOGSOFT_HOME/config/system.conf. See System Configuration.
· moog_farmd.conf: configuration specific to Moogfarmd operation. If you run multiple instances of Moogfarmd, each needs its own configuration file. All instances of Moogfarmd that do not specify a different --config use the default configuration file located at $MOOGSOFT_HOME/config/moog_farmd.conf.
Moogfarmd runs individual isolated applications called Moolets inside the Moogfarmd app container. Moolets are a parallel concept to servlets in a traditional enterprise application container such as Tomcat. Moogfarmd controls the flow of data through the Moolets where the data can come via the Message Bus or from other Moolets.
You can configure the following properties in the Moogfarmd configuration files:
This setting serves two purposes related to the Configuration Migration Utility introduced in release 7.3:
· Setting the property to true before upgrading prevents the configuration migration utility from running.
· Setting the property to true after upgrading causes Cisco Crosswork Situation Manager to ignore all database configurations for Cookbook and Tempus clustering algorithms, as well as merge groups, and to load only their file configurations instead.
Type | Boolean
Required | Yes
Default | false
Global number of Moobots (threads) per Moolet. Override this setting by using the threads property in individual Moolet configurations.
Type | Integer
Required | Yes
Default | 10
Note:
Do not change this setting unless instructed by Cisco Support.
Specifies the number of database connections for Moogfarmd independently of the number of threads.
Type | Integer
Required | Yes
Default | 30
Note:
Do not change this setting unless instructed by Cisco Support.
The optimization level to use for Moobots.
Type | Integer
Required | Yes
One of | -1: Moobots are interpreted. 0-9: Moobots are precompiled, where 0 is minimal optimization and 9 is maximum optimization. See the Mozilla optimization documentation for more information.
Default | 0
The maximum allowed number of concurrent asynchronous REST tasks. Increasing the value consumes more system resources.
Type | Integer
Required | Yes
Default | 200
The maximum number of Message Bus messages to store in memory.
Type | Integer
Required | Yes
Default | 0 (unlimited)
Note:
If you reduce this value, message data may be lost.
The maximum number of messages from each Moolet to store in memory. You can overwrite this setting in individual Moolet configurations.
Type | Integer
Required | Yes
Default | 0 (unlimited)
Note:
If you reduce this value, message data may be lost.
Section of the file containing properties related to Situation resolution.
Type | Object
Required | Yes
Default | N/A
The minimum number of alerts that must be present in a cluster before it can become a Situation.
Type | Integer
Required | Yes
Default | 2
The percentage of alerts two Situations must share before they are merged.
Type | Number
Required | Yes
Default | 0.7 (70%)
Length of time in seconds to keep unchanged closed/superseded Situations in memory.
Type | Integer
Required | Yes
Default | 86400 (1 day)
The moog_farmd.conf file includes settings you can use to specify the cluster, group, and instance for an HA configuration hierarchy. See High Availability Configuration Hierarchy.
Note:
Do not change any other HA settings in this file unless instructed by Cisco Support.
A Moolet is an intelligence module that performs specific services in Cisco Crosswork Situation Manager. Some Moolets have an accompanying Moobot, a JavaScript file that controls or customises Moolet behavior.
Events can trigger a Moolet in Moogfarmd as follows:
· process_output_of: The Moolet listens for events from another named Moolet. You can use this method to chain Moolets together to form an automated workflow pipeline.
· mooms_event_handler: The Moolet listens for events on the Message Bus, for example actions triggered by a user or within another instance of moogfarmd.
· standalone_moolet: The Moolet listens for events generated by other Moolets within the same Moogfarmd instance without being part of the same process chain.
· scheduler: A unique Moolet type that allows time based task execution.
Refer to the documentation on individual Moolets to learn about how to configure their behavior:
The Housekeeper Moolet performs the periodic background tasks required for the Auto Close feature and for moving data to the historic database. Refer to Configure Historic Data Retention and the Historic Data Utility Command Reference for information on configuring the retention of historic data.
The Housekeeper Moolet is also responsible for gathering statistics from the system, for example Team Insights.
To verify the Housekeeper Moolet is running, use the following command:
ha_cntl -v
Configure Housekeeper
You can define the behavior of the Housekeeper Moolet by editing the $MOOGSOFT_HOME/config/moolets/housekeeper.conf configuration file.
Moolets can support multiple Moobots. By configuring a Moolet to run multiple Moobots, you can customise the functions of the default Moobot, for example to customise the actions for the Workflow Engine. To do this, locate the moobot property in the Moolet configuration file and add a comma-separated list of the Moobots you want to run. See the Housekeeper parameters extract below and notice the default moobot property, which contains one Moobot: "Housekeeper.js".
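For example, to run a second, custom Moobot alongside the default Housekeeper Moobot, the moobot property could read as follows. MyCustomActions.js is a hypothetical file name, and this assumes the comma-separated list form described above:
moobot : "Housekeeper.js,MyCustomActions.js"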
Housekeeper Parameters
run_on_startup
Determines whether Housekeeper runs when Cisco Crosswork Situation Manager starts. If enabled, Housekeeper performs its background tasks from the moment the system starts, without you having to configure or start it manually.
Type: Boolean
Default: true
metric_path_moolet
Determines whether Housekeeper is factored into the event processing metric for Self Monitoring. See Moogfarmd Reference for more information.
Type: Boolean
Default: false
standalone_moolet
Determines whether the Housekeeper can listen for events generated by other Moolets within the same moogfarmd instance without being in a processing chain.
Type: Boolean
Default: true
The default Housekeeper parameters are as follows:
{
    name : "Housekeeper",
    classname : "com.moogsoft.farmd.moolet.housekeeper.CHousekeeper",
    run_on_startup : true,
    metric_path_moolet : false,
    standalone_moolet : true,
    moobot : "Housekeeper.js"
}
Note:
name and classname are hardcoded and should not be changed.
Introduction
The Notifier Moolet enables Cisco Crosswork Situation Manager to act on invite messages on the MooMS bus and optionally send an email.
For example, to send an email when a user is invited to a Situation via the UI, the Notifier Moolet must:
· Listen to Invite Events
· Interrogate the Events to identify Situation invitations
· Filter out Events for Situations we are not interested in notifying
· Extract relevant information from the Event including the Situation Id and User Id
· Send an email message containing a customized body to a recipient using the Mailer Moobot module
Moogfarmd Configuration
The Notifier Moolet is designed to take messages off the MooMS bus according to message type.
You can edit the Notifier in the $MOOGSOFT_HOME/config/moolets/notifier.conf configuration file.
Moolets can support multiple Moobots. By configuring a Moolet to run multiple Moobots, you can customise the functions of the default Moobot, for example to customise the actions for the Workflow Engine. To do this, locate the moobot property in the Moolet configuration file and add a comma-separated list of the Moobots you want to run. See the Notifier parameters extract below and notice the default moobot property, which contains one Moobot: "Notifier.js".
The default configuration is as follows:
{
    name : "Notifier",
    classname : "CNotifier",
    run_on_startup : false,
    metric_path_moolet : false,
    moobot : "Notifier.js"
}
You can schedule jobs at regular intervals by editing the $MOOGSOFT_HOME/config/moolets/scheduler.conf configuration file:
# The Scheduler is used to run scheduled jobs at regular
# intervals throughout the lifetime of moog_farmd. Only this
# moolet, which cannot subscribe to the MooMS bus and
# listen to events, is allowed to submit scheduled jobs.
#
# To start up successfully it must have the name and threads
# values set to "Scheduler" and 1 respectively.
{
    name : "Scheduler",
    classname : "CScheduler",
    run_on_startup : false,
    metric_path_moolet : false,
    moobot : "Scheduling.js",
    threads : 1
}
Moolets can support multiple Moobots. By configuring a Moolet to run multiple Moobots, you can customise the functions of the default Moobot, for example to customise the actions for the Workflow Engine. To do this, locate the moobot property in the Moolet configuration file and add a comma-separated list of the Moobots you want to run. See the Scheduler parameters extract above and notice the default moobot property, which contains one Moobot: "Scheduling.js".
To load the scheduler module:
var scheduler = MooBot.loadModule('Scheduler');
Run jobs using, for example:
// A job that fails and does not restart.
scheduler.scheduleJob(this, "knockOnce", 5, 5, false);
This calls a function named knockOnce in the same JavaScript file:
function knockOnce()
{
    logger.warning("Knock knock");
    throw new Error("Failed to knock.");
}
The scheduleJob method has two possible parameter sets:
· scheduleJob(this, functionName, start_delay, interval);
· scheduleJob(this, functionName, start_delay, interval, true | false);
Parameter | Description
first | Always this.
second | The name of the function to call to run the job.
third | The delay from starting Moogfarmd to the first run (in seconds).
fourth | The interval between runs (in seconds).
fifth | Determines whether the job runs again if it failed previously.
Scheduling Frequency
When executing multiple jobs, we recommend that you offset the potential workload, for example by staggering the initial runs of multiple jobs a few seconds apart or by scheduling jobs at slightly offset frequencies.
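The effect of staggering can be checked in plain JavaScript: two jobs on the same interval, with start delays offset by a few seconds, never fire at the same second. The delays, interval and horizon below are arbitrary examples:

```javascript
// Compute the run times (in seconds) of a job over a time horizon.
function runTimes(startDelay, interval, horizon) {
  var times = [];
  for (var t = startDelay; t <= horizon; t += interval) {
    times.push(t);
  }
  return times;
}

// Two jobs at a 60-second interval, started 10 seconds apart.
var jobA = runTimes(5, 60, 600);
var jobB = runTimes(15, 60, 600);
var clashes = jobA.filter(function (t) { return jobB.indexOf(t) !== -1; });
// clashes.length === 0: the staggered jobs never coincide
```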
Constraints
1. The Scheduler must be single threaded.
2. Only one Scheduler is allowed per Moogfarmd process.
3. It must be named Scheduler.
4. To use a Moobot module function as a scheduled job, bind the function locally; the scheduler cannot resolve function names from modules.
For example, in your Scheduler Moobot you might have:
MooBot.loadModule('AutoClose.js');
var autoClose = new AutoClose();

// Bind the module function locally.
var autoCloseAlertFunction = autoClose.closeAgedAlerts.bind(autoClose);

// Schedule execution
scheduler.scheduleJob(this, "autoCloseAlertFunction", 60, autoCloseAlertFrequency, true);
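The need for the bind step is ordinary JavaScript behaviour rather than anything scheduler-specific: a method passed by reference loses its this. A minimal illustration with a hypothetical stateful object:

```javascript
// A hypothetical module object with internal state.
var counter = {
  count: 0,
  increment: function () {
    this.count += 1;
    return this.count;
  }
};

// Passing the method by reference alone detaches it from counter;
// calling unbound() would not update counter.count.
var unbound = counter.increment;

// A bound reference keeps `this` pointing at counter, so any caller
// (such as the scheduler) can invoke it safely by reference.
var boundIncrement = counter.increment.bind(counter);
boundIncrement();
boundIncrement();
// counter.count is now 2
```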
The Situation Manager listens for Situation creation, update, or closure actions and passes the Situation to an associated Moobot. It runs as a standalone Moolet by default.
You can define the algorithms for which the Situation Manager processes output, the Moobot it passes Situations to and the actions performed on those Situations.
When a Moobot receives a Situation, you can configure it to perform functions such as data enrichment, auto-assignment to a user or notifying a third-party tool to raise a ticket.
Configure the Situation Manager
The Situation Manager configuration file is located here: $MOOGSOFT_HOME/config/moolets/situation_manager.conf. You can define the following properties:
Determines whether Situation Manager starts when Cisco Crosswork Situation Manager starts.
Type | Boolean
Required | No
Default | False
Determines whether Situation Manager is included in the events processing metric for Self Monitoring.
Type | Boolean
Required | No
Default | False
Determines which Moobot receives Situations from the Situation Manager. The Moobot JavaScript files are located here: $MOOGSOFT_HOME/bots/moobots.
Moolets can support multiple Moobots. By configuring a Moolet to run multiple Moobots, you can customise the functions of the default Moobot, for example to customise the actions for the Workflow Engine. To do this, locate the moobot property in the Moolet configuration file and add a comma-separated list of the Moobots you want to run. See the Situation Manager parameters extract below and notice the default moobot property, which contains one Moobot: "SituationMgrLabeller.js".
Type | String
Required | Yes
Default | SituationMgr.js
Determines whether Situation Manager runs as a standalone Moolet.
Type | Boolean
Required | No
Default | True
An example Situation Manager Moolet configuration is as follows:
{
    name : "SituationMgr",
    classname : "CSituationMgr",
    run_on_startup : true,
    metric_path_moolet : false,
    moobot : "SituationMgrLabeller.js",
    standalone_moolet : true
}
Note:
Do not change the name and classname properties.
Situation Manager listens for three event types by default:
· Sig: Situation creation.
· SigClose: Situation closure.
· SigUpdate: Situation update.
If you want to listen for other events, create an Empty Moolet and define the events in the event_handlers property. See the eventType method in Constants (/document/preview/11808#UUID6b4b0c79434fb6f809e6f469b83f2c03) for a full list of event types.
You can listen for specific Situation actions using the SigAction event. It can filter out the following actions:
· Assigned Moderator
· Deacknowledged Situation Moderator
· Described Situation
The Situation Manager can send Situations to one of three Moobots. You can customize these to meet your requirements:
· Situation Manager: the default associated Moobot. The configuration file is SituationManager.js. For information on how to configure it, see Moobot Modules (/document/preview/11806#UUID6bab0ad2c31ebc11baa54398f49507e9).
· Situation Manager Labeler: You can use the Situation Manager Labeler to enrich Situations by dynamically adding alert properties to the Situation description. The configuration file is SituationMgrLabeller.js. See Situation Manager Labeler for more information.
· Situation Manager Netcool: This Moobot is required for the Netcool legacy LAM. The configuration file is SituationMgrNetcool.js. For more information, see Netcool Legacy LAM (/document/preview/26848#UUID43300de7462c0adb67a8c4f25a62cc51).
Warning
Template Matcher was deprecated from the release of Cisco Crosswork Situation Manager V7.0.0. See Deprecation Notice: Template Matcher for more information.
Situation Templates enable users to provide feedback to Cisco Crosswork Situation Manager about the quality of the Situations produced and, consequently, what Cisco Crosswork Situation Manager should do if it detects a similar Situation in the future.
Moogfarmd
Activate the TemplateMatcher Moolet in Moogfarmd if you want your system to discover "Priority Templates". By default, Moogfarmd suppresses the creation of SPAM templates, regardless of the Moolet that generates them.
Moolets can support multiple Moobots. By configuring a Moolet to run multiple Moobots, you can customise the functions of the default Moobot, for example to customise the actions for the Workflow Engine. To do this, locate the moobot property in the Moolet configuration file and add a comma-separated list of the Moobots you want to run. See the Template Matcher parameters extract below and notice the default moobot property, which contains one Moobot: "TemplateMatcher.js".
There is no special configuration required for the Moolet; it is either on or off. TemplateMatcher should run after the Alert Builder or the Alert Rules Engine, depending upon the system configuration. The default configuration for the TemplateMatcher is shown below:
{
name : "TemplateMatcher",
classname : "CTemplateMatcher",
run_on_startup : false,
#process_output_of : "AlertRulesEngine",
process_output_of : "AlertBuilder",
threads : 4,
moobot : "TemplateMatcher.js"
}
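For example, to run a second, custom Moobot alongside the default, change the moobot property to an array (CustomMatcher.js here is a hypothetical file name used for illustration):

```
moobot : ["TemplateMatcher.js", "CustomMatcher.js"]
```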
Standalone Execution
The TemplateMatcher Moolet can be executed in "stand-alone" mode via the moog_primer/sigaliser utility using the following command:
sigaliser --moolet=TemplateMatcher
The Teams Manager Moolet is triggered by Cisco Crosswork Situation Manager when a Situation is created, updated, or deleted, and also when a team is created or updated. You can assign teams to Situations using the filters in the UI under Settings > Teams > General. If there are no filters for a team, it is assigned all new Situations by default.
If you use the "assignTeamsToSituation" Graze API endpoint or MoogDb method to assign teams to a Situation, Cisco Crosswork Situation Manager marks the Situation as overridden. The Teams Manager Moolet can no longer act on it even if that Situation matches a filter. See Graze API and MoogDb V2 for more information.
You can alter the behavior of the Teams Manager Moolet by changing the "Situation Update Policy" in the UI under Settings > Teams.
One Teams Manager Moolet is run for every instance of Cisco Crosswork Situation Manager.
Configure Teams Manager
You can configure the Teams Manager Moolet in the $MOOGSOFT_HOME/config/moolets/teams_manager.conf configuration file.
Moolets can support multiple Moobots. By configuring a Moolet to run multiple Moobots, you can customize the functions of the default Moobot, for example to customize the actions for the Workflow Engine. To do this, locate the moobot property in the Moolet configuration file and add a comma-separated list of the Moobots you want to run. See the extract of Teams Manager Moolet parameters below and notice the default "moobot" property, which contains one Moobot: "TeamsMgr.js".
Teams Manager Properties
The properties that relate to the Teams Manager Moolet are:
run_on_startup
Determines whether Teams Manager runs when Cisco Crosswork Situation Manager starts. If you enable it, Teams Manager processes Moolet output from the moment the system starts, without you having to configure or start it manually.
Type: Boolean
Default: true
metric_path_moolet
Determines whether or not Cisco Crosswork Situation Manager includes Teams Manager in the Event Processing metric for Self Monitoring.
Type: Boolean
Default: false
moobot
JavaScript program that controls and customizes the behavior of Teams Manager.
Type: String
Default: "TeamsMgr.js"
The default Teams Manager configuration is:
name : "TeamsMgr",
classname : "CTeamsMgr",
run_on_startup : true,
metric_path_moolet : false,
moobot : "TeamsMgr.js",
#
# Specifies the list of all the Moolets that can change
# or create Situations. Remove this section if the
# TeamsMgr is running in its own instance.
#
process_output_of : [
"Speedbird",
"Cookbook",
"Default Cookbook",
"SituationMgr"
]
Note:
name and classname are hardcoded and should not be changed.
Output Parameters
These parameters control the output the Moolet processes:
process_output_of
The Moolets that perform actions that trigger the Teams Manager:
Type: Array
Valid Moolets : Sigaliser, Speedbird, Cookbook, Default Cookbook, SituationMgr
Default: [ "Sigaliser", "Speedbird", "Cookbook", "Default Cookbook", "SituationMgr" ]
The Workflow Engine Moolets perform tasks on events, alerts, and Situations as specified in a user-defined workflow. See Workflow Engine for more information.
The following files define the actions that are available when you define a workflow in the Cisco Crosswork Situation Manager UI:
· A Workflow Engine Moobot specifies a set of actions that are available when you define a workflow in the Cisco Crosswork Situation Manager UI.
· The $MOOGSOFT_HOME/config/moolets/ folder has one default config file for each workflow engine:
— event_workflows.conf: Event workflows process event data after data ingestion from a LAM and before the Alert Builder.
— enrichment_workflows.conf: Enrichment workflows process alert data after the Alert Builder but before the Maintenance Window Manager.
— alert_workflows.conf: Alert workflows process alert data after the Maintenance Window Manager and before they pass to a clustering algorithm.
— situation_workflows.conf: Situation workflows process Situation data after the Teams Manager. For example, you can use a Situation workflow when you want to integrate with a ticketing system.
· Each configuration file has a moobot field that specifies the set of supported Moobots. The default Moobot for all four Moolet types is $MOOGSOFT_HOME/bots/moobots/WorkflowEngine.js. Do NOT modify the Cisco-supplied WorkflowEngine.js.
You can add and update Workflow Engine functionality. See Update the Workflow Engine for more information.
The Workflow Engine Moolets include the following properties:
run_on_startup
If set to true, the Moolet runs automatically on startup.
Type: Boolean
Default: true
moobot
Set of Moobots to load into the Moolet.
Type: String for one Moobot; Array for multiple Moobots. For example: ["WorkflowEngine.js", "CustomWorkflowEngine.js"]
Default: "WorkflowEngine.js"
process_output_of
Specifies the source for the objects to route through the Workflow Engine as a Moolet. Defines the engine's location in the data processing flow.
Type: String
Default:
· Enrichment workflows: AlertBuilder
· Alert workflows: MaintenanceWindowManager
· Situation workflows: SituationMgr
event_handlers
Configures the engine to receive the specified event types from any Moolet. See Constants for information on event types.
Type: Array
Default:
· Event workflows: ["Event"]
· Situation workflows: ["Sig", "SigUpdate", "SigStatus", "SigClose", "ArchivedSig"]
Defines the type of object to process in the workflow.
Type: String
Default:
· Event workflows: event
· Enrichment workflows: alert
· Alert workflows: alert
· Situation workflows: situation
Cisco periodically provides updates to the Workflow Engine for Cisco Crosswork Situation Manager. This topic tells you how to replace the core workflow engine files and any supporting files.
Before you update the Workflow Engine:
· Verify you have SSH access to your Cisco Crosswork Situation Manager machines. For distributed or Highly Available installations, update the Workflow Engine on all core role machines where you run Moogfarmd. See Server Roles.
· Download the latest Workflow Engine bundle and transfer it to the machines where you are performing the update. For information on the latest Workflow Engine, see Updates for v7.3.0.
· Verify you know the operating system user that runs Moogfarmd and perform all steps as that same user.
You can download the current workflow engine from the following location: https://speedy.moogsoft.com/contrib/WorkflowBundle-v1.0.tar.gz.
For example, to download the Workflow Engine bundle:
curl https://speedy.moogsoft.com/contrib/WorkflowBundle-v1.0.tar.gz --output WorkflowBundle-v1.0.tar.gz
To untar the bundle:
tar -xf WorkflowBundle-v1.0.tar.gz
Install a Workflow Engine update
To install an update for the Workflow Engine, you replace the Workflow Engine Moobot and add or replace any supporting files for the Moobot to all machines running Moogfarmd.
1. Create a backup of the original Workflow Engine Moobot on the instance. For example:
cp $MOOGSOFT_HOME/bots/moobots/WorkflowEngine.js \
$MOOGSOFT_HOME/bots/moobots/WorkflowEngine.ORIG.js
2. Copy the new WorkflowEngine.js file to $MOOGSOFT_HOME/bots/moobots. For example from the directory where you extracted the bundle:
cp moogsoft/bots/moobots/WorkflowEngine.73.js \
$MOOGSOFT_HOME/bots/moobots/WorkflowEngine.js
3. For bundles with updates in the config directory, make a backup of $MOOGSOFT_HOME/config. For example:
cp -r $MOOGSOFT_HOME/config \
$MOOGSOFT_HOME/config.backup
4. Copy the files from the bundle config directory to $MOOGSOFT_HOME/config. For example from the directory where you extracted the bundle:
cp -r moogsoft/config/* $MOOGSOFT_HOME/config/
5. For bundles with updates in the contrib directory, make a backup of $MOOGSOFT_HOME/contrib. For example:
cp -r $MOOGSOFT_HOME/contrib $MOOGSOFT_HOME/contrib.backup
6. Copy the files from the bundle contrib directory to the $MOOGSOFT_HOME/contrib directory. For example from the directory where you extracted the bundle:
cp -r moogsoft/contrib/* $MOOGSOFT_HOME/contrib/
7. Restart Moogfarmd.
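The backup-and-copy steps above can be condensed into one script. This is a minimal sketch, not the supported procedure: it builds a throwaway directory layout standing in for a real installation, and the file name WorkflowEngine.73.js follows the example in step 2.

```shell
#!/bin/sh
# Demonstration setup: a throwaway layout standing in for a real install.
# In production, MOOGSOFT_HOME is your actual installation and BUNDLE is the
# directory where you extracted WorkflowBundle-v1.0.tar.gz.
set -e
WORK=$(mktemp -d)
MOOGSOFT_HOME="$WORK/moogsoft_home"
BUNDLE="$WORK/moogsoft"
mkdir -p "$MOOGSOFT_HOME/bots/moobots" "$MOOGSOFT_HOME/config" "$MOOGSOFT_HOME/contrib"
mkdir -p "$BUNDLE/bots/moobots" "$BUNDLE/config"
echo "// original" > "$MOOGSOFT_HOME/bots/moobots/WorkflowEngine.js"
echo "// updated"  > "$BUNDLE/bots/moobots/WorkflowEngine.73.js"
echo "{}" > "$BUNDLE/config/event_workflows.conf"

# 1. Back up the original Moobot.
cp "$MOOGSOFT_HOME/bots/moobots/WorkflowEngine.js" \
   "$MOOGSOFT_HOME/bots/moobots/WorkflowEngine.ORIG.js"

# 2. Install the new Moobot from the bundle.
cp "$BUNDLE/bots/moobots/WorkflowEngine.73.js" \
   "$MOOGSOFT_HOME/bots/moobots/WorkflowEngine.js"

# 3-6. Back up and refresh config/contrib when the bundle includes them.
for dir in config contrib; do
  if [ -d "$BUNDLE/$dir" ]; then
    cp -r "$MOOGSOFT_HOME/$dir" "$MOOGSOFT_HOME/$dir.backup"
    cp -r "$BUNDLE/$dir/." "$MOOGSOFT_HOME/$dir/"
  fi
done

echo "update staged"
# 7. In production: restart Moogfarmd to apply the update.
```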
In Cisco Crosswork Situation Manager, a service represents a supportable unit that provides a set of related functionality. A service may relate to a single application or it may incorporate multiple applications. Example services include web applications, web services, data management, databases, and networks.
This document outlines how to create services, assign them to Situations, associate them with teams and monitor affected services in the Cisco Crosswork Situation Manager UI.
Before you begin to create services in Cisco Crosswork Situation Manager, ensure you have met the following requirements:
1. Identify the services in your environment. A third party tool or external system may be useful for this task, for example the list of business services, applications or assignment groups in ServiceNow.
2. If your service data is held externally to Cisco Crosswork Situation Manager, identify the data source.
3. Choose one or more methods that you will use to create and assign services:
· Graze API: Useful when you have a known list of services that change infrequently.
· Situation Manager Labeler: Useful when your services are likely to change and you want to avoid the overhead associated with manual creation and assignment.
· Moobot: Useful when you are already using a custom Moobot for enrichment. See Enrichment for more information.
· Another Enrichment method: Another method may be suitable depending on the source of your service data, for example a static data file. See Enrichment for more information.
· Cisco Crosswork Situation Manager UI: An administrator can assign services to individual Situations in the UI.
You can use one of the following methods, or a combination of these, to add services and assign them to Situations.
Graze API Endpoints
The addService endpoint enables you to create a single service or script the creation of multiple services. You can use setSituationServices to add one or more services to a Situation and getSituationServices to return a list of impacted services for a specified Situation.
See Graze API for details on the command syntax and examples.
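As a sketch of scripted service creation with addService (assuming the standard /graze/v1 base path, a hypothetical host, and a token obtained from the authenticate endpoint), the block below builds the request payload and shows the curl call without sending it:

```shell
# Hypothetical values -- substitute your own deployment's host and token.
HOST="https://example.moogsoftaiops.com"
AUTH_TOKEN="<auth-token-from-authenticate>"

# JSON body for addService; "name" is the service to create.
PAYLOAD="{\"auth_token\": \"$AUTH_TOKEN\", \"name\": \"web-application\", \"description\": \"Customer-facing web application\"}"

# Against a live instance you would send it with curl:
#   curl -k -X POST "$HOST/graze/v1/addService" \
#        --header "Content-Type: application/json" --data "$PAYLOAD"

echo "$PAYLOAD"
```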
Situation Manager Labeler
This utility allows you to create services from your custom data as it is ingested into Cisco Crosswork Situation Manager and assign those services to Situations.
See Create Services With Situation Manager Labeler for more information and an example.
Moobot
If you are using a custom Moobot to enrich on Situation creation, you can use the MoogDb addService and setSituationServices methods to create services as part of this process. See Enrichment for further information.
Another Enrichment Method
See Enrichment for further information on other enrichment methods.
Cisco Crosswork Situation Manager UI
In the UI, go into a Situation Room. Click Services Impacted at the top of the screen to add or remove services from the Situation. You will need administrator rights to perform this function.
Cisco Crosswork Situation Manager can automatically assign Situations to teams based upon the service data. You can also automatically create teams based on the service data in Situations.
See Manage Teams for details.
The Services Overview in the UI Workbench Summary allows you to view the impacted services with the highest severity Situations. You can use this information to prioritize which Situations to address first.
See Check Impacted Services for details.
You can use the Situation Manager Labeler to set Situation descriptions and fields dynamically, based on the alert data in each Situation. For example, suppose you are defining a correlation based on the custom_info.services alert field. To generate descriptions for the resulting Situations, you can specify a label string in the description field such as:
$$COUNT(custom_info.services) services affected including $$CITED(custom_info.services,3)
Given this string, the resulting descriptions include the three most-cited services and the number of times each service is cited by a member alert:
5 services affected including cust-login(7), verify-login(6), update-login-info(4), ...
Note:
The Situation Manager Labeler is installed by default with Cisco Crosswork Situation Manager v7.3 and higher. For previous releases, contact Cisco Customer Support to obtain installers and instructions.
Given a macro operation and an alert data field, the operation iterates through the relevant values in the Situation alerts and returns a string derived from these values.
The usage for fields with single values (prefix is one $): $macro(alert-field, max-alerts-to-include)
The usage for fields with lists (prefix is two $’s): $$macro(alert-field, max-alerts-to-include)
The max-alerts-to-include field is optional. This value limits the number of alert values to include in the description.
Consider the following example. You want to create a label with a count of all the affected services (custom_info.services) cited in all alerts. A Situation has two alerts:
· Alert 1 - custom_info.services = [ a, b, c ];
· Alert 2 - custom_info.services = [ d, e, f ];
$COUNT treats the fields as individual values and returns a count of 2.
$$COUNT treats the fields as lists of individual values and returns a count of 6.
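The same arithmetic can be sketched in shell (illustrative only; the Labeler itself is not shell):

```shell
# Two alerts, each carrying a list-valued custom_info.services field.
ALERT1_SERVICES="a b c"
ALERT2_SERVICES="d e f"

# $COUNT-style: each alert field counts once, regardless of list length.
SINGLE_COUNT=2

# $$COUNT-style: count every element across both lists.
LIST_COUNT=$(echo $ALERT1_SERVICES $ALERT2_SERVICES | wc -w)

printf '$COUNT:  %s\n' "$SINGLE_COUNT"
printf '$$COUNT: %s\n' "$LIST_COUNT"
```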
You can use the following macros to generate Situation descriptions. These macros are supported for single values ($macro) and lists ($$macro):
· COUNT(alert-field)— Return the count of alert-field citations, including duplicates.
· UCOUNT(alert-field) — Return the count of unique alert-field citations, excluding duplicates.
· CRITICAL(alert-field)— Return the string CRITICAL : if any alerts have a severity of critical, or 5. This macro is only useful for the severity field.
· UNIQ(alert-field)— Return a list of all cited alert-field values.
· TOP(alert-field)— Return the alert-field value cited by the most alerts in the Situation.
· CITED(alert-field)— Return a list of the unique alert-field values cited by alerts in the Situation along with the number of times they are cited -- for example, source1 (10), source5 (7), source3 (4).
· CITEDLIST(alert-field)— Same as $CITED but returns a string instead of a JSON list.
· BOOLEAN(alert-field)— Return false if all values are “falsy:” 0, null, undefined, "", and so on.
· TOLIST(alert-field)— Creates a comma-separated string from the elements of alert-field.
Note:
UI list-based filtering is now native, so $TOLIST() should no longer be required.
The following macros are supported for numeric fields only, such as time, severity, or event-count.
· MIN(alert-field)— Return the minimum cited value of alert-field.
· MAX(alert-field)— Return the maximum cited value of alert-field.
· AVE(alert-field)— Return the average of all cited values of alert-field.
· SUM(alert-field)— Return the sum of all cited values of alert-field.
· NUM(alert-field)— Return the set of alert-field values sorted numerically from low to high, including duplicates.
· UNUM(alert-field)— Return the set of unique alert-field values sorted numerically from low to high, excluding duplicates.
The following macros are supported for text fields only, such as service, source, or description.
· ALPHA(alert-field)— Return the set of alert-field values sorted alphabetically, including duplicates.
· UALPHA(alert-field) — Return the set of unique alert-field values sorted alphabetically, excluding duplicates.
The following macros are supported for array values only.
· $$INTERSECT(alert-field)— Return the list of intersections -- that is, alert-field values cited by multiple alerts. This macro parses the alert-field array values and returns a list of the items with multiple citations. For example, suppose a Situation has two alerts. The service field of alert 1 is [a, b, c]. The service field of alert 2 is [b, c, d]. $$INTERSECT(service) would return the list [b, c].
· $$NINTERSECT(alert-field)— Return the number of intersections. Given the previous example, $$NINTERSECT(service) would return the number 2.
· $$CINTERSECT(alert-field) — Return the list of common intersections -- that is, values cited by all alerts in the Situation. This macro is useful for identifying a possible root cause that caused all the alerts to get correlated together.
By default, each macro considers all alerts in a Situation up to a maximum of 200. You might want to specify a lower threshold to ensure that labeling does not become a bottleneck in systems with large or frequently-updated Situations. To lower the threshold, append the $FETCH modifier at the start of the Labeler string:
$FETCH(max-alerts-to-consider) Labeler-string
For example, the following macro considers the first alert in each Situation based on alert ID:
$FETCH(1) Application Situation for: $UNIQ(custom_info.application) at DataCentre $UNIQ(custom_info.location)
You should specify the maximum number of alerts needed to ensure an accurate description. If you are correlating based on a specific field such that all alerts have the same value for that field, you only need to fetch 1 alert.
Warning
Do not specify a fetch value higher than 20.
You can use the following macros to update columns and fields in a Situation with values contained in its member alerts.
· $$SERVICES(alert-field)— Update the Services Impacted column in the Situation with all unique alert-field values cited in the member alerts.
· $$ISERVICES(alert-field) — Update the Services Impacted column in the Situation with all unique alert-field values cited in 2 or more member alerts.
· $$PROCESSES(alert-field)— Update the Processes Impacted column in the Situation with all unique alert-field values cited in the member alerts.
You can also use the $MAP[ ] macro to update a custom_info field in the Situation with data from the member alerts. The usage is as follows:
$MAP[ $MACRO(source alert field, destination custom_info field) ]
You can include multiple macros in the same MAP macro, as shown in the following example:
$MAP[ $UNIQ(source, hosts) $UCOUNT(source, num_hosts) ]
For instructions on how to use the Situation Manager Labeler to automatically create services based on custom_info data, see Create Services Labeler.
You can add Situation Manager Labeler macros to the Situation description in a Sigaliser. When Cisco Crosswork Situation Manager creates a Situation based upon the Sigaliser configuration, it automatically creates services based on the macros and the alert data.
Before You Begin
Before you start to create services with Situation Manager Labeler, ensure you have met the following requirements:
· You have the Situation Manager Labeler installed on your Cisco Crosswork Situation Manager system. The utility is installed by default with Cisco Crosswork Situation Manager v7.3 and higher. For previous releases, contact Cisco Customer Support to obtain installers and instructions.
· You have identified the service data in your alerts or set up an Enrichment method to import the data into custom_info. See Enrichment for further information.
For example, an alert contains the following custom_info data:
{"services":["WinTel","Lync"]}
Configure a Recipe to Automatically Create Services
The following example demonstrates how to use labeler utility macros in a Cookbook Recipe to create services.
1. In the Cisco Crosswork Situation Manager UI, go to Settings > Cookbook Recipes.
2. Create a new Recipe, completing the mandatory fields such as name, type and clustering information. See Configure a Cookbook Recipe for more information.
3. Add a Situation Description that utilizes labeler macros. See the example below.
4. Activate the Recipe. See Configure a Cookbook Recipe for further information.
An example Situation description is as follows:
Issue affecting the service: $$UNIQ(custom_info.services). $$SERVICES(custom_info.services)
This instructs Cisco Crosswork Situation Manager to perform the following actions as it ingests data:
· Parse the data in the custom_info.services field of incoming events.
· Obtain unique service names for the alerts the system adds to a Situation.
· Preface the Situation's "Description" field with the text "Issue affecting the service: ".
· Create a service in the Cisco Crosswork Situation Manager database.
· Assign the service to the Situation.
· Append the names of the impacted services to the Situation's Description field.
When you are parsing a list of values stored as a JavaScript array, use $$ to prefix the macro. If the data is a string use the prefix $.
Once configured, Cisco Crosswork Situation Manager processes alerts using the Situation Description, and the resulting Situation appears in the Cisco Crosswork Situation Manager UI.
This topic describes the commands for starting, stopping or restarting individual Cisco Crosswork Situation Manager processes.
To configure process startup when Cisco Crosswork Situation Manager fails or restarts see Configure Services to Restart.
Integrations (LAMs), Moogfarmd, and Tomcat depend on the system processes MySQL, RabbitMQ, Nginx, and Elasticsearch. So when starting Cisco Crosswork Situation Manager processes:
1. Start or verify the following are started:
— MySQL
— RabbitMQ
— Nginx
— Elasticsearch
2. Start or restart integrations (LAMs), Moogfarmd, or Tomcat.
Similarly, if you plan to stop any one of MySQL, RabbitMQ, Nginx, or Elasticsearch, stop integrations (LAMs), Moogfarmd, and Tomcat first.
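A start script can encode this ordering. The sketch below assumes the RPM service names (mysqld, rabbitmq-server, nginx, elasticsearch, apache-tomcat, moogfarmd) and echoes instead of starting anything:

```shell
# Start system dependencies first, then the processes that rely on them.
# On an RPM install run as root, replace the echo with: service "$svc" start
SYSTEM_SERVICES="mysqld rabbitmq-server nginx elasticsearch"
MOOG_PROCESSES="apache-tomcat moogfarmd"

for svc in $SYSTEM_SERVICES $MOOG_PROCESSES; do
  echo "starting $svc"
done

# Stopping reverses the order: stop LAMs, Moogfarmd, and Tomcat before
# MySQL, RabbitMQ, Nginx, or Elasticsearch.
```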
If you performed an RPM installation as root, use the service init script to start and stop Cisco Crosswork Situation Manager processes:
service <service-name> start|stop|restart
The service names are as follows:
· MySQL: mysqld
· RabbitMQ: rabbitmq-server
· Nginx: nginx
· Elasticsearch: elasticsearch
· Tomcat: apache-tomcat
· Moogfarmd: moogfarmd
· For LAMs, refer to the individual LAM references for the service names.
For more information, see the documentation on managing system services for your operating system.
For customers who followed the Tarball installation procedure, Cisco Crosswork Situation Manager includes a process control utility that lets you:
· Start a process
· Stop a process
· Restart a process
· Check the status, running or stopped, of a process.
The process control utility resides at $MOOGSOFT_HOME/bin/utils/process_cntl.
When you install Moogsoft AIOps as a user other than root, you choose a user to run the installation and initialize the system. Use the same user credentials when controlling Cisco Crosswork Situation Manager components to ensure that you have the proper permissions and access.
Process Control Utility Reference
The Process Control utility uses the following syntax:
process_cntl [--process_name <name>] [--loglevel <loglevel>] [--service_instance <instance>] {start|stop|status|restart|help}
The arguments for the utility are as follows:
-h, --help
Display the process_cntl syntax and option descriptions.
--loglevel
Input: DEBUG, INFO, or WARN
Log level controlling the amount of information that process control logs. Defaults to WARN. This flag only works for Cisco Crosswork Situation Manager processes.
--process_name
Input: process name
The name of the process to control. You can specify one of the core processes:
· rabbitmq - RabbitMQ message broker
· mysql - MySQL database
· nginx - Nginx web server
· elasticsearch - Elasticsearch search engine
· apache-tomcat - Apache Tomcat servlet container
· moog_farmd - Cisco Crosswork Situation Manager event processing process
Alternatively specify an integration (LAM). Run process_cntl -h for a full list of integrations and syntax. See Implementer Guide for a brief description of the packages.
--service_instance
Input: instance name
The name of the process instance if there is more than one running on the system.
start
Start a stopped process.
stop
Stop a running process.
status
Display the status of the process: running or stopped.
restart
Restart a running process.
You can enable SSL to encrypt communications between all Cisco Crosswork Situation Manager components and the MySQL database.
For information on creating SSL keys and certificates for MySQL, see Creating SSL and RSA Certificates and Keys using MySQL.
To establish trust for the MySQL database certificate, create a truststore to house the root certificate for the Certificate Authority that signed the MySQL Server certificate.
1. If you upgraded from a previous version of Cisco Crosswork Situation Manager, run the following command to extract the certificate for the root CA for MySQL:
mysql_ssl_rsa_setup
The command generates new keys and writes them to the /var/lib/mysql directory.
2. Run the java keytool command to create a trust store containing the certificate for the root CA for MySQL.
keytool -import -alias mysqlServerCACert -file /var/lib/mysql/ca.pem -keystore $MOOGSOFT_HOME/etc/truststore
— When keytool prompts you, enter a password for the keystore. You will need this password to configure Cisco Crosswork Situation Manager.
— Answer 'yes' to "Trust this certificate."
Keytool creates a truststore at the path $MOOGSOFT_HOME/etc/truststore.
After you have created the truststore, edit the Cisco Crosswork Situation Manager configuration to enable SSL.
1. Edit $MOOGSOFT_HOME/config/system.conf.
2. Inside the MySQL property, uncomment the SSL property and the properties that comprise it. Make sure to uncomment the opening "{" and closing braces "}". For example:
,"ssl" :
{
# # The location of the SSL truststore.
# #
# # Relative pathing can be used, i.e. '.' to mean current directory,
# # '../truststore' or '../../truststore' etc. If neither relative
# # nor absolute (using '/') path is used then $MOOGSOFT_HOME is
# # prepended to it.
# # i.e. "config/truststore" becomes "$MOOGSOFT_HOME/config/truststore"
# #
# #
# # Specify the server certificate.
# #
"trustStorePath" : "etc/truststore",
# "trustStoreEncryptedPassword" : "vQj7/yom7e5ensSEb10v2Rb/pgkaPK/4OcUlEjYNtQU=",
"trustStorePassword" : "moogsoft"
}
3. Provide the path to the truststore you created. For example:
"trustStorePath" : "etc/truststore",
4. Edit the password for the truststore. For example:
"trustStorePassword" : "moogsoft"
See Moog Encryptor if you want to use an encrypted password. Uncomment trustStoreEncryptedPassword and provide the encrypted password for the value. For example:
"trustStoreEncryptedPassword" : "vQj7/yom7e5ensSEb10v2Rb/pgkaPK/4OcUlEjYNtQU="
5. Save your changes and restart the following components:
— Moogfarmd
— Apache Tomcat
— All LAMs
After you restart, all Cisco Crosswork Situation Manager components encrypt communications with the MySQL database.
Cisco Crosswork Situation Manager includes a self-signed certificate by default. If you want to add your own certificates to Nginx, follow the instructions below.
A valid SSL certificate is required if you want to use Cisco Crosswork Situation Manager for Mobile on an iPhone. This is because WebSockets do not work on iOS with self-signed certificates. If a valid root CA certificate is not added, a 'Connection Error' appears at login and Cisco Crosswork Situation Manager for Mobile does not work.
For more information, see the Nginx documentation.
To apply a valid certificate to Nginx, go to the nginx config folder and edit /etc/nginx/conf.d/moog-ssl.conf :
vi /etc/nginx/conf.d/moog-ssl.conf
Change the default self-signed certificate and key locations to point to the valid root certificate and key:
#ssl_certificate /etc/nginx/ssl/certificate.pem;
#ssl_certificate_key /etc/nginx/ssl/certificate.key;
ssl_certificate /etc/certificates/your_company_certificate.crt;
ssl_certificate_key /etc/certificates/your_company_certificate.key;
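Before reloading Nginx, you can check that the certificate and key actually belong together: a pair matches when both yield the same public key. The sketch below demonstrates the check on a throwaway self-signed pair; substitute your real certificate and key paths:

```shell
# Generate a throwaway self-signed pair for demonstration only.
WORK=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=demo" \
  -keyout "$WORK/demo.key" -out "$WORK/demo.crt" 2>/dev/null

# A certificate matches its key when both yield the same public key.
CERT_PUB=$(openssl x509 -in "$WORK/demo.crt" -noout -pubkey)
KEY_PUB=$(openssl pkey -in "$WORK/demo.key" -pubout 2>/dev/null)

if [ "$CERT_PUB" = "$KEY_PUB" ]; then
  echo "certificate and key match"
else
  echo "MISMATCH: nginx would fail to start with this pair"
fi
```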
Reload Nginx with the command:
systemctl reload nginx
Cisco Crosswork Situation Manager includes an encryptor utility so you can encrypt passwords stored in the system.conf configuration file. Encrypted passwords in configuration files are more secure because someone with access to the configuration cannot necessarily gain access to integrated systems.
If you run in a distributed environment, run the encryptor utility on one host to create an encryption key (.key). Then copy the key to the $MOOGSOFT_HOME/etc/ directory on the remaining hosts.
To encrypt a password, execute the moog_encryptor command as follows:
$MOOGSOFT_HOME/bin/moog_encryptor -p <password>
For example, to encrypt a password "Abacus":
/usr/share/moogsoft/bin/moog_encryptor -p Abacus
The moog_encryptor displays the encrypted password:
The encrypted password is:
KfFJGilmGGJP/qTrJV6SBs0HTTy3NpCqvGaYKviDbLQ=
When using the encrypted password within JavaScript code or a JSON file, use:
{"encrypted_password":"KfFJGilmGGJP/qTrJV6SBs0HTTy3NpCqvGaYKviDbLQ="}
Note:
Each time you run moog_encryptor, it generates a different encrypted password.
You can use passwords encrypted with moog_encryptor in the system.conf file as follows:
1. Edit $MOOGSOFT_HOME/config/system.conf.
2. Identify the password you want to replace and uncomment the encrypted_password property. Comment out the password property. For example:
"username" : "moogsoft",
#"password" : "Abacus",
"encrypted_password" : "e5uO0LY3HQJZCltG/caUnVbxVN4hImm4gIOpb4rwpF4=",
3. Set the value of the encrypted_password property to the value returned from the moog_encryptor. For instance:
"encrypted_password":"KfFJGilmGGJP/qTrJV6SBs0HTTy3NpCqvGaYKviDbLQ=",
4. Change the value of the password property so that it does not match the unencrypted value of the password.
By default, the encryptor utility uses a key at the following location:
$MOOGSOFT_HOME/etc/.key
The encryptor utility creates a new key if one does not already exist.
If you want to use a different location for the key, uncomment the encryption section in system.conf. Set the value of the encryption_key_file property to a new path for the key. For example:
# Uncomment the encryption section if you want to specify the location
# for the encryption key file.
,
"encryption" :
{
# Use this to change the default location of the encryption key file
"encryption_key_file" : "/usr/share/example/.key"
}
#
Note:
You must configure Cisco Crosswork Situation Manager to use the same .key file you used to encrypt passwords. If you encrypt a password using one key and then change the configuration to use another key, decryption fails.
You can configure different login authentication and security methods with Cisco Crosswork Situation Manager. For more information see:
· Configure Single Sign-On with LDAP
· Configure Single Sign-On with SAML
· Security Configuration Reference
You can configure Cisco Crosswork Situation Manager so users from an external directory can log in by Single Sign-On (SSO) using Security Assertion Markup Language (SAML).
When you enable the SAML integration, your SAML identity provider (IdP) can exchange authorization and authentication data securely with your service provider (SP), Cisco Crosswork Situation Manager. The integration redirects you from the Cisco Crosswork Situation Manager standard login page to the IdP's login page. You can log in to Cisco Crosswork Situation Manager if you provide the IdP with valid authentication details.
Cisco Crosswork Situation Manager implements SAML 2.0 using the OpenSAML v3 library. SAML 2.0 supports the following bindings:
· HTTP-Artifact
· HTTP-POST
· HTTP-POST-SimpleSign
· HTTP-Redirect
· SOAP
See OpenSAML v3 for more information.
Before You Begin
Before you start to set up SAML, ensure you have met the following requirements:
· You have an active SAML Identity Provider account with administrator privileges.
· Ensure the webhost URL in $MOOGSOFT_HOME/config/servlets.conf is the same as your Cisco Crosswork Situation Manager instance URL:
webhost: "https://example.moogsoftaiops.com"
Configure SAML Identity Provider
You can configure your IdP to integrate with Cisco Crosswork Situation Manager and enable SSO. Refer to your IdP's documentation for instructions.
Configuration differs for each IdP but common settings include:
· SSO URL: The Cisco Crosswork Situation Manager URL that sends a SAML login request to the IdP:
https://example.moogsoftaiops.com/moogsvr/mooms?request=samlRequest
· Assertion Consumer Service URL: The Cisco Crosswork Situation Manager URL that receives the IdP response to each SAML assertion:
https://example.moogsoftaiops.com/moogsvr/mooms?request=samlResponse
· Entity ID: A unique identifier for the SP SAML entity:
https://example.moogsoftaiops.com/moogsvr/mooms
After you complete the IdP configuration, it generates an IdP metadata file in .xml format. Some IdPs also allow you to generate an X509 self-signed certificate. Save the certificate and add it to your SP metadata file if you want your IdP to encrypt SAML assertions.
Copy the Identity Provider Metadata File
You create the IdP metadata file as part of the IdP configuration. This .xml file provides Cisco Crosswork Situation Manager with a security certificate, endpoints and other processing requirements.
To add this file to your SAML configuration:
1. Save the IdP metadata file to your local machine.
2. Copy the metadata file to $MOOGSOFT_HOME/etc/saml.
3. Grant the Apache Tomcat user read permissions to the metadata file. For example:
chmod 644 my_idp_metadata.xml
Create the Service Provider Metadata File
You must create an SP metadata file and send it to the IdP you want to integrate with Cisco Crosswork Situation Manager.
Some IdPs offer an SP metadata generator. If your IdP does not generate the SP metadata file, you can create one manually. See Build a Service Provider Metadata File for information.
After you have generated your SP metadata file:
1. Copy the file to $MOOGSOFT_HOME/etc/saml.
2. Grant the Apache Tomcat user read permissions to the metadata file. For example:
chmod 644 my_sp_metadata.xml
Configure the SAML Realm
You enable SAML authentication in Cisco Crosswork Situation Manager by creating and configuring a SAML realm. You can only configure and use one SAML Realm at a time. See Security Configuration Reference for full descriptions of the available properties.
To configure your SAML realm:
1. Uncomment the "my_saml_realm" section in the $MOOGSOFT_HOME/config/security.conf configuration file. Rename the realm to meet your requirements.
2. Configure the locations of your metadata files:
— idpMetadataFile: Location of the identity provider's metadata file.
— spMetadataFile: Location of the service provider's metadata file.
3. Configure the roles, teams and primary group mappings for new users that log in to Cisco Crosswork Situation Manager using SAML. These are all required:
— defaultRoles: Default roles that Cisco Crosswork Situation Manager assigns to new users at first login.
— defaultTeams: Default teams that Cisco Crosswork Situation Manager assigns to new users at first login.
— defaultGroup: Default primary group that Cisco Crosswork Situation Manager assigns to new users at first login.
4. Configure the mappings for existing users that log in to Cisco Crosswork Situation Manager using SAML. You can choose either username or email:
— existingUserMappingField: Defines the field that Cisco Crosswork Situation Manager uses to map existing users to your IdP users.
5. Configure the mapping of the IdP's provided attributes. These are all required:
— username: Defines the IdP user attribute that maps to username in Cisco Crosswork Situation Manager.
— email: Defines the IdP user attribute that maps to email in Cisco Crosswork Situation Manager.
— fullname: Defines the IdP user attribute that maps to full name in Cisco Crosswork Situation Manager.
6. Optionally configure additional IdP attribute mappings:
— contactNumber: Defines the IdP attribute that maps to contact number in Cisco Crosswork Situation Manager.
— department: Defines the IdP attribute that maps to department in Cisco Crosswork Situation Manager.
— primaryGroup: Defines the IdP attribute that maps to primary group in Cisco Crosswork Situation Manager.
— timezone: Defines the IdP attribute that maps to timezone in Cisco Crosswork Situation Manager.
— teamAttribute: Defines the IdP attribute that maps to teams in Cisco Crosswork Situation Manager.
— teamMap: Defines the IdP attribute or custom attribute that maps to team names in Cisco Crosswork Situation Manager.
— createNewTeams: Creates a team or teams if they do not exist in Cisco Crosswork Situation Manager.
— roleAttribute: Defines the IdP attribute containing role information.
— roleMap: Defines the IdP attribute that maps to Cisco Crosswork Situation Manager roles.
7. Optionally configure your keystore and private key passwords if you want to use encryption with SAML. See Optional SAML Security Features:
— keystorePassword: Your keystore password.
— privateKeyPassword: Your private key password.
8. Optionally configure the lifetime of each SAML assertion. See Optional SAML Security Features:
— maximumAuthenticationLifetime: Maximum time in seconds for Cisco Crosswork Situation Manager to receive an IdP's SAML assertion before it becomes invalid.
9. Optionally configure the Service Provider Entity Id. See Optional SAML Security Features:
— serviceProviderEntityId: Service Provider Entity ID assertion number.
10. Restart the Apache Tomcat service:
service apache-tomcat restart
Enable Encrypted Assertion
To enable encrypted assertion for SAML with Cisco Crosswork Situation Manager:
1. Copy the location of your KeyStore file. This defaults to $MOOGSOFT_HOME/etc/saml/<name of realm>_keystore. Cisco Crosswork Situation Manager generates this file when you create the realm.
2. Log in to your SAML IdP and enable encrypted assertions. Refer to your IdP's documentation for information.
3. Provide your KeyStore password and import your KeyStore file if required to do so.
Once enabled, the IdP encrypts all SAML assertions it sends to Cisco Crosswork Situation Manager.
Set an Assertion Time Limit
You can set the assertion time limit for Cisco Crosswork Situation Manager. The assertion time limit is the duration between the IdP providing the SAML assertion and when Cisco Crosswork Situation Manager accepts it.
Cisco Crosswork Situation Manager accepts a delay of up to an hour by default. You can specify a different time to meet your requirements.
"maximumAuthenticationLifetime": 3600
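The behavior of such a lifetime check can be sketched as follows. This is an illustrative Python sketch only; the function name and logic are assumptions for illustration, not the product's implementation:

```python
import time

def assertion_is_valid(auth_instant, maximum_authentication_lifetime=3600):
    """Return True when the assertion's issue time (epoch seconds) is
    within maximumAuthenticationLifetime seconds of the current time."""
    return (time.time() - auth_instant) <= maximum_authentication_lifetime

# Issued ten minutes ago, one-hour limit: accepted.
print(assertion_is_valid(time.time() - 600))    # True
# Issued two hours ago: rejected.
print(assertion_is_valid(time.time() - 7200))   # False
```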
Enable Entity ID Assertion
You can enable entity ID assertion, also known as audience restriction, to restrict SAML assertions to Cisco Crosswork Situation Manager.
You configure the unique SP entity ID in $MOOGSOFT_HOME/config/security.conf. You must also configure this in your IdP. The values must match for successful SAML authorization:
"serviceProviderEntityId": "MoogsoftAIOps"
Map User Attributes
When you create your realm, you can configure the attributes your Identity Provider passes to Cisco Crosswork Situation Manager at SAML authentication.
By default, the IdP email attribute maps to both the Cisco Crosswork Situation Manager username and email. The Cisco Crosswork Situation Manager full name maps to First Name and Last Name from the IdP:
"username" : "$Email",
"email" : "$Email",
"fullname" : "$FirstName.$LastName"
If something goes wrong at login, you may see errors indicating that an attribute mapping is not configured, or that the IdP did not provide a configured attribute.
You can map other IdP user attributes such as contact number, department, primary group and time zone:
"contactNumber" : "phone",
"department" : "department",
"primaryGroup" : "primaryGroup",
"timezone" : "timezone",
If you already have users in Cisco Crosswork Situation Manager you can map the user attributes to the IdP using the existingUserMappingField:
"existingUserMappingField": "username",
When a user logs in via the IdP for the first time but does not map with an existing user entry, Cisco Crosswork Situation Manager creates a new user.
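The first-login matching described above behaves roughly like the following sketch. The function and field names here are hypothetical, chosen for illustration, and are not part of the product:

```python
def resolve_user(idp_user, existing_users, mapping_field="username"):
    """Match an IdP user against existing accounts on the configured
    field (username or email); fall back to creating a new user."""
    key = idp_user[mapping_field]
    for user in existing_users:
        if user[mapping_field] == key:
            return ("existing", user)
    return ("new", idp_user)

existing = [{"username": "john.smith", "email": "john@example.com"}]
print(resolve_user({"username": "john.smith", "email": "john@example.com"},
                   existing)[0])    # existing
print(resolve_user({"username": "new.user", "email": "new@example.com"},
                   existing)[0])    # new
```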
You can define which primary group, roles and teams to assign users to using the defaultRoles, defaultTeams and defaultGroup properties in the SAML realm configuration.
You can map the IdP's attribute for team names using teamAttribute. You can configure which IdP attribute maps to Cisco Crosswork Situation Manager team names using teamMap:
"assignTeams": {
"teamAttribute": "groups",
"teamMap": {
"IdP Team": "Moogsoft AIOps Team",
"Another IdP Team": "Another AIOps team"
}
}
To create a team that does not exist already, enable the createNewTeams property:
"createNewTeams": true
If you enable createNewTeams, Cisco Crosswork Situation Manager assigns users to the teams it creates as part of the SAML login instead of the default SAML teams.
You can map the IdP attribute for roles using roleAttribute. You can map the IdP roles to Cisco Crosswork Situation Manager roles using roleMap:
"assignRoles": {
"roleAttribute": "groups",
"roleMap": {
"IdP Standard User" : "Operator",
"IdP Manager User" : "Manager"
}
}
Note:
You must map both roles and teams through the IdP to prevent users from being assigned to the default role and team.
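The role mapping described above behaves roughly like this sketch. This is hypothetical Python for illustration; the roleMap values follow the earlier example, and the fallback-to-defaults logic is an assumption, not Cisco's implementation:

```python
def map_idp_groups(idp_groups, role_map, default_roles):
    """Translate IdP group names into Cisco Crosswork Situation Manager
    roles via roleMap; users with no mapped groups get the defaults."""
    roles = [role_map[group] for group in idp_groups if group in role_map]
    return roles if roles else list(default_roles)

role_map = {"IdP Standard User": "Operator", "IdP Manager User": "Manager"}
print(map_idp_groups(["IdP Manager User"], role_map, ["Operator"]))  # ['Manager']
print(map_idp_groups(["Unknown Group"], role_map, ["Operator"]))     # ['Operator']
```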
Configure SAML Logout URL
After you enable SAML, you can configure a different logout page to display when a Cisco Crosswork Situation Manager user ends their session.
To configure a different logout page:
1. Edit the $MOOGSOFT_HOME/ui/web.conf configuration file:
"authentication": {
"pages": {
"login" : "/login/",
"logout" : "/logout/",
"failedLogin" : "/login/?error=true",
"sessionTimeout" : "/logout/?error=session",
"dbFailure" : "/login/?error=dbfailure"
},
"paramNames": {
"userId" : "userid",
"password" : "password"
}
}
2. Change the sub URL for "logout" to meet your requirements.
3. Save the changes.
After you have completed the change, Cisco Crosswork Situation Manager displays the new logout path when a session expires or if you log out.
Example SAML Realm
You can use the default SAML realm in $MOOGSOFT_HOME/config/security.conf for reference:
"my_saml_realm": {
"realmType": "SAML2",
"idpMetadataFile": "/usr/share/moogsoft/etc/saml/my_idp_metadata.xml",
"spMetadataFile": "/usr/share/moogsoft/etc/saml/my_sp_metadata.xml",
"defaultRoles": [ "Operator" ],
"defaultTeams": [ "Cloud DevOps" ],
"defaultGroup": "End-User",
"existingUserMappingField": "username",
"username": "$Email",
"email": "$Email",
"fullname": "$FirstName $LastName",
"contactNumber": "phoneNumber",
"department": "dept",
"primaryGroup": "group",
"timezone": "timezone",
"assignTeams": {
"teamAttribute": "groups",
"createNewTeams": true,
"teamMap": {
"Cloud Team": "Cloud DevOps",
"Database Team": "Database DevOps"
}
},
"assignRoles" : {
"roleAttribute": "groups",
"roleMap" : {
"Standard User": "Operator",
"Manager User": "Manager"
}
},
"keystorePassword": "my_realm_secret",
"privateKeyPassword": "my_realm_secret",
"maximumAuthenticationLifetime": 60,
"serviceProviderEntityId": "MoogsoftAIOps"
}
You must build a Service Provider (SP) metadata file in order to configure SAML-based Single Sign-On with Cisco Crosswork Situation Manager.
The SP metadata .xml file contains all of the keys, services and URLs defining the SAML endpoints. You can use your IdP's SP metadata file generator if it has one. If not you can create the file manually.
Build Your Metadata File
To manually create your SP metadata file:
1. Copy the .xml template from the code block:
<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
entityID="https://localhost/moogsvr/mooms">
<md:SPSSODescriptor AuthnRequestsSigned="true" WantAssertionsSigned="true" protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
<md:KeyDescriptor>
<ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
<ds:X509Data>
<ds:X509Certificate>
MIIC/jCCAeagAwIBAgIQCGehfcnv6r5My/fnrbfDejANBgkqhkiG9w0BAQsFADAV
MRMwEQYDVQQDEwp3d3cuc3AuY29tMB4XDTEzMTEyMjA4MjMyMVoXDTQ5MTIzMTE0
MDAwMFowFTETMBEGA1UEAxMKd3d3LnNwLmNvbTCCASIwDQYJKoZIhvcNAQEBBQAD
ggEPADCCAQoCggEBAMPm/ew9jaGWpQS1C7KtpvgzV4nSOIFPgRt/nlRYR+pUWdDE
fSKmyjK28nkQ1KKujRJTnvnmZydmUrmEFpVv+giBiUkvCJY3PxZ/EDSsF3R/OzWh
kUv5nfAXPnqkX9x22b6+vUof6WiLGyAW6lOYMCVADjTSl9pSaUtIaANdx9maERcT
9eQbGSnjim0WurFRYs9ZE8ttErrMH9+Su4246YDqOPAkz6La4cHHMPQdcFQT5p/c
uXBfU1vl1tWdBEgAY3xHYZE8u5TTJ/vp9UxyU1MwfeO2g9VDRcokLQHrj6wFxtvu
fA+WtUKYJGUu2p/qSuaw7eS6UFjUn49aVqg9OacCAwEAAaNKMEgwRgYDVR0BBD8w
PYAQ1/S0ibdvfdFkJ9T9oIPluKEXMBUxEzARBgNVBAMTCnd3dy5zcC5jb22CEAhn
oX3J7+q+TMv35623w3owDQYJKoZIhvcNAQELBQADggEBAAHlmVoAZUt6paeFvtQb
c/iaJe/Fhd+JG1U0jyjlFDcCn8erLihEbhb3mFBBMF25oO67gfA1JJXZrmHry3Nl
OZuovqRqm8v7wg8n0nQa1HUWkUC2TBgfg1HE8/2rmSF2PngiEi18VOxRDxx0WXMN
ZX6JebJ1kCOCpT/x7aupS7T1GrIPmDLxjnC9Bet7pRynfomjP/6iU21/xOIF6xB9
Yf1a/kQbYdAVt2haYKIfvaF3xsq1X5tCXc9ijhBMgyaoqA+bQJD/l3S8+yCmMxEY
ZjAVLEkyGlU4Uwo01cKEYbXIG/YVq+4CaIRxIfMvV+j8gzTLHTXI+pHEMfMhyYa0
pzM=
</ds:X509Certificate>
</ds:X509Data>
</ds:KeyInfo>
</md:KeyDescriptor>
<md:SingleLogoutService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect"
Location="https://localhost:44360/SAML/SingleLogoutService"/>
<md:AssertionConsumerService index="0" isDefault="true"
Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
Location="https://localhost/moogsvr/mooms?request=samlResponse"/>
</md:SPSSODescriptor>
</md:EntityDescriptor>
2. Configure the mandatory elements in the metadata file:
— entityID: Unique identifier or name for the SP. This should be a URL or a URN.
— AssertionConsumerService: URL or endpoint that receives SAML responses from the IdP.
3. Add the X509 self-signed certificate you create when you configure your IdP.
4. Configure the other elements to meet your requirements. See Service Provider Metadata Reference for full descriptions of the available elements.
5. Save the SP metadata file to a path on your local machine.
After you have created the metadata file, you must copy it to your Cisco Crosswork Situation Manager machine to continue with the SAML configuration. See Configure Single Sign-On with SAML.
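Before you copy the file across, you can sanity-check the two mandatory elements with a short script. This is an illustrative sketch using Python's standard library; it checks only that the file is well-formed XML and declares an entityID and an AssertionConsumerService:

```python
import xml.etree.ElementTree as ET

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"

def check_sp_metadata(xml_text):
    """Check that the SP metadata declares an entityID attribute and
    contains at least one AssertionConsumerService element."""
    root = ET.fromstring(xml_text)
    acs = root.findall(f".//{{{MD_NS}}}AssertionConsumerService")
    return root.get("entityID") is not None and len(acs) > 0

sample = (
    f'<md:EntityDescriptor xmlns:md="{MD_NS}" '
    'entityID="https://localhost/moogsvr/mooms">'
    '<md:SPSSODescriptor>'
    '<md:AssertionConsumerService index="0" isDefault="true" '
    'Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" '
    'Location="https://localhost/moogsvr/mooms?request=samlResponse"/>'
    '</md:SPSSODescriptor>'
    '</md:EntityDescriptor>'
)
print(check_sp_metadata(sample))   # True
```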
This is a reference for Build a Service Provider Metadata File. Each SP metadata .xml file accepts the following elements:
entityID
Unique identifier or name for the service provider. The ID should be a URN or a URL.
Type: String
Required: Yes
Example: https://example.moogsoftaiops.com/moogsvr/mooms
ID
Unique identifier for the root metadata element.
Type: String
Required: No
Example: TW9vZ3NvZnRBSU9wcw==
validUntil
Defines the expiration date of the metadata file. The date should be in ISO 8601 format.
Type: String
Required: No
Example: 2018-08-10T07:47:41+00:00
AuthnRequestsSigned
If enabled, Cisco Crosswork Situation Manager signs SAML authentication requests as part of the Single Sign-On.
Type: Boolean
Required: No
Default: false
WantAssertionsSigned
If enabled, Cisco Crosswork Situation Manager expects the IdP to sign any SAML assertions it sends.
Type: Boolean
Required: No
Default: false
KeyDescriptor
Defines the type of signing or the type of encryption that Cisco Crosswork Situation Manager uses.
Type: String
Required: No
One of: use = "signing", use = "encryption"
X509Certificate
Self-signed certificate that allows Cisco Crosswork Situation Manager to sign and encrypt each SAML assertion. The certificate should be in DER format and base-64 encoded.
Type: String
Required: No
Example: MIIDijCCAnICCQD[...]+6SBfDCrWFsw==
AssertionConsumerService
Defines the URL or endpoint that receives the SAML assertions. The Location is for the URL and the Binding identifies the method. Supported bindings include: HTTP-Artifact, HTTP-POST, HTTP-POST-SimpleSign, HTTP-Redirect and SOAP.
Type: String
Required: Yes
Example: Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" Location="https://localhost/moogsvr/mooms?request=samlResponse"
You can configure Cisco Crosswork Situation Manager so users from an external directory can log in by Single Sign-On (SSO) using Lightweight Directory Access Protocol (LDAP).
See LDAP version 3 for more information.
Before You Begin
Before you start to set up LDAP, ensure you have met the following requirements:
· You have the URL for your LDAP server.
· If you want to use a "lookup" DN (Distinguished Name) resolution method, you have the credentials for the LDAP user who has rights to look up other users and determine their roles.
· If you want to use SSL encryption, you have a valid SSL certificate.
Configure LDAP for Cisco Crosswork Situation Manager
Edit the configuration file to configure and enable LDAP for Cisco Crosswork Situation Manager. You can find the file at $MOOGSOFT_HOME/config/security.conf.
See the Security Configuration Reference for a full description of all properties. Some properties in the file are commented out by default. Uncomment properties to enable them.
1. Configure the properties for the LDAP connection:
— url: URL of your LDAP server. This is required.
— connectionTimeout: Connection timeout in milliseconds.
— readTimeout: Read timeout in milliseconds.
— predefinedUser: Determines if user must exist in the local database or not.
2. Configure the user resolution and attribute search section:
— resolutionType: Type of DN resolution method. Valid options are "direct" and "lookup".
— attributeSearchFilter: Defines an optional attribute filter to retrieve all user attributes.
— attributeMap: Defines an attribute map between the LDAP user attributes and the user attributes in the Cisco Crosswork Situation Manager database.
3. Configure the LDAP group search section:
— systemUser: Username of the system user to bind and search for user group information.
— systemPassword: Password of the system user to bind and search for user group information.
— groupBaseDn: Defines a group base DN to search for LDAP groups.
— memberAttribute: Attribute used to look for group members. Defaults to "member".
— groupNameAttribute: Attribute used to look for group name.
— roleMap: Defines the role mappings between the user directory and Cisco Crosswork Situation Manager.
— assignTeams: Synchronizes team assignment between the user directory and the teams in Cisco Crosswork Situation Manager.
4. Optionally configure SSL if you want to enable TLS authentication:
— ssl_protocol: Defines the SSL protocol you want to use. Defaults to TLSv1.2.
— server_cert_file: SSL server certificate.
— client_cert_file: Client certificate file.
— client_key_file: Client key file.
When you have finished your configuration, restart Apache Tomcat to activate any changes you made to the configuration file:
service apache-tomcat restart
See Control Cisco Crosswork Situation Manager Processes for further details.
Configuration Example
An example of an LDAP configuration that uses direct DN resolution and SSL without client authentication:
"Example_Ldap" : {
"realmType": "LDAP",
"url": "ldap://moogsaml:389",
"userDnResolution":
{
"resolutionType" : "direct",
"direct" : {
"usernameAttribute": "uid",
"userDnPostfix": "ou=People,dc=moogsoft,dc=com"
}
},
"attributeMap": {
"fullname": "cn",
"email": "mail"
},
"groupBaseDn": "ou=Group,dc=moogsoft,dc=com",
"memberAttribute": "member",
"groupNameAttribute": "cn",
"roleMap": {
"role-admin": "Super User",
"OperatorRole": "Operator"
},
"assignTeams": {
"teamMap": {
"CloudDevOps": "Cloud DevOps team",
"DBDevOps": "Database DevOps team"
},
"useGroupName": true,
"createNewTeams": true
}
,"ssl":
{
"server_cert_file" : "/usr/share/moogsoft/config/example.crt"
}
},
This is a reference for security configuration in Cisco Crosswork Situation Manager. You can edit $MOOGSOFT_HOME/config/security.conf to configure security features such as LDAP and SAML.
LDAP Connection Properties
You can configure the LDAP connection using the following properties:
url
The protocol (LDAP or LDAPS) along with the host and port of your LDAP server. For example: ldap://172.16.124.169:389.
Type: String
Required: Yes
Default: N/A
connectionTimeout
Defines the connection timeout in milliseconds.
Type: Integer
Required: Yes
Default: 30000
readTimeout
Defines the read timeout in milliseconds.
Type: Integer
Required: Yes
Default: 30000
predefinedUser
If enabled, the user account must exist in the local database as well as on the LDAP server, and the predefined user details populate created or updated user accounts.
If disabled, Cisco Crosswork Situation Manager creates or updates user accounts with the LDAP information.
Type: Boolean
Required: Yes
Default: false
LDAP Attribute Search Properties
You can configure the authentication bind, DN resolution method and attribute search with the following properties:
resolutionType
Defines the method to look up the DN (Distinguished Name), a unique path to an object in the directory.
Type: String
Required: Yes
One of: direct, lookup
Default: N/A
There are two methods to choose from:
· direct: If using this method, the user DN is created using the usernameAttribute and userDnPostfix properties. These properties are required. For example:
"userDnResolution": {
"resolutionType" : "direct",
"direct" : {
"usernameAttribute": "uid",
"userDnPostfix": "ou=People,dc=moogsoft,dc=com"
}
},
For a user called John Smith, the user DN is:
uid=john.smith,ou=People,dc=moogsoft,dc=com
· lookup: If using this method, Cisco Crosswork Situation Manager searches for the user in the LDAP server using a combination of usernameAttribute and userBaseSearchFilter as a filter and userBaseDn as a base to find the DN. These properties are required. For example:
"userDnResolution": {
"resolutionType" : "lookup",
"lookup" : {
"usernameAttribute": "sAMAccountName",
"userBaseDn" : "ou=People,dc=moogsoft,dc=com",
"userBaseSearchFilter" : "(objectclass=person)"
}
},
Optionally, for both "direct" and "lookup" methods, you can use the userDnLookupUser, userDnLookupPassword and encryptedUserDnLookupPassword properties to define the user to look up each DN in your directory. See Moog Encryptor for more information if you want to use password encryption.
If you leave the userDnLookupUser property empty, LDAP uses the systemUser defined in the LDAP Group Search section instead.
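For the "direct" method, the DN composition rule described above can be sketched as follows. This is illustrative Python; the function is hypothetical, and the attribute and postfix defaults follow the earlier example:

```python
def build_direct_dn(username, username_attribute="uid",
                    user_dn_postfix="ou=People,dc=moogsoft,dc=com"):
    """Compose a user DN for the "direct" resolution method:
    <usernameAttribute>=<login name>,<userDnPostfix>."""
    return f"{username_attribute}={username},{user_dn_postfix}"

print(build_direct_dn("john.smith"))
# uid=john.smith,ou=People,dc=moogsoft,dc=com
```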
attributeSearchFilter
Defines an optional LDAP attribute filter to search for user attributes.
Type: String
Required: No
Default: (objectclass=*)
attributeMap
Defines an attribute map between the LDAP user attributes and the user attributes in the Cisco Crosswork Situation Manager database.
Type: JSON Object
Required: No
Default: N/A
This property uses the format:
"attributeMap": {
"db_column_5": "ldap_attribute_1",
"db_column_2": "ldap_attribute_8",
"db_column_3": "ldap_attribute_8"
}
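Applied to an LDAP entry, the map behaves like this sketch. The Python here is hypothetical and for illustration only; the attribute names follow the earlier configuration example:

```python
def apply_attribute_map(ldap_entry, attribute_map):
    """Copy LDAP attribute values into database columns according to
    attributeMap (database column -> LDAP attribute); attributes
    missing from the entry are skipped."""
    return {column: ldap_entry[attribute]
            for column, attribute in attribute_map.items()
            if attribute in ldap_entry}

entry = {"cn": "John Smith", "mail": "john.smith@example.com"}
print(apply_attribute_map(entry, {"fullname": "cn", "email": "mail"}))
# {'fullname': 'John Smith', 'email': 'john.smith@example.com'}
```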
LDAP Group Search and Mapping
You can configure the following properties in the LDAP group search section:
systemUser
Username of the system user to bind and search for user group information. LDAP uses this user if you leave the userDnLookupUser property empty. The system sends two bind requests and two search requests with LDAP. If you do not configure a system user, the user bind chosen for authentication is also used for the LDAP group search.
Type: String
Required: No
Default: N/A
systemPassword
Password of the system user to bind and search for user group information.
Type: String
Required: No
Default: N/A
groupBaseDn
DN for the part of the LDAP structure that contains the user groups. This is used in conjunction with the memberAttribute to find any LDAP groups the user belongs to. These groups are then mapped to a local role using the roleMap property.
Type: String
Required: No
Default: N/A
memberAttribute
Attribute used to look for group members.
Type: String
Required: No
Default: member
groupNameAttribute
Attribute used to look for group name.
Type: String
Required: No
Default: CN
roleMap
Defines the role mappings between the user directory and Cisco Crosswork Situation Manager.
Type: JSON Object
Required: No
Default: N/A
LDAP AssignTeams Properties
You can configure the following sub-properties of assignTeams to synchronize team assignment between the user directory and the teams in Cisco Crosswork Situation Manager.
assignTeams
Synchronizes team assignment between the user directory and the teams in Cisco Crosswork Situation Manager.
Type: JSON Object
Required: No
Default: N/A
teamMap
Defines the LDAP attribute or custom attribute that maps to team names in Cisco Crosswork Situation Manager. You can provide the mapping as a JSON object. For example:
{ "LDAP Team" : "Moogsoft Team", "Another LDAP Team" : "Another Moogsoft team" }
Type: JSON Object
Required: No
Default: N/A
useGroupName
Enable to use the LDAP group name as the team name in Cisco Crosswork Situation Manager.
Type: Boolean
Required: No
Default: false
createNewTeams
Creates a team or teams if they do not exist in Cisco Crosswork Situation Manager. If you leave teamMap empty, the teams adopt their LDAP group names.
Type: Boolean
Required: No
Default: false
LDAP SSL Properties
You can optionally configure SSL to enable TLS authentication:
ssl_protocol
Defines the SSL protocol you want to use.
Type: String
Required: No
Default: TLSv1.2
server_cert_file
SSL server certificate.
Type: String
Required: No
Default: N/A
client_cert_file
SSL client certificate.
Type: String
Required: No
Default: N/A
client_key_file
Client key file.
Type: String
Required: No
Default: N/A
SAML Service Provider Properties
You can configure a SAML realm by giving it a name and editing the following properties:
idpMetadataFile
Location of the identity provider's metadata file. The metadata file provides information on how to connect to the IdP. Cisco Crosswork Situation Manager requires the file to be in .xml format.
Type: String
Required: Yes
Default: "/usr/share/moogsoft/etc/saml/my_idp_metadata.xml"
spMetadataFile
Location of the service provider's metadata file. Cisco Crosswork Situation Manager writes the SP metadata information to this file. This location must be accessible and editable by the Apache Tomcat user. Cisco Crosswork Situation Manager requires the file to be in .xml format. If your IdP does not have an SP metadata file generator, you can create one manually. See Build a Service Provider Metadata File for instructions.
Type: String
Required: No
Default: "/usr/share/moogsoft/etc/saml/my_sp_metadata.xml"
defaultRoles
Default roles that Cisco Crosswork Situation Manager assigns to new users upon first login using SAML. If the user already has a role mapping, Cisco Crosswork Situation Manager uses that instead.
Type: Array
Required: Yes
Default: [ "Operator" ]
defaultTeams
Default teams that Cisco Crosswork Situation Manager assigns to new users upon first login using SAML. You can create an empty list if you do not want to assign new users to a team.
Type: Array
Required: No
Default: [ "Cloud DevOps" ]
defaultGroup
Default primary group that Cisco Crosswork Situation Manager assigns to new users upon first login using SAML.
Type: String
Required: Yes
Default: "End-User"
SAML User Mapping Properties
You can configure how to map IdP user fields to existing Cisco Crosswork Situation Manager users and how to map user fields for new users. All mappings are case sensitive. Each mapping follows the format:
"MoogsoftAttribute" : "IdPAttribute"
existingUserMappingField
Defines the field that Cisco Crosswork Situation Manager uses to map existing users to your IdP users.
Type: String
Required: No
One of: username, email
Default: "username"
username
Defines the IdP's attribute that maps to username in Cisco Crosswork Situation Manager.
Type: String
Required: Yes
Default: "$Email"
email
Defines the IdP's attribute that maps to email in Cisco Crosswork Situation Manager.
Type: String
Required: Yes
Default: "$Email"
fullname
Defines the IdP attributes that map to full name in Cisco Crosswork Situation Manager.
Type: String
Required: Yes
Default: "$FirstName $LastName"
SAML Optional Properties
You can customize your SAML realm with a number of optional properties:
contactNumber
Defines the IdP attribute that maps to contact number in Cisco Crosswork Situation Manager.
Type: String
Required: No
Default: "phone"
department
Defines the IdP attribute that maps to department in Cisco Crosswork Situation Manager.
Type: String
Required: No
Default: "department"
primaryGroup
Defines the IdP attribute that maps to primary group in Cisco Crosswork Situation Manager.
Type: String
Required: No
Default: "primaryGroup"
timezone
Defines the IdP attribute that maps to timezone in Cisco Crosswork Situation Manager.
Type: String
Required: No
Default: "timezone"
SAML assignTeams Properties
You can configure the following sub-properties of assignTeams to synchronize team assignment between the SAML user directory and the teams in Cisco Crosswork Situation Manager:
teamAttribute
Defines the IdP attribute that maps to teams in Cisco Crosswork Situation Manager.
Type: String
Required: No
Default: "groups"
teamMap
Defines the IdP attribute or custom attribute that maps to team names in Cisco Crosswork Situation Manager.
Type: JSON Object
Required: No
Default: { "IdP Team" : "Moogsoft AIOps Team", "Another IdP Team" : "Another AIOps team" }
createNewTeams
Creates a team or teams if they do not exist in Cisco Crosswork Situation Manager. If you leave teamMap empty, the teams adopt their IdP group names.
Type: Boolean
Required: No
Default: false
SAML assignRoles Properties
roleAttribute
Defines the IdP attribute containing role information.
Type: String
Required: No
Default: "groups"
roleMap
Defines the IdP attribute that maps to Cisco Crosswork Situation Manager roles.
Type: JSON Object
Required: No
Default: { "IdP Standard User" : "Operator", "IdP Manager User" : "Manager" }
SAML Security Properties
keystorePassword
Your keystore password. Any whitespace in the name is replaced with an underscore.
Type: String
Required: No
Default: "<my_realm>_secret"
privateKeyPassword
Your private key password. Any whitespace in the name is replaced with an underscore.
Type: String
Required: No
Default: "<my_realm>_secret"
maximumAuthenticationLifetime
Maximum time in seconds for Cisco Crosswork Situation Manager to receive an IdP's SAML assertion before it becomes invalid.
Type: Integer
Required: No
Default: 2592000 (720 hours)
serviceProviderEntityId
Service Provider Entity ID assertion number. Some IdPs require this ID.
Type: String
Required: No
Default: "MoogsoftAIOps"