The COS service is implemented through an instance of the application instance controller (AIC). Each application instance represents a service instance. Some earlier COS releases supported a single service instance with one endpoint, one cluster, and one redundancy policy. COS Release 3.18.1 deployments can support multiple COS clusters, and each cluster has its own asset redundancy policy.
The Cisco Cloud Object Store (COS) > COS Service Status page of the V2PC GUI reports the status of each cluster. The values for Storage Status, Disk Status, Interface Status, Service Status, and Fault Status can be reported as one of the following:
On this page, you can drill down through a COS cluster to view the status of its individual COS nodes. Drilling down to each node reveals the status of individual node disks, interfaces, and services, and displays any active alarms for the node.
Note COS Release 3.18.1 supports resiliency status monitoring for two resiliency parameters per COS node: Local Erasure Coding (LEC, if enabled) and GOIDS.
A COS node is in service if both the associated COS application instance and the cluster to which it belongs are in Enabled state. The V2PC GUI displays the status of each node that is in service and part of a COS cluster. This status is updated once per minute, or when a fault is detected.
A reported fault raises an alarm or an event (or both), which is displayed in the V2PC GUI. If the fault is serious, the service interfaces for that COS node are removed from the DNS.
When the fault is no longer present, the service interfaces are replaced in the DNS and the node returns to normal service.
To view a summary of node usage and alarms (if any) using the V2PC GUI, open the GUI as described in Accessing the V2PC GUI and navigate to Cisco Cloud Object Store (COS) > COS Service Status.
COS Release 3.18.1 supports filtering the COS service status view by COS Cluster (default) or by Resiliency Group, which shows the COS nodes grouped by resiliency group. You can also choose to view Metadata Cluster status.
Figure 3-1 V2PC GUI, COS Service Status Page
The COS Service Status page lists the service instances and displays the associated node usage along with any alarms. This page displays the following information:
To view the status of the overall deployment from the V2PC GUI, open the GUI as described in Accessing the V2PC GUI and navigate to Dashboard > System Overview. The System Overview page offers information about Bandwidth (Tx/Rx), Session (Tx/Rx), and Storage usage.
The COS Alarms & Events page lists significant COS-related system events and provides details for user evaluation. To view this page in the V2PC GUI, open the GUI as described in Accessing the V2PC GUI and navigate to Dashboard > Alarms & Events.
All the events for the node are listed with the oldest event first. Events belong to one of the following levels of severity:
The COS AIC reports alarms and events to the V2PC GUI. The AIC generates alarms and events based on both GUI REST transactions (user input) and AIC client-generated status notifications.
The COS AIC client generates events pertaining to storage (disk), network (interface), and service (process) for each node. These events are generated only if the AIC client monitoring is enabled. For more information on this monitoring activity, see Viewing Deployment Status.
The events generated by a COS AIC client are listed in Table 3-5.
Table 3-5 COS AIC Client Events
The COS AIC server generates the events listed in Table 3-6.
Table 3-6 COS AIC Server Events
The COS Services Statistics page provides a graphical summary of the status and performance of the node infrastructure. The displays update every 15 minutes to track changes in key system states over time.
To view the COS Statistics page in the V2PC GUI, open the GUI as described in Accessing the V2PC GUI and navigate to Cisco Cloud Object Store (COS) > COS Services Statistics.
Figure 3-2 V2PC GUI, COS Service Statistics Page
This page displays the following information:
The COS AIC client running on a COS node periodically monitors the disks, interfaces, and services (processes) of that node and posts the data to the DocServer as a COS-specific document.
The AIC client begins the monitoring activity when a node is configured and added to a COS cluster. As long as the node is running and is part of a COS cluster, monitoring occurs once every 10 seconds.
The AIC client can monitor and report storage (disk) state and statistics only if the CServer is running on the node. The following information is reported for each disk:
The client also reports the total storage space on all disks and the total storage space currently in use.
For each interface, the AIC client reports the interface state and the transmit and receive statistics. The client can monitor and report the state and statistics of CServer interfaces only when the CServer is running on the node.
The CMC AIC client running on a CMC node periodically monitors the metadata, log storage, interfaces, and services (processes) of that node and posts the data to the DocServer as a node-specific document.
The CMC AIC client begins monitoring activity when a node is configured and added to a CMC cluster. As long as the node is running and is part of a CMC cluster, monitoring occurs once every 2 seconds. Status is reported once every hour unless there is a significant change in state; for example, if an interface or service changes state (down or up) or if storage changes by more than 1%.
The CMC AIC client monitors and reports the usage information (path, total MB, and used MB) for the Metadata Storage, Metadata Commit Log, and CMC Application Log partitions on the CMC node.
The CMC AIC client reports the interface speed and state for each CMC interface.
The CMC AIC client monitors the following services: Cassandra, NTP Daemon, and Consul Agent.
If one or more COS nodes in a cluster are not generating any alarms, events, or statistics, perform the following steps to ensure that monitoring is configured and working correctly.
Perform these steps for each COS node attached to the cluster in the V2PC master.
Step 1 To confirm that the Sensu client is running on the COS node, connect to the node using SSH, type the command service sensu-client status, and check the response to see if the client is running. If it is not, type service sensu-client start to start the service.
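The check-and-restart sequence from this step can be run directly from a shell on the node; for example:

    # Check whether the Sensu client is running
    service sensu-client status

    # If the client is not running, start it and verify again
    service sensu-client start
    service sensu-client status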
Step 2 To confirm that the Sensu configurations are present on the COS node, SSH into the node, type cd /etc/sensu/conf.d, and check that the following files are present and configured correctly:
Note If helpful, compare the contents of each file with those on another known working COS node.
Additionally, check the plugins directory (cd /etc/sensu/plugins) and confirm that metrics-cos-nodes.js is present.
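As a quick way to perform both checks, list the configuration directory and confirm the plugin file is in place; the configuration file names (other than the plugin) vary by deployment:

    # List the Sensu configuration files on this node
    cd /etc/sensu/conf.d
    ls -al

    # Confirm that the COS metrics plugin is present
    ls -al /etc/sensu/plugins/metrics-cos-nodes.js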
Step 3 To confirm that the sensu-client log is present on the COS node, SSH into the node and type tail -f /var/log/sensu/sensu-client.log. Sensu checks for this information every 15 minutes, so new entries should appear at that interval.
Step 4 To confirm that the COS node statistics document is present, SSH into the node, type cd /tmp, then type ls -al and check the timestamp on the aic_cosnodestats.json file. This file should update every 15 minutes. If the file is present, type cat /tmp/aic_cosnodestats.json and confirm that it is not empty.
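For example, the following sequence verifies both the timestamp and the contents of the statistics document:

    cd /tmp
    ls -al aic_cosnodestats.json      # timestamp should advance every 15 minutes
    cat /tmp/aic_cosnodestats.json    # output should not be empty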
Step 5 To confirm that the rabbitmq messaging file on the COS node can be accessed, SSH into the node and type cat /etc/sensu/conf.d/rabbitmq.json.
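This file holds the connection settings that the Sensu client uses to reach the RabbitMQ transport. As a representative sketch only, a typical Sensu rabbitmq.json resembles the following; the host, port, vhost, and credentials shown here are placeholders, and the fields in your deployment may differ:

    cat /etc/sensu/conf.d/rabbitmq.json
    {
      "rabbitmq": {
        "host": "10.0.0.10",
        "port": 5671,
        "vhost": "/sensu",
        "user": "sensu",
        "password": "secret"
      }
    }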
Step 6 Try to ping the host from the COS node to confirm that it can be reached.
Note It is normal in an HA environment for the host ping to return different IP addresses.
An HA environment can have multiple Sensu masters. Perform the following steps to check each master:
Step 1 Connect via SSH to the Sensu master that you are accessing using the V2PC GUI.
Step 2 Type consul members to list all of the active masters in the HA environment.
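The output lists one row per member. The following is illustrative only; node names, addresses, and versions are placeholders:

    consul members
    Node      Address          Status  Type    Build  Protocol  DC
    master-1  10.0.1.11:8301   alive   server  0.6.4  2         dc1
    master-2  10.0.1.12:8301   alive   server  0.6.4  2         dc1
    master-3  10.0.1.13:8301   alive   server  0.6.4  2         dc1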
Step 3 Check the conf.d directory (cd /etc/sensu/conf.d) to see if handler-metrics-cos-nodes-influxdb.json is present. If not, copy this file from another working master and place it in the conf.d directory.
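For example (the working-master host name is a placeholder):

    cd /etc/sensu/conf.d
    ls handler-metrics-cos-nodes-influxdb.json

    # If the file is missing, copy it from a known working master
    scp root@working-master:/etc/sensu/conf.d/handler-metrics-cos-nodes-influxdb.json .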
Step 4 Open influxdb.json and confirm that it has the configuration information needed to access influxdb.
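The exact schema of influxdb.json depends on the handler, but the settings to confirm are the InfluxDB host, port, database, and credentials. The following is a hypothetical sketch; every field name and value shown is a placeholder, not the confirmed format:

    cat /etc/sensu/conf.d/influxdb.json
    {
      "influxdb": {
        "host": "10.0.1.20",
        "port": 8086,
        "database": "cos_stats",
        "user": "admin",
        "password": "secret"
      }
    }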
Step 5 Try to ping the influxdb host to confirm that it can be reached.
Note It is normal in an HA environment for the host ping to return different IP addresses.
Step 6 Check the handlers directory (cd /etc/sensu/handlers) to see if metrics-cos-nodes-influxdb.js is present. If not, copy this file from another working master and place it in the handlers directory.
Step 7 Also check that the node_modules directory is present and contains the influx module. If not, copy this directory from another working master.
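The following sequence checks both the handler script and its Node.js dependency, and copies them from a working master if needed (the host name is a placeholder):

    cd /etc/sensu/handlers
    ls metrics-cos-nodes-influxdb.js
    ls node_modules | grep influx

    # If either is missing, copy it from a known working master
    scp root@working-master:/etc/sensu/handlers/metrics-cos-nodes-influxdb.js .
    scp -r root@working-master:/etc/sensu/handlers/node_modules .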
Step 8 Type systemctl status sensu-server to confirm that the Sensu server is running.
Step 9 Type tail -f /var/log/sensu/sensu-server.log to check the Sensu server logs.
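Both checks can be run together; a healthy master reports the service as active and shows regular check and handler activity in the log:

    systemctl status sensu-server              # should report active (running)
    tail -f /var/log/sensu/sensu-server.log    # watch for handler or connection errors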