This section contains additional reference material for further understanding the COS system, and information on performing commonly executed tasks and system maintenance.
The COS service model is shown in the Unified Modeling Language (UML) diagram below:
A COS operator can assume one of the following roles:
COS Release 3.18.1 and its content are managed through Cisco Virtualized Video Processing Controller (V2PC). The V2PC GUI has pages that allow monitoring and updating many aspects of the deployment, including COS nodes and clusters.
This section describes COS-related operations available from the V2PC GUI. For additional information, see the Cisco Virtualized Video Processing Controller User Guide for your V2PC release.
Step 1 Open a web browser and access the V2PC GUI at https://<v2pc-ip>:8443/.
Step 2 Log in to the V2PC GUI using the following credentials:
The V2PC GUI opens to the Dashboard page, System Statistics tab.
Figure A-2 V2PC GUI Dashboard, System Statistics Tab
The Dashboard section contains the following pages:
This Cisco Cloud Object Store (COS) section contains the following pages:
The following table identifies open network ports for COS nodes and the services that own these ports.
It may be necessary to shut down or reboot a COS node for conditions such as routine maintenance. You can shut down a COS node by placing it in Maintenance mode from either the command line or the V2PC GUI. You can also reboot a COS node from the CLI.
Note ● You cannot reboot a COS node from the V2PC GUI.
To reboot a COS node, execute the reboot command from a terminal console or remote shell. The system begins its shutdown sequence and then restarts.
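For example, from a terminal console or SSH session, as the root user on the node:

reboot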
To switch a COS node from Inservice to Maintenance mode or vice versa from the V2PC GUI:
Step 1 Open the GUI as described in Accessing the V2PC GUI and navigate to Cisco Cloud Object Store (COS) > COS Nodes.
Step 2 Locate the node in the COS Nodes list and click its Edit icon to open the Edit dialog for the node.
Step 3 In the Edit dialog, select Maintenance or Inservice as the new Admin State for the node.
Step 4 Click Save to save your changes and return to the COS Nodes page.
COS lets you decommission a node at the CServer level. Decommissioning tells CServer to copy the data objects of the node to other nodes in the cluster until the target number of mirror copies is reached. After the node is decommissioned, it can be removed from the cluster using either the V2PC GUI or the API.
Node decommissioning itself is currently a CLI-only operation. To decommission a node, run the script cserver-control.pl decommission, installed on the node at /opt/cisco/cos-aic-client/cserver-control.pl.
As decommissioning can take several hours, the CLI does not monitor the decommissioning process for completion. To check for completion, enter the command cserver-control.pl decommission --stats periodically until the response confirms that the operation is complete.
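For example, a typical command sequence from a root shell on the node might look like the following (repeat the --stats check periodically until the output reports completion):

/opt/cisco/cos-aic-client/cserver-control.pl decommission
/opt/cisco/cos-aic-client/cserver-control.pl decommission --stats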
After decommissioning is complete, you can safely remove the node using the GUI or the API. For instructions on removing a node from a cluster using the GUI, see Node Decommissioning and Removal. For API information, see the Cisco Cloud Object Storage Release 3.18.1 API Guide.
Note ● A node cannot be decommissioned after it has been removed from a cluster using the GUI or API. So, you must decommission a node before removing it.
When you remove a node from a multi-node cluster through the GUI, the node is first decommissioned from the Cassandra database cluster, and then the Cassandra service and CServer are shut down. If you shut down the node before the Cassandra-level decommissioning completes, the node may continue to be considered part of the Cassandra cluster and remain listed in the nodetool status output of the remaining nodes, but in down (DN) state. A node left in this state prevents you from adding new nodes to the cluster.
To avoid this issue, we recommend opening the COS AIC Client log before removing the node through the GUI. Inspect the log periodically to confirm that Cassandra decommissioning has completed before shutting down the node.
To inspect the log for node decommissioning from the Cassandra cluster:
Step 1 Use the Linux tail command to print new lines as they are added to the COS AIC Client log, and pipe the output to the Linux grep command to search for db-remove:
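For example, using the log file shown later in this procedure (substitute the date suffix of the current log file on your node):

[root@Colusa-4T-72 ~]# tail -f /arroyo/log/cos-aic-client.log.20160506 | grep db-remove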
Step 2 Remove the node using the GUI and inspect the log for db-remove:
Step 3 Inspect the log for Completed db-remove, which shows that the node has been removed from the Cassandra cluster:
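For example:

[root@Colusa-4T-72 ~]# tail -f /arroyo/log/cos-aic-client.log.20160506 | grep "Completed db-remove"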
Step 4 To verify that CServer has also been shut down, inspect the log using tail (or cat) followed by grep for cserverControl-shutdown:
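For example:

[root@Colusa-4T-72 ~]# cat /arroyo/log/cos-aic-client.log.20160506 | grep cserverControl-shutdown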
Step 5 To confirm completion of the removal process, inspect the log to ensure that no new messages are printed:
[root@Colusa-4T-72 ~]# tail -f /arroyo/log/cos-aic-client.log.20160506
Step 6 Run the command nodetool status cos on one of the remaining nodes in the cluster to confirm that the removed node is no longer listed as part of the cluster.
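For example, from a shell on any remaining node:

nodetool status cos

The management IP address of the removed node should no longer appear in the output.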
It may become necessary to reinstall a COS node in a cluster in certain situations, such as:
If necessary, reinstall a COS node as follows:
Step 1 Log in to the V2PC GUI as described in Accessing the V2PC GUI.
Step 2 From the V2PC GUI navigation menu, choose Cisco Cloud Object Store (COS) > COS Nodes.
Step 3 Locate the node to be removed and place the node in Maintenance mode as described in Switching Node Admin State from the GUI.
Step 4 Decommission the node and remove it from the cluster as described in Node Decommissioning and Removal.
Step 5 Is Linux still running on the node just removed?
Step 6 Perform a fresh installation of the node using the full COS ISO image as described in the appropriate section of Deploying COS.
Step 7 On the Cisco Cloud Object Store (COS) > COS Nodes page of the V2PC GUI, locate the node and add it to the desired cluster.
Note ● If you must reinstall a COS node immediately after removing it from a cluster, first verify that the node removal has completed in the associated Cassandra database cluster. To check the status of the Cassandra database cluster, execute nodetool status cos on the console of a remaining COS node. When removal has completed, the list will no longer contain the management IP address of the removed node.
There are four primary system services that provide COS functionality on a COS node: cassandra, cosd, cos_aicc, and cserver. These services can be manipulated using standard Linux system service tools.
To prevent these services from starting automatically on a COS node boot, execute the following commands as the root user from a shell on that node:
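For example, assuming the services are registered as systemd units with these names (verify the actual unit or init-script names on your node; older releases may use chkconfig instead):

systemctl disable cassandra
systemctl disable cosd
systemctl disable cos_aicc
systemctl disable cserver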
To enable automatic service loading on node boot, execute the following commands:
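Continuing the same assumption about unit names:

systemctl enable cassandra
systemctl enable cosd
systemctl enable cos_aicc
systemctl enable cserver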
To view the current state of service loading, execute the following command:
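For example (same assumption about unit names):

systemctl is-enabled cassandra cosd cos_aicc cserver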
To manually start the services, execute the following commands:
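For example (same assumption about unit names; confirm the required start order for your release before starting services by hand):

systemctl start cassandra
systemctl start cosd
systemctl start cos_aicc
systemctl start cserver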
This section describes the response of the COS AIC server to changes in the states of disks, interfaces, and services on COS nodes. The COS AIC client and the Service Monitor convey the changes in state by sending appropriate events to the COS AIC server.
If a change is critical and indicates that the node cannot service requests, the AIC server ensures that the node interfaces are not part of the DNS so that service requests are not addressed to the node.
When the COS AIC server receives a status from the COS AIC client indicating the failure of one or more disks on a COS node, it will process that status as follows:
If the COS AIC server receives a status from the COS AIC client indicating that one or more of the following COS services is down, the Service Status for the node is set to Critical, the node is marked as down, an appropriate alarm is raised, and its service interfaces are removed from the DNS.
If the COS AIC server receives a status from the COS AIC client indicating that one or more of the following COS services is down, the Service Status for the node is set to Warning and an appropriate alarm is raised.
When the COS AIC server receives a status from the COS AIC client indicating that one or more service interfaces are not functional, those interfaces are removed from the DNS.
Additionally, if more than 50% of the node's interfaces are reported as down, the node is considered non-operational and all of its interfaces are removed from DNS. An appropriate alarm is also raised, and the Interface Status is set to either Critical (50% or more of interfaces down) or Warning (less than 50% of interfaces down).
The COS AIC server expects to receive a heartbeat message from the COS AIC client running on each node every 2 seconds. Abnormalities are processed as follows:
For instructions on replacing hard drives on the platforms that COS 3.18.1 supports, see the appropriate hardware installation guide.
Note Before replacing a hard drive that is not listed as sick, you must first logically remove the drive to stop any data transfer in progress and spin down the drive. To do this, execute the command cddm -r n, where n is the drive number. Do not proceed until a response confirms that it is safe to remove the drive.
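For example, to logically remove drive 3 (a hypothetical drive number) before physically replacing it:

cddm -r 3

Wait for the response confirming that it is safe to remove the drive before proceeding.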
Vds_log_rotate is a script-based version of the Linux logrotate tool. The script manages log files by rotating, compressing, and moving them to an archive storage area, and deletes files in the archive storage area based on the configured settings. The script is provided in the vds_logrotate rpm and runs as a cron job.
Rotate moves and compresses log files from /arroyo/log to /arroyo/log/archive, based on the configuration parameters.
Trim deletes (trims) .tgz files in /arroyo/log/archive, based on the configuration parameters. These files are deleted when configured limits are exceeded, as in the following examples:
1. The /arroyo/log partition is 20G. Because this partition is smaller than 30G, rotate$trim_max_size defaults to 50%, so the archived files are deleted when the used space exceeds 10G (.50 * 20G).
2. The /arroyo/log partition is 98G. Because this partition is larger than 30G, rotate$trim_max_size defaults to 88%, so the .tgz files are deleted until the used space is <= 86.24G (.88 * 98G).
The following are important configuration settings, which can be changed to decrease the size of the accumulated log files.
– Number of days that the log files stay in the /arroyo/log subdirectory before being moved to /arroyo/log/archive.
– Number of days the *.tgz files stay in the /arroyo/log/archive subdirectory. This time is also affected by the rotate$trim_max_size and the rotate$trim_hourly_max_size configuration settings.
– Default value: 50 if the /arroyo/log size is < 30GB, 88 if it is larger. Archived log files are deleted if this value is exceeded.
– Default value: 75 if the /arroyo/log size is < 30GB, 90 if it is larger. Archived log files are deleted if this value is exceeded.