Cisco MSX Security Certificates

MSX uses multiple certificates to secure a diverse set of databases and other system infrastructure. These certificates have a limited lifespan and must be renewed periodically. Renewal used to be a time-consuming manual process in which a certificate could easily be missed, and an expired certificate causes system downtime. To improve the situation, the MSX team has created functionality that automates the management and replacement of a large subset of the existing certificates. This chapter describes that functionality.

Security Certificate Categories

MSX certificates can be grouped into five broad categories:

  1. Bootstrap Certificates: Two bootstrap certificates (one for Consul and one for Vault) are created by the installer and must be managed by MSX or customer-facing operations teams.

  2. Kubernetes-managed Certificates: These certificates are created by kubeadm at install time and should be managed using that tool afterwards. The certificates are automatically renewed during an MSX upgrade.

  3. Vault-managed Certificates: These certificates are checked out from the Vault PKI and automatically renewed by a Vault-agent sidecar, a host process, or internal code. Vault-managed certificates include:

    1. MinIO

    2. etcd

    3. Docker Registry

    4. Kubelet

    5. iPnP and NGINX

    6. CockroachDB

    7. Action Orchestrator

    8. CSR Hub and CSR VPN

    9. Cassandra

    10. Kafka

    11. Calico*

    12. Zookeeper*

    13. Redis*

    14. Elasticsearch*

    15. Kibana*

    * These certificates currently do not have an MSX mechanism for rotation.

  4. Certificate Authority (CA) Rotation: These certificates have a 5-year lifespan. An implementation to rotate these certificates is being considered for future development.

  5. JWT-signing Keypair: JSON Web Tokens (JWTs) are a secure way of transmitting information between parties as JSON objects. The keypair that signs them is managed by MSX or customer-facing operations teams.

Managing MSX Security Certificates

This section describes how to use the playbooks that are available for certificate management.

Bootstrap

Bootstrap services and their certificates are not managed by MSX. Operations teams must monitor the status of these certificates and rotate them before they expire. To rotate bootstrap certificates, use the update-certs-bootstrap.yml playbook, which reissues the Vault and Consul certificates and restarts both services. The restart can cause brief outages in Consul and Vault, so perform this update only during a scheduled maintenance window.

To rotate the bootstrap certificates:

  1. Change to the Ansible directory.

    cd /msx-4.0.0/ansible
  2. Set the environment variable for the Ansible Vault password:

    export ANSIBLE_VAULT_PASSWORD_FILE=<vault_pwd_path>

    Where <vault_pwd_path>=/msx-4.0.0/ansible/vault/password.txt

  3. Rotate the bootstrap certificates.

    ansible-playbook update-certs-bootstrap.yml
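
The same three steps (change directory, export the Ansible Vault password file, run a playbook) recur for the JWT and Action Orchestrator playbooks described later in this chapter. As a rough sketch, they could be wrapped in a small shell function. The function name run_cert_playbook and the MSX_ANSIBLE_DIR and DRY_RUN variables are hypothetical and are not part of MSX; only the directory, the password file path, and the playbook names come from this chapter.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the three manual steps for one certificate playbook.
# MSX_ANSIBLE_DIR and DRY_RUN are illustrative variables, not MSX settings.
run_cert_playbook() {
  local playbook="$1"
  local ansible_dir="${MSX_ANSIBLE_DIR:-/msx-4.0.0/ansible}"
  local pwd_file="${ANSIBLE_VAULT_PASSWORD_FILE:-$ansible_dir/vault/password.txt}"

  # With DRY_RUN=1, print the commands instead of running them.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "cd $ansible_dir"
    echo "export ANSIBLE_VAULT_PASSWORD_FILE=$pwd_file"
    echo "ansible-playbook $playbook"
    return 0
  fi

  # Stop before touching any service if the password file cannot be read.
  if [ ! -r "$pwd_file" ]; then
    echo "cannot read Ansible Vault password file: $pwd_file" >&2
    return 1
  fi

  # Run in a subshell so the caller's working directory is unchanged.
  ( cd "$ansible_dir" && ANSIBLE_VAULT_PASSWORD_FILE="$pwd_file" ansible-playbook "$playbook" )
}
```

For example, DRY_RUN=1 run_cert_playbook update-certs-bootstrap.yml prints the three commands without restarting anything, which can be useful when preparing a maintenance window.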

Kubernetes Managed

Certificates that are managed by Kubernetes are automatically rotated during an MSX upgrade.

Vault Managed

Vault-managed certificates are rotated automatically as part of an MSX upgrade. The required service restarts can cause brief outages, so upgrade only during a scheduled maintenance window.

There are two types of vault-managed services:

  1. Fully managed services, which are restarted by an agent and require no operator intervention. These include: MinIO, Docker Registry, kubelet, etcd, and CockroachDB.

  2. Operator-managed services, which need to be manually restarted by operators during the upgrade process. The services are Cassandra and Kafka.

Microservices

The JWT signing key used for microservices can be reset using the update-certs-jwt.yml playbook. Running this playbook will cause a microservice restart and result in a brief outage until all services are using the same key. You should schedule this update during a maintenance window.

To reset the JWT signing key:

  1. Change to the Ansible directory.

    cd /msx-4.0.0/ansible
  2. Set the environment variable for the Ansible Vault password:

    export ANSIBLE_VAULT_PASSWORD_FILE=<vault_pwd_path>

    Where <vault_pwd_path>=/msx-4.0.0/ansible/vault/password.txt
  3. Reset the JWT signing key.

    ansible-playbook update-certs-jwt.yml

Action Orchestrator

Action Orchestrator (AO) certificates can be reset using the update-certs-ao.yml playbook. Running this playbook causes a brief AO outage because of the required restarts. You should schedule this update during a maintenance window.

To reset the Action Orchestrator certificates:

  1. Change to the Ansible directory.

    cd /msx-4.0.0/ansible
  2. Set the environment variable for the Ansible Vault password:

    export ANSIBLE_VAULT_PASSWORD_FILE=<vault_pwd_path>

    Where <vault_pwd_path>=/msx-4.0.0/ansible/vault/password.txt
  3. Reset the Action Orchestrator certificates.

    ansible-playbook update-certs-ao.yml

Unmanaged

Certificates on the edge, such as those for NGINX and iPnP, are unmanaged. These services use a customer-provided certificate, so the Operations team and the customer must coordinate to reset them.

Viewing Certificate Expirations

To view the expiry dates of kubeadm-managed certificates, certificate files on the Kubernetes master and worker nodes, AO certificates on the installer, and Vault-managed certificates, run the following playbook:

    ansible-playbook checks/check-certificate-expiry.yml

The following example shows the command output (truncated):

...
Vault Managed Certificates on Kubernetes master
kube-master-msx2-yvr-1 /etc/ssl/vms-certs/etcd.pem  Expires: Mar  9 00:19:29 2021 GMT - WARNING !!! Certificate expires in 28 days
kube-master-msx2-yvr-1 /etc/etcd/etcd.pem  Expires: Mar  9 00:19:29 2021 GMT - WARNING !!! Certificate expires in 28 days
...
Certificates on Kubernetes master
kube-master-msx2-yvr-1 /etc/ssl/vms-certs/consul.pem  Expires: Jan 30 20:03:00 2022 GMT - Expires in 355 days
kube-master-msx2-yvr-1 /etc/consul/certs/consul.pem  Expires: Jan 30 20:03:00 2022 GMT - Expires in 355 days
...
Certificates on Kubernetes worker nodes
kube-node-msx2-yvr-1 /etc/consul/certs/consul.pem  Expires: Jan 30 20:03:00 2022 GMT - Expires in 355 days
kube-node-msx2-yvr-2 /etc/consul/certs/consul.pem  Expires: Jan 30 20:03:00 2022 GMT - Expires in 355 days
...
Certificates on Installer container
/etc/ssl/vms-certs/ad-aws.pem  Expires: Jan 31 01:08:00 2022 GMT - Expires in 356 days
/etc/ssl/vms-certs/ad-ccs.pem  Expires: Jan 31 01:08:00 2022 GMT - Expires in 356 days
... 
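
On a large deployment the report can be long, so it may help to filter it for certificates that are close to expiry. The following sketch assumes that every certificate line contains "Expires:" followed later by "expires in N days" or "Expires in N days", as in the sample output above. The function name expiring_certs and the default threshold of 30 days are illustrative choices, not MSX features.

```shell
#!/usr/bin/env bash
# Sketch only: read the expiry report on stdin and print the host/path and
# remaining days for each certificate that expires in fewer than N days.
# Usage: expiring_certs [threshold_days]   (default threshold: 30)
expiring_certs() {
  local threshold="${1:-30}"
  awk -v limit="$threshold" '
    match($0, /[Ee]xpires in [0-9]+ days/) {
      # Extract the number from the matched "expires in N days" phrase.
      phrase = substr($0, RSTART, RLENGTH)
      split(phrase, parts, " ")
      days = parts[3] + 0
      if (days < limit) {
        # Keep everything before "Expires:" (host and certificate path).
        idx = index($0, "Expires:")
        where = substr($0, 1, idx - 1)
        sub(/[ \t]+$/, "", where)
        print where ": " days " days"
      }
    }'
}
```

A saved copy of the report could then be filtered with expiring_certs 60 < report.txt, where report.txt is a hypothetical file name.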
 

To choose which certificates are displayed, edit the group_vars/all/cert_variables.yml file. Add the filename and location of each certificate under the appropriate master, worker, or installer section.

The following example shows a sample cert_variables.yml file (abbreviated).

...
--- 
# Lists of Certificates to be scanned when running checks/check-certificate-expiry.yml 
# Add to or remove from these certificate lists as requirements change 
# Any certificates in the list that do not exist will be ignored 
#  
cert_vars: 
# Vault-agent managed certificates on kubernetes masters
  cert_master_vault: 
    - /etc/ssl/vms-certs/etcd.pem 
    - /etc/etcd/etcd.pem 
… 
# certificates on kubernetes masters 
  cert_master: 
    - /etc/ssl/vms-certs/consul.pem 
    - /etc/consul/certs/consul.pem 

# certificates on installer container 
  cert_installer: 
    - /etc/ssl/vms-certs/ad-aws.pem 
    - /etc/ssl/vms-certs/ad-ccs.pem 
… 
# certificates on kubernetes workers 
  cert_worker: 
    - /etc/consul/certs/consul.pem 
...
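
Because listed certificates that do not exist are silently ignored, it can be worth confirming which paths are actually configured under a given key after editing the file. The sketch below assumes the simple layout shown in the sample (a key such as cert_master: on its own line, followed by "- /path" entries). The function name list_cert_paths is hypothetical, and the awk parsing is a convenience that does not handle general YAML.

```shell
#!/usr/bin/env bash
# Sketch only: list the certificate paths configured under one key.
# Usage: list_cert_paths <key> <path-to-cert_variables.yml>
list_cert_paths() {
  local key="$1" file="$2"
  awk -v key="$key" '
    # A line such as "  cert_master:" starts a new list; note if it is the one requested.
    /^[ \t]*[A-Za-z_]+:[ \t]*$/ {
      name = $1
      sub(/:$/, "", name)
      in_list = (name == key)
      next
    }
    # Inside the requested list, print each "- /path" entry without the dash.
    in_list && /^[ \t]*-[ \t]+/ {
      sub(/^[ \t]*-[ \t]+/, "")
      sub(/[ \t]+$/, "")
      print
    }' "$file"
}
```

For example, list_cert_paths cert_worker group_vars/all/cert_variables.yml prints one path per line, which can then be checked for existence on a worker node.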

Vault Managed Certificates

A vault-agent exists on all hosts in the system to manage certificates for two classifications of services:

  1. Services fully managed by Vault are MinIO, Kubelet, and etcd.

  2. Services that have their certificates managed by Vault, but that are not automatically restarted due to the risk of system disruptions. These certificates will be picked up during an upgrade or the Operations team can schedule a restart window at their convenience. The services are Cassandra, Kafka, and Calico.

The time to live of Vault-managed certificates is 8760 hours (365 days). It is configured with the vault_cert_expiry_hours variable in the group_vars/all/deployment_vars.yml file. You can update the interval by running the upgrade-pki.yml playbook. This operation restarts MinIO, etcd, and kubelet, so perform it during a maintenance window.
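
Because the variable is expressed in hours, a lifetime chosen in days has to be multiplied by 24 (365 x 24 = 8760). The following sketch shows that arithmetic and a way to read the current value back from the file. Both function names are hypothetical, and the second function assumes the variable appears on a single line of the form "vault_cert_expiry_hours: <number>".

```shell
#!/usr/bin/env bash
# Sketch only: convert a certificate lifetime in days to the hours value
# used by vault_cert_expiry_hours.
days_to_hours() {
  echo $(( $1 * 24 ))
}

# Print the value currently set in a deployment_vars.yml file, if any.
# Assumes a plain "vault_cert_expiry_hours: <number>" line.
current_expiry_hours() {
  sed -n 's/^[[:space:]]*vault_cert_expiry_hours:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1"
}
```

For example, days_to_hours 180 gives the value to set for a 180-day lifetime.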

Adding External Certificates to MSX

If you have an SD-WAN deployment with vManage connected, you must copy your external certificates and import them into the centralized MSX keystore. If you do not perform this procedure, then vManage will not function.

To add your certificates:

  1. Change to the Ansible directory.

    cd /msx-4.0.0/ansible
  2. Set the environment variable for the Ansible Vault password:

    export ANSIBLE_VAULT_PASSWORD_FILE=<vault_pwd_path>

    Where:

    <vault_pwd_path> = /msx-4.0.0/ansible/vault/password.txt
  3. Create the folder for the external certificates.

    mkdir external_certs
  4. Copy your external certificates into the folder.

    cp <path>/custom.pem /msx-4.0.0/ansible/external_certs
  5. Run this playbook as many times as necessary to import each certificate into the MSX keystore.

    ansible-playbook add-external-ca.yml -e "pem_file=<custom.pem> alias=<custom_alias>"

    Where:

    pem_file is the name of the external security certificate file.

    alias is a user-friendly name under which the certificate will be stored in the keystore. Each alias must be unique.

  6. Restart the Viptelabeat service.

    ansible -m shell -a "kubectl rollout restart statefulset viptelabeat -n vms" 'kube_master[0]'
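
When several certificates need to be imported, step 5 has to be repeated once per file with a unique alias each time. As a sketch, the commands could be generated from the contents of the external_certs folder, using each file name without its .pem extension as the alias. The function name print_import_commands and this alias convention are assumptions for illustration. The function only prints the commands so that they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Sketch only: print one add-external-ca.yml command per .pem file in a folder.
# The alias is the file name without ".pem", which is unique within one folder.
# Usage: print_import_commands [folder]   (default folder: external_certs)
print_import_commands() {
  local dir="${1:-external_certs}" pem name
  for pem in "$dir"/*.pem; do
    # Skip the unexpanded pattern when the folder holds no .pem files.
    [ -e "$pem" ] || continue
    name="$(basename "$pem")"
    printf 'ansible-playbook add-external-ca.yml -e "pem_file=%s alias=%s"\n' \
      "$name" "${name%.pem}"
  done
}
```

Run from /msx-4.0.0/ansible, print_import_commands lists the commands for every certificate copied in step 4. The Viptelabeat restart in step 6 is still needed once after all imports.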