Post Installation Tasks and Details for Software Agents

(Manual Installations Only) Update the User Configuration File

The following procedure is required only for installations involving all of the following:

  • Secure Workload SaaS, or on-premises clusters with multiple tenants (on-premises clusters that use only the default tenant do NOT need this procedure)

  • Manual installation

  • Linux or Windows platform

Agents require a cluster activation key to register with the Secure Workload cluster. Additionally, they might need an HTTPS proxy to reach the cluster.


Note


In a Windows environment, you do not need to configure user.cfg manually if the activationkey and proxy options are used during manual installation.


Before installation, configure the required variables in the user configuration file:

Procedure


Step 1

To retrieve your activation key, navigate to Manage > Agents, click the Installer tab, click Manual Install using classic packaged installers, then click Agent Activation Key.

Step 2

Open the user.cfg file in the Secure Workload Agent installation folder (for example, /usr/local/tet on Linux or C:\Program Files\Cisco Tetration on Windows). The file contains a list of variables in the form of "key=value", one on each line.

Step 3

Add the activation key to the ACTIVATION_KEY variable. Example: ACTIVATION_KEY=7752163c635ef62e6568e9e852d07bd21bfd60d0

Step 4

If the agent requires an HTTPS proxy, add the http protocol proxy server and port using the HTTPS_PROXY variable. Example: HTTPS_PROXY=http://proxy.my-company.com:80
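
The edits in Steps 3 and 4 amount to setting two key=value lines in user.cfg. The following Python sketch shows one way to script that change on text read from the file. It is only an illustration: the function name set_cfg_values is hypothetical, and the sample values are the examples from the steps above. Only the ACTIVATION_KEY and HTTPS_PROXY variable names and the key=value format come from this procedure.

```python
def set_cfg_values(text, updates):
    """Return user.cfg text with each key in `updates` set to its value.

    Existing "key=value" lines are replaced in place; missing keys are
    appended. Lines that are not being updated are left untouched.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Append any keys that were not already present in the file.
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Example values copied from the steps above; replace with your own.
    sample = "ACTIVATION_KEY=\n"
    print(set_cfg_values(sample, {
        "ACTIVATION_KEY": "7752163c635ef62e6568e9e852d07bd21bfd60d0",
        "HTTPS_PROXY": "http://proxy.my-company.com:80",
    }), end="")
```

To apply it, read user.cfg from the installation folder, pass its contents through the function, and write the result back with root or administrative privileges.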


Other Agent-Like Tools

AnyConnect Agents

No Secure Workload agent is required for platforms supported by Cisco AnyConnect Secure Mobility agent with Network Visibility Module (NVM). AnyConnect connector registers these agents and exports flow observations, inventories, and labels to Secure Workload. For more information, see AnyConnect Connector.

For Windows, Mac, or Linux platforms, see Cisco AnyConnect Secure Mobility Client Data Sheet.

ISE Agents

A Secure Workload agent on the endpoint is not required for endpoints registered with Cisco Identity Service Engine (ISE). ISE connector collects metadata about endpoints from ISE through pxGrid service on ISE appliance. It registers the endpoints as ISE agents on Secure Workload and pushes labels for the inventories on these endpoints. For more information, see ISE Connector.

SPAN Agents

SPAN agents work with the ERSPAN connector. For more information, see ERSPAN Connector.

Third-Party and Additional Cisco Products

  • For integrations using external orchestrators configured in Secure Workload, see External Orchestrators in Secure Workload.

  • For integrations using connectors configured in Secure Workload, see What are Connectors.

Connectivity Information

In general, when the agent is installed on the workload, it makes several network connections to the back-end services hosted on the Secure Workload cluster. The number of connections will vary depending on the agent type and its functions.

The following table lists the permanent connections made by each agent type.

Table 1. Agent Connectivity

Agent type                 Config server      Collectors         Enforcement backend
visibility (on-premises)   CFG-SERVER-IP:443  COLLECTOR-IP:5640  N/A
visibility (SaaS)          CFG-SERVER-IP:443  COLLECTOR-IP:443   N/A
enforcement (on-premises)  CFG-SERVER-IP:443  COLLECTOR-IP:5640  ENFORCER-IP:5660
enforcement (SaaS)         CFG-SERVER-IP:443  COLLECTOR-IP:443   ENFORCER-IP:443
docker images              CFG-SERVER-IP:443  N/A                N/A

Legends:

  • CFG-SERVER-IP is the IP address of the config server.

  • COLLECTOR-IP is the IP address of the collector. Deep visibility and enforcement agents connect to all available collectors.

  • ENFORCER-IP is the IP address of the enforcement endpoint. The enforcement agent connects to only one of the available endpoints.

  • For Kubernetes/OpenShift agent deployments, the installation script does not contain the agent software. Instead, every Kubernetes/OpenShift node pulls Docker images containing the agent software from the Secure Workload cluster. These connections are established by the image fetch component of the container runtime and are directed at CFG-SERVER-IP:443.

To find the config server IP and collector IP, navigate to Platform > Cluster Configuration.

  • Sensor VIP is the config server IP: the IP address that has been set up for the config server in this cluster.

  • External IPs are the collector and enforcer IPs: if this field is populated, external cluster IP addresses are assigned only from the addresses in this list that are part of the external network.


Note


  • The Secure Workload agent always acts as a client to initiate the connections to the services hosted within the cluster, and never opens a connection as a server.

  • Agents, for which upgrade is supported, periodically perform HTTPS requests (port 443) to the cluster sensor VIP to query for available packages.

  • An agent can be located behind a NAT server.


Connections to the cluster might be denied if the workload is behind a firewall, or if the host firewall service is enabled. In such cases, administrators must create appropriate firewall policies to allow the connections.
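
When you write firewall policies, it can help to list exactly which destinations a given agent needs. The following Python sketch encodes the ports from Table 1 and checks whether an outbound TCP connection succeeds. The names AGENT_PORTS, required_endpoints, and can_connect are hypothetical and chosen for illustration, and the IP addresses must come from your own cluster configuration. A plain TCP check does not go through an HTTPS proxy, so treat the result only as a first indication.

```python
import socket

# Ports per Table 1, keyed by (agent type, deployment).
# Each value maps a service role to the TCP port the agent connects to.
AGENT_PORTS = {
    ("visibility", "on-premises"): {"config": 443, "collector": 5640},
    ("visibility", "saas"): {"config": 443, "collector": 443},
    ("enforcement", "on-premises"): {"config": 443, "collector": 5640, "enforcer": 5660},
    ("enforcement", "saas"): {"config": 443, "collector": 443, "enforcer": 443},
}


def required_endpoints(agent_type, deployment, addresses):
    """List the (host, port) pairs an agent must reach.

    `addresses` maps a role ("config", "collector", "enforcer") to a list
    of IP addresses, for example as shown under Cluster Configuration.
    """
    ports = AGENT_PORTS[(agent_type, deployment)]
    return [(ip, port) for role, port in ports.items() for ip in addresses.get(role, [])]


def can_connect(host, port, timeout=5.0):
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling can_connect(host, port) for each pair returned by required_endpoints shows which destinations are blocked from the workload. The agent always acts as the client, so only outbound rules need to be tested.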

Security Exclusions

Software agents continuously interact with the host operating system during normal operation. These interactions may cause other security applications installed on the host, such as antivirus software and security agents, to raise alarms or block the actions of Secure Workload agents. Therefore, to ensure that agents install successfully and function correctly, you must configure the necessary security exclusions on the security applications that monitor the host.

Table 2. Security Exclusions for Agent Directories

Host OS   Directories
AIX       /opt/cisco/tetration
Linux     /usr/local/tet or /opt/cisco/tetration or <user chosen inst dir>
          /var/opt/cisco/secure-workload
Windows   C:\Program Files\Cisco Tetration
          C:\ProgramData\Cisco Tetration
Solaris   /opt/cisco/secure-workload

Table 3. Security Exclusions for Agent Processes

Host OS   Processes
AIX       csw-agent, tet-sensor, tet-enforcer, tet-main
Linux     csw-agent, tet-sensor, tet-enforcer, tet-main, enforcer
Windows   CswEngine.exe, TetEnfC.exe
Solaris   csw-agent, tet-sensor, tet-enforcer, tet-main

Table 4. Security Exclusions for Agent Actions

Host OS        Actions
AIX            Access /dev/bpf*, /dev/ipl, /dev/kmem
               Invokes cfg_ipf, ipf, ippool, ipfstat, lslpp, lsfilt, prtconf, uname, uncompress, oslevel
               Scan /proc
               Modifies /etc/security/audit/config and /etc/security/audit/objects and creates /etc/security/audit/config.backup and /etc/security/audit/objects.backup when the Forensics feature is enabled.
Linux          Invokes ip[6]tables-save, ip[6]tables-restore, rpm/dpkg, uname, unzip
               Scan /proc, open netlink sockets
Windows        Access registry
               Register to firewall events
               Invokes c:\windows\system32\netsh.exe
Solaris 11.4   Invokes pkg, ps, smbios (x86 only), uname, unzip
               Scan /proc
               Creates /etc/audit/rules.d/taau.rules when Forensics is enabled
Solaris 10     pkgrm, pkgchk, pkgadd, ps
               Scan /proc, prtconf, virtinfo (sparc only), svcadm, pfctl, uname, unzip
               Creates /etc/audit/rules.d/taau.rules when Forensics is enabled

Table 5. Security Exclusions for Agent Script or Binary Executions

Host OS   Invoked scripts/binaries
AIX       -
Linux     -
Windows   dmidecode.exe, npcap-installer.exe, sensortools.exe, signtool.exe
Solaris   -

Service Management of Agents

Software agents are deployed as a service on all supported platforms. This section describes methods to manage the services for various functions and platforms.


Note


Unless specified otherwise, all the commands in this section require root privileges on Linux or Unix, or administrative privileges on Windows to run.


View Detailed Agent Status in the Workload Profile

Procedure


Step 1

Follow the steps above to check Agent status.

Step 2

On the Enforcement Agents page, click Agent OS Distribution. Select an operating system and click the filter icon in the top-right corner of the box.

Step 3

On the Software Agent List page, agents with the selected operating system distribution are listed.

Step 4

Click an agent to view the agent details, and then click the IP address. On the Workload Profile page, you can view details of the Host Profile, Agent Profile, and agent-specific details, such as Bandwidth, Long-lived Processes, Packages, Process Snapshot, Configuration, Interfaces, Stats, Policies, Container Policies, and so on.

Step 5

Click the Config tab to see the configuration on the end-host.

Step 6

Click the Policies tab to see the enforced policies on the end-host.

Figure 1. Workload Profile - Config
Figure 2. Workload Profile - Policies

Note

 

Fetch All Stats, which provides statistics for individual policies, is not supported on Windows agent hosts.


Generate Agent Token

In the agent configuration profile, you can enable service protection to prevent Windows agent services from being uninstalled, disabled, or stopped. To make changes to the agents, you can disable this protection in the agent configuration profile. However, if you are unable to disable the protection because of connectivity issues, you can generate an agent token to disable the service protection on the workload. The token is valid for 15 minutes.

Supported roles to generate and retrieve agent tokens:

  • Site administrators: For clusters or tenants.

  • Customer support: For tenants.

  • Agent installer: For agent-specific tokens.


Note


You can generate time-based agent tokens only for Windows OS-based software agents.


To generate and download agent tokens, perform these steps:

Procedure


Step 1

In the navigation pane, click Manage > Workloads > Agents > Agent List.

Based on your requirement, you can choose one of the agent token types: cluster, tenant, or agent-specific. For the agent-specific token, go to Step 5.

Step 2

Click the menu icon and choose Agent Token.

Note

 

The Agent Token option is only visible for site administrators or customer support user roles.

Step 3

Select a token type:

  • Token For Cluster—This option is visible only to site administrators and the token is applicable for all the agents.
  • Token For Tenant—Applicable for the agents under a selected tenant.

Step 4

To download the token key, click Download Token.

Step 5

To view and download token key details of a specific agent:

  1. Go to the Agent List tab and click the required agent. Under Agent Details > Agent Token, you can view the token key and expiry details of the token.

  2. To download the agent-specific token, click Download Token.


What to do next

After downloading the agent token file, run the following command on the agent to disable service protection: "C:\Program Files\Cisco Tetration\TetSen.exe" -unprotect <token>, where <token> is the downloaded agent token.

After the service protection is disabled using a token, it may be automatically re-enabled when the service restarts and connects to the Secure Workload cluster.

Disable Enforcement on Workload

As a tenant owner, you can selectively disable enforcement on workloads while troubleshooting is in progress. Disabling the enforcement overrides the configuration for the workload, and you can revert to the initial enforcement state after the troubleshooting is complete.

To disable enforcement on workloads, perform the following procedure.

Procedure


Step 1

From the navigation pane, choose Manage > Workloads > Agents > Agent List.

Step 2

Click Enforcement adjacent to the agent you want to disable the enforcement for.

Figure 3. Disable Enforcement on Workload

Step 3

To disable the enforcement on the agent, under Agent Controls, click Disable Enforcement.

Step 4

Confirm the action in the dialog box that is displayed.

Step 5

(Optional) To re-enable enforcement on an agent, under Agent Controls, click Enable Enforcement and confirm the action in the dialog box that is displayed.


Host IP Address Change When Enforcement is Enabled

As a site admin, if you change the IP address of a host when enforcement is already enabled on an agent, there might be an impact if the host IP address appears in the host firewall rules and the catch-all action is set to deny.

In this scenario, perform the following steps to change the host IP address:

Procedure


Step 1

On the Secure Workload UI, create a new Agent Configuration Profile with enforcement disabled.

Step 2

Create an intent with a list of the hosts that require IP address changes, including their old and new addresses.

Step 3

Apply the newly created Agent Config Profile to the intent and save the intent.

Step 4

Select the hosts whose IP addresses you want to change and ensure that these hosts have enforcement disabled.

Step 5

Change the IP address of the selected hosts.

Step 6

On the Secure Workload UI, update the filters in the scope by including the new IP addresses of these hosts.

Step 7

Under the Interfaces > Agent Workload Profile tab, verify that the IP address has changed to the new IP address.

Step 8

Under the Policies tab, ensure that the policies are generated with the new IP address.

Step 9

Remove the Intent or Profile created earlier.

Step 10

Click Enable Enforcement to enable the enforcement in the scope for the earlier Agent Config Profile that had enforcement disabled.


Frequently Asked Questions

This section lists issues that you might face while deploying and operating the software agents.

General

Log files: Log files are stored in the <install-location>/logs or <install-location>/log folder. The Secure Workload services monitor and rotate the log files.

Agent deployment

Linux

Q: What do I do when the command
rpm -Uvh tet-sensor-1.101.2-1.el6-dev.x86_64.rpm
fails to install agents and displays the following error:

   error: cannot create transaction lock on /var/lib/rpm/.rpm.lock (Permission denied).

A: If you do not have the right privileges to install the agents, either switch to root or use sudo to install the agents.

Q: What happens when you run “sudo rpm -Uvh tet-sensor-1.0.0-121.1b1bb546.el6-dev.x86_64.rpm” and encounter the following error:


   Preparing...              ########################################### [100%]
   which: no lsb_release in (/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin)
   error: %pre(tet-sensor-site-1.0.0-121.1b1bb546.x86_64) scriptlet failed, exit status 1
   error:   install: %pre scriptlet failed (2), skipping tet-sensor-site-1.0.0-121.1b1bb546

A: The system does not satisfy the requirements to install the agents. In this particular case, the lsb_release tool is not installed.

For more information, see the Software Agents Deployment Label section and install the required dependencies.

Q: What happens when you run “sudo rpm -Uvh tet-sensor-1.0.0-121.1b1bb546.el6-dev.x86_64.rpm” and encounter the following error:


  Unsupported OS openSUSE project
  error: %pre(tet-sensor-1.101.1-1.x86_64) scriptlet failed, exit status 1
  error: tet-sensor-1.101.1-1.x86_64: install failed
  warning: %post(tet-sensor-site-1.101.1-1.x86_64) scriptlet failed, exit status 1

A: Your OS is not supported to run software agents (in this particular case, “openSUSE project” is a non-supported platform).

For more information, see the Software Agents Deployment Label section.

Q: After I install all the dependencies and run the installation with proper privileges and no errors, how do I know that the agent installation was successful?

A: To verify the installation, run the following command:


$ ps -ef | grep -e csw-agent -e tet-
root     14158     1  0 Apr03 ?        00:00:00 csw-agent
root     14160 14158  0 Apr03 ?        00:00:00 csw-agent watch_files
root     14161 14158  0 Apr03 ?        00:00:03 csw-agent check_conf
root     14162 14158  0 Apr03 ?        00:01:03 tet-sensor -f conf/.sensor_config
root     14163 14158  0 Apr03 ?        00:02:38 tet-main --sensoridfile=./sensor_id
root     14164 14158  0 Apr03 ?        00:00:22 tet-enforcer --logtostderr
tet-sen+ 14173 14164  0 Apr03 ?        00:00:21 tet-enforcer --logtostderr
tet-sen+ 14192 14162  0 Apr03 ?        00:07:23 tet-sensor -f conf/.sensor_config

You should see three entries of csw-agent and at least two entries of tet-sensor. If the services are not running, check that the following directories exist; if they do not, the installation has failed.

  • /usr/local/tet for most Linux distributions

  • /opt/cisco/tetration for AIX, Ubuntu

  • /opt/cisco/secure-workload for Solaris, Debian
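
The process check above can also be automated. The following Python sketch counts csw-agent and tet-sensor entries in captured ps -ef output and applies the rule stated above. The function names are hypothetical, and the sketch assumes the standard eight-column ps -ef layout shown in the sample output.

```python
def count_agent_processes(ps_output):
    """Count csw-agent and tet-sensor entries in `ps -ef` output."""
    counts = {"csw-agent": 0, "tet-sensor": 0}
    for line in ps_output.splitlines():
        # The 8th whitespace-separated field of ps -ef is the command line.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        # Keep only the executable name, dropping any leading path.
        command = fields[7].split()[0].rsplit("/", 1)[-1]
        if command in counts:
            counts[command] += 1
    return counts


def installation_looks_healthy(ps_output):
    """Apply the rule above: three csw-agent and at least two tet-sensor entries."""
    counts = count_agent_processes(ps_output)
    return counts["csw-agent"] == 3 and counts["tet-sensor"] >= 2
```

Feed it the text captured from the ps command, for example through subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout.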

Windows

Q: When I run the PowerShell agent installer script, I get one of the following errors:

  1. The underlying connection was closed: An unexpected error occurred on a receive.

  2. The client and server cannot communicate, because they do not possess a common algorithm

A: The host and the server most likely have mismatched SSL/TLS protocols configured. To check the SSL/TLS version, run the following command:

[Net.ServicePointManager]::SecurityProtocol
To set the SSL/TLS version to match the server, run the following command (this change is temporary and applies only to the current PowerShell session):

[Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]'Ssl3,Tls,Tls11,Tls12'
Q: When I run the MSI installer from the downloaded bundle, I get the following error:

This installation package could not be opened. Verify that the package exists and that you can access it, or contact the application vendor to verify that this is a valid Windows Installer package.
A: Make sure that the C:\Windows\Installer path exists. If you run the MSI installer from the command line, do not include a relative path when pointing to the MSI file. Example of correct syntax:

msiexec /i "TetrationAgentInstaller.msi" /l*v "msi_install.log" /norestart

Q: I have observed that the Windows sensor software fails to upgrade if the underlying NIC uses the Nutanix VirtIO Network Driver.

A: There is an incompatibility between Npcap 0.9990 and Nutanix VirtIO Network Driver versions earlier than 1.1.3 when Receive Segment Coalescing is enabled.

To resolve this issue, upgrade the Nutanix VirtIO Network Driver to version 1.1.3 or later.

Q: I have installed the Windows sensor. The sensor doesn't seem to register, and the sensor_id file contains the following: uuid-invalid-platform

A: You may not have system32 in the PATH variable for Windows. Check if system32 is in PATH. If it is not, run the following:

set PATH=%PATH%;C:\Windows\System32\

Q: I am not receiving the network flows from Kubernetes Pods on Windows Nodes.

A: To verify if the required sessions are running to capture the flows from Kubernetes pods on Windows nodes, perform the following:

  1. Run cmd.exe with administrative privileges.

  2. Run the following command: logman query -ets

    Ensure that the following sessions are running:

    • CSW_MonNet: Captures network flows

    • CSW_MonHCS: Monitors creation of pods

    • CSW_MonNat: Monitors NATed flows

Kubernetes

If the installer script fails during Kubernetes Daemonset Installation, there are a large number of possible reasons.

Q: Is the Docker Registry serving images reachable from nodes?

A: Debug direct or HTTPS proxy issues with the cluster pulling images from the Cisco Secure Workload cluster.

Q: Is the container runtime complaining about SSL/TLS insecure errors?

A: Verify that the Secure Workload HTTPS CA certificates are installed on all Kubernetes nodes in the appropriate location for the container runtime.

Q: Are Docker Registry authentication and authorization of image downloads failing?

A: From each node, attempt to manually docker pull the images from the registry URLs in the Daemonset spec using the Docker pull secrets from the secret created by the Helm Chart. If the manual image pull also fails, pull logs from the Secure Workload cluster registryauth service to debug the issue further.

Q: Is the Kubernetes cluster hosted inside the Secure Workload appliance healthy?

A: Check the service status page for the cluster to ensure all related services are healthy. Run the dstool snapshot from the explore page and retrieve the logs generated.

Q: Are the Docker Image Builder daemons running?

A: Verify from the dstool logs that the build daemons are running.

Q: Are the jobs that build Docker images failing?

A: Check the dstool logs to see whether the images have been built. Docker build pod logs can be used to debug errors during the buildkit builds. Enforcement Coordinator logs can also be used to debug the build failures further.

Q: Are the jobs creating Helm Charts failing?

A: Check the dstool logs to see whether the Helm Charts have been built. Enforcement Coordinator logs contain the output of the helm build jobs and can be used to debug the exact reason for the Helm Chart build job failures.

Q: Is the installation bash script corrupt?

A: Attempt to download the installation bash script again. The bash script contains binary data appended to it. If the bash script is edited in any way with a text editor or saved as a text file, special characters in the binary data may be mangled/modified by the text editor.

Q: Is the Kubernetes cluster a variant or flavor other than classic Kubernetes? There are many variants and flavors; classic K8s is supported.

A: If the customer is running a variant of Kubernetes, there can be many failure modes at different stages of the deployment. Classify the failure stage: kubectl command run failures, helm command run failures, pod image download failures, pod privileged mode options rejected, pod image trust content signature failures, pod image security scan failures, pod binaries failing to run (architecture mismatch), pods running but the Secure Workload services failing to start, or Secure Workload services starting but having runtime errors due to an unusual operating environment.

Q: Are the Kubernetes RBAC credentials failing?

A: To run privileged daemonsets, admin privileges to the K8s cluster are required. Verify that the kubectl config file has its default context pointing towards the target cluster and an admin-equivalent user for that cluster.

Q: Is the busybox image available or downloadable from all cluster nodes?

A: Fix the connectivity issues and manually test that the busybox image can be downloaded. The exact version of busybox that is used in the pod spec must be available (pre-seeded) or downloadable on all cluster nodes.

Q: Are there API Server and etcd errors or a general timeout during the install?

A: Due to the instantiation of daemonset pods on all nodes in the Kubernetes cluster, the CPU/Disk/Network load on the cluster can spike suddenly. This is highly dependent on the customer specific installation details. Due to the overload, the installation process (images pulled on all nodes and written to disks) might take too long or overload the Kubernetes API server or the Secure Workload Docker Registry endpoint or, if configured, the proxy server temporarily. After a brief wait for image pulls on all nodes to complete and a reduction in CPU/Disk/Network load on the Kubernetes cluster nodes, retry the installation script again. API Server and etcd errors from the Kubernetes control plane indicate that the Kubernetes control plane nodes may be underprovisioned or affected by the sudden spike in activity.

Q: Is the Secure Workload agent experiencing runtime issues with its operations?

A: Refer to the Linux Agent troubleshooting section if the pods are correctly deployed and the agent has started running but is experiencing runtime issues. The troubleshooting steps are the same once the Kubernetes deployment has successfully installed and started the pods.

Anomaly Types

These are the most common issues encountered when using and managing Secure Workload agents.

Agent Inactivity

The agent has stopped checking in to the cluster services. This can happen for several reasons:

  • The host might be down

  • The network connectivity has been broken or blocked by firewall rules

  • The agent service has been stopped

All platforms
  • Verify the host is active and healthy

  • Verify the agent service is up and running

  • Verify the network connectivity to the cluster is working

Upgrade Failure

The agent upgrade has failed. This can be triggered by a few cases, such as:

  • The package is not found when the check-in script attempts to download it, the upgrade package cannot be unpacked, or the installer from the package cannot be verified.

  • The installation process fails because of an OS issue or a dependency.

Windows
Linux
  • If the host OS has been upgraded since the last agent installation, verify the current release matches list of supported platforms in user guide: Check If Platform Is Currently Supported

  • Make sure there have been no changes to the required dependencies since the last installation. You can run the agent installer script with the --no-install option to re-verify these dependencies.

  • Make sure there is enough free disk space on host

AIX
  • Make sure there have been no changes to the required dependencies since the last installation. You can run the agent installer script with the --no-install option to re-verify these dependencies.

  • Make sure there is enough free disk space on host

Convert Failed

The current agent type does not match the desired agent type, and the convert attempt has timed out. This issue can be caused by a communication issue when an agent does check_in to download the package, or by the wss service failing to push convert_command to the agent.

All Platforms

Convert Capability

The ability to convert the agent from one type (such as deep visibility) to another type (such as enforcement) is not available on all agents. If an agent that cannot perform the conversion is required to convert, this anomaly is reported.

Policy Out of Sync

The current policy (NPC) version last reported by the agent does not match the current version generated on the cluster. This can be caused by a communications error between the agent and the cluster, the agent failing to enforce the policy with the local firewall, or the agent enforcement service not running.

Windows
  • If enforcement mode is WAF, verify there are no GPOs present on the host that would prevent the Firewall from being enabled, adding rules (with Preserve Rules Off) or setting default actions: GPO Configurations

  • Verify there is connectivity between the host and the cluster: SSL Troubleshooting

  • Verify the generated rule count is less than 2000

  • Verify the WindowsAgentEngine service is running: sc query windowsagentengine

  • Verify there are available system resources

Linux
  • Verify that iptables and ipset are present by running the iptables and ipset commands

  • Verify there is connectivity between the host and the cluster: SSL Troubleshooting

  • Verify the tet-enforcer process is running: ps -ef | grep tet-enforcer

AIX
  • Verify ipfilter is installed and running with the ipf -V command

  • Verify there is connectivity between the host and the cluster: SSL Troubleshooting

  • Verify the tet-enforcer process is running: ps -ef | grep tet-enforcer

Flow Export: Pcap Open

If the Secure Workload Agent cannot open the pcap device to capture flows, you see errors in the Agent logs. A successfully opened Pcap device will report as follows:

Windows Log: C:\Program Files\Cisco Tetration\Logs\TetSen.exe.log
I0609 15:25:52.354 24248 Started capture thread for device <device_name>
I0609 15:25:52.354 71912 Opening device {<device_id>}
Linux Log: /usr/local/tet/logs/tet-sensor.log
I0610 03:24:22.354 16614 Opening device <device_name>
[2020/06/10 03:24:23:3524] NOTICE: lws_client_connect_2: <device_id>: address 172.29.136.139

Flow Export: HTTPS Connectivity

Connectivity between the agent and the cluster is externally blocked, which prevents flows and other system information from being delivered. This is caused by one or more configuration issues with network firewalls, SSL decryption services, or third-party security agents on the host.

  • If there are known firewalls or SSL decryption security devices between the agent and the cluster, make sure that communications to all Secure Workload collector and VIP IP addresses are permitted. For on-premises clusters, the collectors are listed under Troubleshoot > Virtual Machines in the navigation bar at the left side of the Secure Workload web interface. Look for collectorDatamover-*. For Secure Workload cloud, all the IP addresses that need to be permitted are listed in your Portal.

  • To help identify if there is SSL decryption, openssl s_client can be used to make a connection and display the returned certificate. Any additional certificate added to the chain will be rejected by the Agent’s local CA. For more information, see SSL Troubleshooting.


    Note


    Typically, the service that updates the "flow export anomaly status" runs every 5 minutes. This duration may vary because the agents' status updates are executed in small batches of 5000. Thus, when there are fewer agents in the cluster, the updates are faster. When there is a larger number of agents, the updates can take a maximum of 70 minutes.

    After the initial sorting of agent records in the database, the cluster and agents become stable, and eventually the update interval becomes shorter and more consistent.
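
As a complement to the openssl s_client check described above, the following Python sketch fetches the certificate that a server presents and computes its SHA-256 fingerprint. If the fingerprint seen from the workload differs from the one seen from a host that is known not to sit behind a decryption device, something on the path is likely re-signing the connection. The function names are hypothetical, the sketch looks only at the leaf certificate, and it does not use an HTTPS proxy even if the agent is configured with one.

```python
import hashlib
import ssl


def pem_fingerprint(pem_cert):
    """Return the SHA-256 fingerprint (hex) of a PEM-encoded certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    return hashlib.sha256(der).hexdigest()


def served_fingerprint(host, port=443):
    """Fetch the certificate a server presents, without validating it."""
    pem = ssl.get_server_certificate((host, port))
    return pem_fingerprint(pem)
```

Run served_fingerprint against the same collector or VIP address from both hosts and compare the two hexadecimal strings.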


Flow Export: eBPF

In version 3.10.1.1, Secure Workload agent uses Extended Berkeley Packet Filter (eBPF) on the following distributions:

  • Ubuntu 22/24

  • Debian 11/12

  • EL9 family

If the Secure Workload agent cannot use eBPF for flow capture, you see errors in the Agent logs.

A successful flow capture is reported for each network device as shown below:

Linux Log: /usr/local/tet/logs/tet-sensor.log
I0604 21:29:20.357 833292 Opening device <device_name>
I0604 21:29:20.394 833292 Using eBPF TC to capture packets for <device_name>
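
To check many devices at once, the log can be scanned for these two messages. The following Python sketch reports, for each device that the log shows being opened, whether an eBPF capture message was also logged for it. The function name is hypothetical, and the patterns assume that the message text matches the sample lines above exactly. A device without an eBPF line is a candidate for closer inspection rather than proof of a failure.

```python
import re

# Patterns based on the sample log lines shown above.
OPENING = re.compile(r"Opening device (\S+)")
EBPF = re.compile(r"Using eBPF TC to capture packets for (\S+)")


def ebpf_capture_status(log_text):
    """Map each opened device to True if the log also reports eBPF capture for it."""
    status = {}
    for line in log_text.splitlines():
        m = OPENING.search(line)
        if m:
            # Record the device; keep an earlier True if one was already seen.
            status.setdefault(m.group(1), False)
            continue
        m = EBPF.search(line)
        if m:
            status[m.group(1)] = True
    return status
```

Pass it the contents of /usr/local/tet/logs/tet-sensor.log read as text.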

Certificate Issues

Windows

Certificate Issues for MSI installer

The MSI installer is signed using a code signing certificate:

For MSI Installer, version 3.6.x onwards and 3.5.1.31 onwards

  • Leaf Certificate: Cisco Systems, Inc

  • Intermediate Certificate: DigiCert Trusted G4 Code Signing RSA4096 SHA384 2021 CA1

  • Root Certificate: DigiCert Trusted Root G4

For MSI Installer, earlier versions

  • Leaf Certificate: Cisco Systems, Inc

  • Intermediate Certificate: Symantec Class 3 SHA256 Code Signing CA

  • Root Certificate: VeriSign Class 3 Public Primary Certification Authority - G5

It uses a timestamp certificate:

For MSI Installer, version 3.6.x onwards and 3.5.1.31 onwards

  • Leaf Certificate: Symantec SHA256 TimeStamping Signer - G3

  • Intermediate Certificate: Symantec SHA256 TimeStamping CA

  • Root Certificate: VeriSign Universal Root Certification Authority

For MSI Installer, earlier versions

  • Leaf Certificate: Symantec SHA256 Timestamping Signer - G2

  • Intermediate Certificate: Symantec SHA256 Timestamping CA

  • Root Certificate: VeriSign Universal Root Certification Authority

Windows sensor installation or upgrade fails if the digital signature of the MSI installer is invalid. The digital signature is invalid if:

  • MSI Installer Signing Root Certificate or MSI Installer timestamp Root Certificate is not in a “Trusted Root Certification Authority” store

  • MSI Installer Signing Root Certificate or MSI Installer timestamp Root Certificate is expired or revoked.

Issue 1

Installation of the agent might fail with the following error in TetUpdate.exe.log: "Msi signature is not trusted. 0x800b0109"

Resolution
  • Run the command certmgr from command prompt

  • Check if MSI Installer Signing Root Certificate or MSI Installer timestamp Root Certificate is in Untrusted Certificates store.

  • Move it to Trusted Root Certification Authority store.

Issue 2

Windows Sensor upgrade fails with the following error in TetUpdate.exe.log: “Msi signature is not trusted. 0x800B010C”

This error code means that a certificate was explicitly revoked by its issuer.

Resolution
  • Run the command certmgr from command prompt

  • Check if MSI Installer Signing Root Certificate or MSI Installer timestamp Root Certificate is in Untrusted Certificates store.

  • Copy it to Trusted Root Certification Authority store.

Issue 3

Windows Sensor upgrade fails with the following error in TetUpdate.exe.log: “Msi signature is not trusted. 0x80096005”

Resolution
  • Run the command certmgr from command prompt

  • Check if the MSI Installer Signing Root Certificate and the MSI Installer timestamp Root Certificate are in the “Trusted Root Certification Authority” store.

If the certificate is missing, import it from another machine.

First, export the certificate “VeriSign Universal Root Certification Authority” from a working server:

  • Run the command certmgr from command prompt

  • Right-click the certificate “VeriSign Universal Root Certification Authority” under “Trusted Root Certification Authorities” and go to All Tasks > Export.

  • Copy the exported certificate to the non-working server.

Then import the certificate on the non-working server:

  • Run the command certmgr from command prompt

  • Right-click Certificates under Trusted Root Certification Authorities and go to All Tasks > Import.

  • Select the root certificate that you copied and add it to the store.
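The three MSI signature issues above differ only in the code at the end of the log message. The following is a minimal Python sketch, not part of the agent, that scans the text of TetUpdate.exe.log for those codes. The function name and the hint strings are illustrative assumptions; the hints paraphrase the standard Windows meanings of these codes.

```python
import re

# Codes from Issue 1 to Issue 3 above, mapped to a short hint.
# The hint wording is an assumption for illustration, not agent output.
MSI_SIGNATURE_HINTS = {
    "0x800b0109": "Issue 1: chain ends in a root certificate that is not trusted",
    "0x800b010c": "Issue 2: a certificate was explicitly revoked by its issuer",
    "0x80096005": "Issue 3: timestamp signature or certificate could not be verified",
}

_PATTERN = re.compile(r"Msi signature is not trusted\.\s*(0x[0-9a-fA-F]{8})")


def find_msi_signature_issues(log_text):
    """Return (code, hint) pairs for each signature error found in the log text."""
    results = []
    for match in _PATTERN.finditer(log_text):
        code = match.group(1).lower()  # the log mixes upper and lower case hex
        results.append((code, MSI_SIGNATURE_HINTS.get(code, "unknown code")))
    return results
```

To use it, read the log file into a string and pass it to find_msi_signature_issues; an empty list means none of these signature errors were logged.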

Certificate Issues for NPCAP installer

Applicable to Windows 2012, Windows 2012 R2, Windows 8, Windows 8.1

NPCAP version: 1.55

NPCAP Signing Certificate:

  • Leaf Certificate: Insecure.Com LLC

  • Intermediate Certificate: DigiCert EV Code Signing CA (SHA2)

  • Root Certificate: DigiCert High Assurance EV Root CA

NPCAP Timestamp certificate:

  • Leaf Certificate: DigiCert Timestamp 2021

  • Intermediate Certificate: DigiCert SHA2 Assured ID Timestamping CA

  • Root Certificate: DigiCert Assured ID Root CA

Issue 1

Windows Agent installation might fail with the following error in msi_installer.log:
CheckServiceStatus : Exception System.InvalidOperationException: Service npcap was not found on computer 
‘.’. —> System.ComponentModel.Win32Exception: The specified service does not exist as an installed service

Resolution

  • Run the command certmgr from command prompt

  • Check “DigiCert High Assurance EV Root CA” in “Trusted Root Certification Authority” store.

  • If the certificate is missing, import it from another machine.

First, export the certificate “DigiCert High Assurance EV Root CA” from a working server:

  • Run the command certmgr from command prompt

  • Right-click the certificate “DigiCert High Assurance EV Root CA” under “Trusted Root Certification Authorities” and go to All Tasks > Export.

  • Copy the exported certificate to the non-working server.

Then import the certificate on the non-working server:

  • Run the command certmgr from command prompt

  • Right-click Certificates under Trusted Root Certification Authorities and go to All Tasks > Import.

  • Select the root certificate that you copied and add it to the store.

Applicable to Windows 2008 R2

NPCAP version: 0.991

NPCAP Signing Certificate:

  • Leaf Certificate: Insecure.Com LLC

  • Intermediate Certificate: DigiCert EV Code Signing CA

  • Root Certificate: DigiCert High Assurance EV Root CA

NPCAP Timestamp certificate:

  • Leaf Certificate: DigiCert Timestamp Responder

  • Intermediate Certificate: DigiCert Assured ID CA-1

  • Root Certificate: DigiCert Assured ID Root CA

Issue 1

Windows Agent installation might fail with the following error in msi_installer.log:
CheckServiceStatus : Exception System.InvalidOperationException: Service npcap was not found on
computer ‘.’. —> System.ComponentModel.Win32Exception: The specified service does not exist as an
installed service

Resolution

  • Run the command certmgr from command prompt

  • Check DigiCert High Assurance EV Root CA in Trusted Root Certification Authority store.

  • If the certificate is missing, import it from another machine.

First, export the certificate “DigiCert High Assurance EV Root CA” from a working server:

  • Run the command certmgr from command prompt

  • Right-click the certificate “DigiCert High Assurance EV Root CA” under “Trusted Root Certification Authorities” and go to All Tasks > Export.

  • Copy the exported certificate to the non-working server.

Then import the certificate on the non-working server:

  • Run the command certmgr from command prompt

  • Right-click Certificates under Trusted Root Certification Authorities and go to All Tasks > Import.

  • Select the root certificate that you copied and add it to the store.

Windows Host Rename

Scenario 1: Not able to see IP Addresses and VRF info after renaming the Windows Host

Steps to fix the issue:

  • Remove the entry (with the new Hostname that is missing IP Addresses and VRF info) from the TaaS UI.

  • Uninstall ‘Cisco Secure Workload Agent’ from the Windows Host and delete the ‘Cisco Tetration’ directory (typically ‘C:\Program Files\Cisco Tetration’).

  • Install ‘Cisco Secure Workload Agent’ on the Windows Host.

Following the above steps should register the Agent on the TaaS UI successfully with the IP Addresses and VRF info.

Scenario 2: Planned Windows Host rename (in advance)

Steps to follow:

  • Uninstall ‘Cisco Secure Workload Agent’ from the Windows Host and delete the ‘Cisco Tetration’ directory (typically ‘C:\Program Files\Cisco Tetration’).

  • Rename the Windows Host and Reboot.

  • Install ‘Cisco Secure Workload Agent’ on the Windows Host (with the new Hostname).

Following the above steps for planned Host rename should register the Agent on the TaaS UI with new Hostname.

Check If Platform Is Currently Supported

AIX

  • Run the command uname -a

  • Note: The major and minor versions are reversed

    p7-ops2> # uname -a
    AIX p7-ops2 1 7 00F8AF944C00
  • In this example, the first number after the host name is the minor and the second number is the major version, so AIX version 7.1. Compare this release to what is listed here: Supported Platforms and Requirements
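Because the two version fields are easy to read the wrong way round, the swap can be made explicit. The following is a small Python sketch under the assumption that the output has the five-field layout shown in the example above; the function name is illustrative.

```python
def aix_version_from_uname(uname_output):
    """Return 'major.minor' from `uname -a` output on AIX.

    AIX prints the minor version before the major version, so the two
    fields after the host name are swapped here.
    """
    fields = uname_output.split()
    if len(fields) < 4 or fields[0] != "AIX":
        raise ValueError("not AIX uname -a output: %r" % uname_output)
    minor, major = fields[2], fields[3]
    return "%s.%s" % (major, minor)


print(aix_version_from_uname("AIX p7-ops2 1 7 00F8AF944C00"))  # prints 7.1
```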

Windows Installer Issues

  • Make sure there is a C:\Windows\Installer directory. This directory is not visible in File Explorer; the easiest way to verify it is to run the following in a CMD session: dir C:\Windows\Installer

  • Check that the Windows Installer service is not disabled. It must be set to Manual.

  • Check that no other errors are being reported by Windows Installer. Check the Windows System Event logs under Windows Logs > Application > Source > MsiInstaller.

Required Windows Services

The following services, when disabled, have been linked to installation issues of the agent. It is recommended that these services be running during the initial installation and any upgrade of the Deep Visibility and Enforcement agents.

Table 6. Required Windows Services

Service | Purpose for installation
Device Setup Manager | Device driver management for the installation of the Npcap filter driver.
Device Install Service | Also used for the installation of the Npcap filter driver.
Windows Installer | Required for the installation of agent MSI package.
Windows Firewall | Required for WAF enforcement mode.
Application Experience | Used to determine the compatibility of executables on the system.


Note


Application Experience service only applies to Windows Server 2008, 2008R2, 2012, 2012R2 and Windows 7. If disabled, a file lock may occur during Npcap installation causing it to fail.


Npcap Issues

Npcap is a pcap tool used for the Windows Agent only. Ten seconds after the agent service starts, it attempts to install or upgrade Npcap to the supported version. If the Npcap service fails to install or upgrade, the agent retries the installation within the next 30 minutes. After 3 failed attempts, the agent attempts to roll back Npcap to a previous supported version, if available. After that, the agent no longer tries to install Npcap. You can check C:\Program Files\Cisco Tetration\Logs\TetUpdate.exe.log and C:\Program Files\Cisco Tetration\Logs\npcap_install.log to identify the error.

Npcap will not upgrade (manually or via agent)

  • Npcap will sometimes not uninstall correctly if a process is currently using the Npcap libraries. To check for this run the following command:

    PS C:\Program Files\Npcap> .\NPFInstall.exe -check_dll
    WindowsSensor.exe, Wireshark.exe, dumpcap.exe
    

If you see processes listed, they must be stopped before the Npcap upgrade can continue. If no processes are using Npcap, the above command shows <NULL>.
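When this check is run across many hosts, the output can be parsed rather than read by eye. The following Python sketch assumes the output is either a comma-separated list of process names or the literal <NULL>, as in the example above; the function name is illustrative.

```python
def processes_using_npcap(check_dll_output):
    """Parse the output of `NPFInstall.exe -check_dll`.

    Returns a list of process names. An empty list means no process is
    holding the Npcap libraries (the tool printed <NULL>).
    """
    text = check_dll_output.strip()
    if not text or text == "<NULL>":
        return []
    return [name.strip() for name in text.split(",") if name.strip()]
```

A non-empty result lists the processes to stop before retrying the Npcap upgrade.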

Verify if Npcap is fully installed

Procedure

Step 1

Check Control Panel > Programs and Features to see if Npcap is listed as an installed application

Step 2

Make sure the Npcap Packet Driver has a binding to the NIC in question (checkmark is present)

Step 3

Check if the network driver is installed

C:\Windows\system32>pnputil -e | findstr Nmap
Driver package provider : Nmap Project

Step 4

Check if the driver service is installed and RUNNING

C:\Windows\system32>sc query npcap
SERVICE_NAME: npcap
         TYPE : 1 KERNEL_DRIVER
         STATE : 4 RUNNING

Step 5

Check if the registry entry is there (used by the agent to verify Npcap exists already)

C:\Windows\system32>reg query HKLM\software\wow6432node\npcap
HKEY_LOCAL_MACHINE\software\wow6432node\npcap
          AdminOnly REG_DWORD 0x1
          WinPcapCompatible REG_DWORD 0x0
         (Default) REG_SZ C:\Program Files\Npcap

Step 6

Check if the installed Npcap program files are all there

C:\Windows\system32>dir "c:\program files\npcap"
 Directory of c:\program files\npcap
04/29/2020 02:42 PM <DIR> .
04/29/2020 02:42 PM <DIR> ..
01/22/2019 08:16 AM 868 CheckStatus.bat
11/29/2016 03:43 PM 1,034 DiagReport.bat
12/04/2018 11:12 PM 8,908 DiagReport.ps1
01/09/2019 09:22 PM 2,959 FixInstall.bat
04/29/2020 02:42 PM 134,240 install.log
01/11/2019 08:52 AM 9,920 LICENSE
03/14/2019 08:59 PM 10,434 npcap.cat
03/14/2019 08:57 PM 8,657 npcap.inf
03/14/2019 09:00 PM 74,040 npcap.sys
03/14/2019 08:57 PM 2,404 npcap_wfp.inf
03/14/2019 09:00 PM 270,648 NPFInstall.exe
04/29/2020 02:42 PM 107,783 NPFInstall.log
03/14/2019 09:01 PM 175,024 Uninstall.exe
         13 File(s) 806,919 bytes
          2 Dir(s) 264,417,628,160 bytes free

Step 7

Check to see if the .sys driver file is in the Windows driver folder

C:\Windows\system32>dir "C:\Windows\System32\Drivers\npcap.sys"
Directory of C:\Windows\System32\Drivers
03/14/2019 09:00 PM 74,040 npcap.sys
                  1 File(s) 74,040 bytes
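Step 4 of the procedure above can be automated by parsing the output of sc query npcap. The following Python sketch only parses text that you pass in; the function name is illustrative, and the layout is assumed to match the STATE line shown in Step 4.

```python
import re


def npcap_service_state(sc_query_output):
    """Return the state name (for example 'RUNNING') from `sc query npcap`
    output, or None if no STATE line is present."""
    match = re.search(r"^\s*STATE\s*:\s*\d+\s+(\w+)", sc_query_output, re.MULTILINE)
    return match.group(1) if match else None
```

Any result other than RUNNING, including None, means the driver service check in Step 4 did not pass.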

Network Connectivity issues during NPCAP installation or upgrade

Applicable to Windows 2016 Only

If you have a third-party LWF (Light Weight Filter) driver (for example, netmon) or a teaming adapter configured in your setup, and NPCAP is installed during agent deployment, you might experience the following:

  • RDP is reconnected

  • NetBios service is restarted

  • Similar network connectivity issues

This is due to a bug in the Windows 2016 OS.

NIC teaming compatibility issues with NPCAP

Teaming NIC functionality is based on the underlying physical NICs (Intel, Broadcom, Realtek, MS virtual adapter, and so on) and the teaming driver configuration (switch based, load balancing or failover, algorithm to distribute the packets across multiple NICs).

Some NPCAP versions have compatibility issues with teaming NICs, especially during binding to the underlying teaming NICs.

The current Secure Workload Sensor software is tested using Microsoft-supported NIC teaming with the following configuration:
NIC type : Intel(R) 82574L Gigabit Network Connection
Teaming Mode : Switch Independent
Load Balancing Mode: Address Hash
OS : Windows 2012, Windows 2012 R2, Windows 2016, Windows 2019
NPCAP version: 1.55

Note


Windows 2008R2 does not support Microsoft supported NIC teaming.


VDI instance VM does not report network flows

The TetSensor service occasionally does not capture the network flows on cloned VMs when NPCAP service is running. This can happen when the agent is installed without the nostart flag using MSI installer or without goldenImage flag using PowerShell Installer on a VM template or golden image.

In this case, Secure Workload agent services start running on the VM template. NPCAP is installed and bound to the Network stack on the VM template. When a new VM is cloned from the VM template, NPCAP is not properly bound to the Network stack on the new cloned VM. As a result, NPCAP fails to capture the network flows.

Network Performance with NPCAP

It is observed that network performance is affected when the Windows TetSensor service is running. The Windows TetSensor service (tetsen.exe) captures the network flows using NPCAP. The NPCAP implementation that captures the network flows and passes them to tetsen.exe affects the network performance.

Compare the network performance after installing tetsensor.

Client : Windows 2016

NPCAP 1.55

TetSensor Config : Conversation Mode with Enforcement mode WFP

Server : Windows 2016

NPCAP 1.55

TetSensor Config : Conversation Mode with Enforcement mode WFP

Run cmd : iperf3.exe -c <server_ip> -t 40

Table 7. Network Performance with NPCAP 1.55

Setup: No TetSensor installed, no NPCAP

Network Performance:

[ ID] Interval Transfer Bandwidth
[ 4] 0.00-40.00 sec 18.2 GBytes 3.90 Gbits/sec sender
[ 4] 0.00-40.00 sec 18.2 GBytes 3.90 Gbits/sec receiver

Setup: TetSensor installed, NPCAP installed

Network Performance:

[ ID] Interval Transfer Bandwidth
[ 4] 0.00-40.00 sec 17.3 GBytes 3.72 Gbits/sec sender
[ 4] 0.00-40.00 sec 17.3 GBytes 3.72 Gbits/sec receiver

Network Performance with NPCAP 0.9990

Compare the network performance after installing tetsensor.

Client : Windows 2016

NPCAP 0.9990

TetSensor Config : Conversation Mode with Enforcement mode WFP

Server : Windows 2016

NPCAP 0.9990

TetSensor Config : Conversation Mode with Enforcement mode WFP

Run cmd : iperf3.exe -c <server_ip> -t 40

Setup: TetSensor installed, NPCAP installed

Network Performance:

[ ID] Interval Transfer Bandwidth
[ 4] 0.00-40.00 sec 16.3 GBytes 3.50 Gbits/sec sender
[ 4] 0.00-40.00 sec 16.3 GBytes 3.50 Gbits/sec receiver
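To put the iperf3 figures in perspective, the relative throughput reduction can be computed from the reported bandwidths. The Python sketch below uses the 3.90 Gbits/sec result measured without TetSensor and NPCAP as the baseline for both NPCAP versions; applying that baseline to the NPCAP 0.9990 run is an assumption, since the source does not list a separate baseline for it.

```python
BASELINE_GBITS = 3.90  # no TetSensor, no NPCAP (from the NPCAP 1.55 comparison)


def throughput_drop_percent(measured_gbits, baseline_gbits=BASELINE_GBITS):
    """Relative throughput reduction versus the baseline, in percent."""
    return (baseline_gbits - measured_gbits) / baseline_gbits * 100.0


for label, measured in (("NPCAP 1.55", 3.72), ("NPCAP 0.9990", 3.50)):
    print("%s: %.1f%% lower than baseline" % (label, throughput_drop_percent(measured)))
```

Under these assumptions the reduction is roughly 4.6% with NPCAP 1.55 and roughly 10.3% with NPCAP 0.9990.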


Note


Performance may vary based on Windows NPCAP version installed, Windows OS, and network Configuration.


OS Performance and/or stability Issues

OS may experience unknown performance or stability issues if the installed NPCAP version or NPCAP configuration is not supported by the Secure Workload Software.

Supported NPCAP versions: 0.991 and 1.55

GPO Configurations

Agents that enforce policy require only the Firewall to be enabled, with either a local setting or GPO. All other GPO settings should be left as “Not Configured.”

  • To check if a GPO setting is blocking enforcement, check the C:\Program Files\Cisco Tetration\Logs\TetEnf.exe.log log and search for the following error examples:

    • Rules conflicting with “Preserve Rules=No” setting: “There are firewall rules set in the Group Policy. Secure Workload agent does not have permission to remove these”

    • Firewall set to off: “GPO has disabled firewall for DomainProfile”

    • Default Action is set: “Group Policy has conflicting default inbound action for DomainProfile”

  • To check what GPO policies are being applied to the host, run gpresult.exe /H gpreport.html and open the generated HTML report. For example, a report in which Secure Workload Agent Firewall applies an inbound rule indicates a conflict with enforcement if “Preserve Rules” is set to “No.”
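The log search described above can be scripted. The following Python sketch looks for the three error texts quoted in this section and reports which kind of GPO conflict each one indicates. The dictionary name, function name, and hint wording are illustrative assumptions; the match strings are shortened so that profiles other than DomainProfile would also match, which is an assumption as well.

```python
# Error text fragments quoted in this section, mapped to a short hint.
GPO_ERROR_HINTS = {
    "There are firewall rules set in the Group Policy": "GPO rules conflict with Preserve Rules=No",
    "GPO has disabled firewall for": "firewall is turned off by GPO",
    "Group Policy has conflicting default inbound action for": "GPO sets a default action",
}


def find_gpo_conflicts(log_text):
    """Return the sorted list of hints whose error text appears in the log text."""
    found = set()
    for line in log_text.splitlines():
        for needle, hint in GPO_ERROR_HINTS.items():
            if needle in line:
                found.add(hint)
    return sorted(found)
```

Pass the contents of TetEnf.exe.log as a string; an empty list means none of the listed GPO errors were logged.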

Agent To Cluster Communications

The Secure Workload Agent maintains connections to the cluster over multiple channels. Depending on the type of Agent, the number of connections varies.

Types of connections

  • WSS: Persistent socket connection over port 443 to the cluster

  • Check in: An HTTPS call to the cluster every 15-20 minutes to check for current configurations, check for updates, and update the active state of the agent on the cluster. This also reports upgrade failures.

  • Flow export: Persistent SSL connection over port 443 (TaaS) or 5640 (on-premises) to send flow metadata to the cluster

  • Enforcement: Persistent SSL connection over port 443 (TaaS) or 5660 (on-premises) to pull in enforcement policies and report enforcement state
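The port numbers above can be collected into a small lookup for building firewall rules. The following Python sketch is illustrative only: the key names and function name are assumptions, and the check-in channel is left out because the list above does not state its port explicitly.

```python
# Ports per channel as listed above. Key names are hypothetical.
AGENT_PORTS = {
    "wss": {"taas": 443, "on_prem": 443},
    "flow_export": {"taas": 443, "on_prem": 5640},
    "enforcement": {"taas": 443, "on_prem": 5660},
}


def required_ports(deployment, enforcement_enabled=True):
    """Return the sorted list of outbound TCP ports an agent needs.

    deployment is "taas" or "on_prem".
    """
    ports = set()
    for channel, by_deployment in AGENT_PORTS.items():
        if channel == "enforcement" and not enforcement_enabled:
            continue  # enforcement channel is only used by enforcing agents
        ports.add(by_deployment[deployment])
    return sorted(ports)
```

For example, required_ports("on_prem", enforcement_enabled=False) yields the ports for a visibility-only agent on an on-premises cluster.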

Checking the connection state

The Tetration UI reports either an inactive agent (no longer checking in), no exported flows (on the Agent Workload Profile page under Stats), or failed enforcement. Depending on the error, you can check different logs on the workload to help determine the source of the issue.

Inactive Agent

Windows Log: C:\Program Files\Cisco Tetration\Logs\TetUpdate.exe.log

Linux Log: /usr/local/tet/logs/check_conf_update.log

An HTTP response code of 304 is expected and means there is no configuration change. Error code = 2 is expected as well. Any other HTTP response code indicates an issue talking to the WSS service on the Secure Workload cluster.

Tue 06/09/2020 17:25:25.08 check_conf_update: "curl did not return 200 code, it's 304, exiting"
Tue 06/09/2020 17:25:25.08 check_conf_update: "error code after running check_conf_update = 2"

  • 304 Expected, no config change. Successful check-in

  • 401 Registration is not successful, missing Activation Key (TaaS)

  • 403 Agent already registered to the cluster with same UUID

  • 000 Indicates a connection issue with SSL. Either curl could not reach the WSS server or there is an issue with the certificate. See SSL Troubleshooting.
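The response codes listed above can be looked up automatically when reviewing the check-in log. The following Python sketch assumes log lines use the "curl did not return 200 code, it's NNN" wording shown in the example; the names are illustrative and the meanings are condensed from the list above.

```python
import re

# Meanings condensed from the list above.
CHECKIN_CODES = {
    "304": "expected: no configuration change, successful check-in",
    "401": "registration not successful, activation key missing (TaaS)",
    "403": "agent already registered to the cluster with the same UUID",
    "000": "SSL connection issue: WSS server unreachable or certificate problem",
}

_CODE_RE = re.compile(r"curl did not return 200 code, it's (\d{3})")


def explain_checkin_line(log_line):
    """Return (code, meaning) for a check-in log line, or None if no code is present."""
    match = _CODE_RE.search(log_line)
    if not match:
        return None
    code = match.group(1)
    return code, CHECKIN_CODES.get(code, "unexpected code: problem talking to the WSS service")
```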

No exported flows

Windows Log: C:\Program Files\Cisco Tetration\Logs\TetSen.exe.log

Linux Log: /usr/local/tet/logs/tet-sensor.log

The following indicates a successful connection to WSS

cfgserver.go:261] config server: StateConnected, wss://<config_server_ip>:443/wss/<sensor_id>/forensic, proxy:

The following indicates a successful connection to the Collectors

collector.go:258] next collector: StateConnected, ssl://<collector_ip>:5640

If there are errors connecting to either WSS or the Collectors, check your firewall configuration or verify if any SSL decryption is occurring between the agent and Secure Workload. See: SSL Troubleshooting
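The two log lines above can be extracted from a sensor log to see the latest reported state of each channel. The following Python sketch is based only on the two sample lines in this section; the assumption that other states also start with "State" is not confirmed by the source, and the function name is illustrative.

```python
import re

# Matches lines such as:
#   config server: StateConnected, wss://host:443/wss/id/forensic, proxy:
#   next collector: StateConnected, ssl://host:5640
_STATE_RE = re.compile(r"(config server|next collector): (State\w+), (\S+?://[^,\s]+)")


def connection_states(log_text):
    """Return {channel: (state, url)} using the last matching line for each channel."""
    states = {}
    for match in _STATE_RE.finditer(log_text):
        channel, state, url = match.groups()
        states[channel] = (state, url)
    return states
```

A channel that is missing from the result, or whose state is not StateConnected, is the one to investigate first.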

Failed to enforce policy

Windows Log: C:\Program Files\Cisco Tetration\Logs\TetEnf.exe.log

Linux Log: /usr/local/tet/logs/tet-enforcer.log

The following indicates a successful connection to the EFE server

ssl_client.cpp:341] Successfully connected to EFE server

If there are errors connecting to the EFE server, check your firewall configuration or verify if any SSL decryption is occurring between the agent and Secure Workload. See: SSL Troubleshooting

SSL Troubleshooting

Agent Communications Overview

Secure Workload agents use TLS to secure the TCP connections to the Secure Workload Cloud SaaS servers. These connections are broken down into three distinct channels.

  • Agent -> Cisco Secure Workload SaaS control channel over port TCP/443 (TLS) (sensorVIP)

    This is a low volume control channel that allows the agent to register with Secure Workload and also handles configuration pushes and software upgrade notifications.

  • Agent -> Cisco Secure Workload SaaS flow data over TCP/443 (TLS) (collector)

    Flow data is the extracted flow metadata information; the data will be sent to 1 set of 16 IP addresses at a time. The second set of IP addresses is for standby. This is around 1 – 5% of actual server traffic.

  • Agent -> Cisco Secure Workload SaaS enforcement data over TCP/443 (TLS) (efe)

    The enforcement data channel is a low volume control channel that is used to push the policies to the sensors and also gather enforcement statistics.

The sensor validates the TLS certificate from the Secure Workload Cloud control, data and enforcement servers against a local CA that is installed with the agent. No other CAs are used, so any other certificate sent to the agent will result in a verification failure and the agent will not connect. This will result in the agent not registering, checking-in, sending flows or receiving enforcement policies.

Configuring IP traffic for Agent Communications

A typical configuration has a perimeter firewall and possibly a proxy between the agents (workloads) and Secure Workload TaaS.


Note


Secure Workload gathers your gateway/NAT IP information during the on-boarding and automatically adds the information at the time of tenant creation. If you add new IP addresses or change IP addresses in the portal, the changes require review and approval by Secure Workload staff.


In addition to adding your gateway/NAT IP addresses in the TaaS portal, there might be more changes required to your network to allow the traffic outbound and unmodified:

  • Allow outbound port 443 over TLS/HTTPS on the perimeter firewall.

  • Configure proxy bypass and SSL/TLS bypass on the web proxy, if a decrypting web proxy is being used.

  • If you are using a transparent web proxy at the data center, you must route the specific SaaS IP address and configure the bypass rules. Sensor connections cannot do automatic HTTPS redirection.

The list of IPs that the agents communicate with is available on the TaaS portal. The IPs to add to your firewall outbound configuration and proxy bypass are labeled collector-n, efe-n (only if enforcement is being deployed), and sensorVIP. There are typically 17 to 33 IPs to add for agent communication, but there could be more or fewer depending on your TaaS configuration.

Troubleshooting SSL/TLS Connections

As discussed in the previous section, it is important to configure your explicit or transparent web proxy to bypass SSL/TLS decryption for agent communications. If the bypass is not configured, these proxies might attempt to decrypt SSL/TLS traffic by sending their own certificates to the agent. Because the agent only uses its local CA to validate the certificate, these proxy certificates will cause connection failures.

Symptoms include agent failing to register to the cluster, agent not checking-in, agent not sending flows, and/or agent not receiving enforcement configuration (if enforcement is enabled).


Note


The troubleshooting steps below assume that default installation paths were used. Windows: C:\Program Files\Cisco Tetration. Linux: /usr/local/tet. If you installed your agents in a different location, substitute that location in the instructions.


SSL/TLS Connection issues are reported in the agent logs. To verify if there are SSL errors in the logs, run the following commands for the associated issue being observed.

Registration, check-in

Linux

grep "NSS error" /usr/local/tet/log/check_conf_update.log

Windows (PowerShell)

get-content "C:\Program Files\Cisco Tetration\logs\TetUpdate.exe.log" | select-string -pattern "curl failed SSL peer certificate"

Flows

Most of the SSL/TLS connection issues seen are during the initial connection and registration of the agent. Sending flows relies on the registration to be complete before attempting to connect. SSL/TLS errors seen here would be the result of the sensorVIP IPs being allowed but not the collector IPs.

Linux

grep "SSL connect error" /usr/local/tet/log/tet-sensor.log

Windows (PowerShell)

get-content "C:\Program Files\Cisco Tetration\logs\WindowsSensor*.log" | select-string -pattern "Certificate verification error"

Enforcement

Linux

grep "Unable to validate the signing cert" /usr/local/tet/log/tet-enforcer.log

Windows (PowerShell)

get-content "C:\Program Files\Cisco Tetration\logs\WindowsSensor*.log" | select-string -pattern "Handshake failed"

If an SSL error is seen in the log checks above you can verify what certificate is being sent to the Agents with the following commands.

Explicit Proxy - where a proxy is configured in user.cfg

Linux

curl -v -x http://<proxy_address>:<port> https://<sensorVIP>:443

Windows (PowerShell)

cd "C:\Program Files\Cisco Tetration"
.\curl.exe -kv -x http://<proxy_address>:<port> https://<sensorVIP>:443

Transparent Proxy - No user.cfg proxy configuration required. It’s a proxy configured between all HTTP(S) traffic from agent to the internet.

Linux

openssl s_client -connect <sensorVIP from TaaS Portal>:443 -CAfile /usr/local/tet/cert/ca.cert

Windows (PowerShell)

cd "C:\Program Files\Cisco Tetration"
.\openssl.exe s_client -connect <sensorVIP from TaaS Portal>:443 -CAfile cert\ca.cert

You are looking for the following in the openssl s_client response:

Verify return code: 0 (ok)

If you see an error, examine the certificate. An example certificate (chain) should include only the following cert (CN IP is an example):

Certificate chain

0 s:/C=US/ST=CA/L=San Jose/O=Cisco Systems, Inc./OU=Tetration, Insieme BU/CN=129.146.155.109
i:/C=US/ST=CA/L=San Jose/O=Cisco Systems, Inc./OU=Tetration Analytics/CN=Customer CA

If you see additional certificates, then there is possibly a Web decrypting proxy between the agent and Secure Workload. Contact your security or network group and verify if the proxy bypass is configured using the listed IPs from the Configuring IP traffic for Agent Communications section.
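The two checks described above (the verify return code and the number of certificates in the chain) can be applied to captured openssl s_client output. The following Python sketch only parses text that you pass in. The function names are illustrative, and the chain-entry pattern assumes the numbered "s:" subject lines shown in the example.

```python
import re


def summarize_s_client(output):
    """Summarize `openssl s_client` output: verification result and chain length."""
    verify = re.search(r"Verify return code:\s*(\d+)\s*\(([^)]*)\)", output)
    # Chain entries look like " 0 s:/C=US/..." (index, then subject).
    chain_entries = re.findall(r"^\s*(\d+)\s+s:", output, re.MULTILINE)
    return {
        "verify_code": int(verify.group(1)) if verify else None,
        "verify_text": verify.group(2) if verify else None,
        "chain_length": len(chain_entries),
    }


def looks_intercepted(summary):
    """True if verification failed (or is missing) or more than one certificate was sent."""
    return summary["verify_code"] != 0 or summary["chain_length"] > 1
```

If looks_intercepted returns True, continue with the proxy bypass check described above.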

Windows sensor installation script fails on Windows 2016 servers: The following error message might appear: “The underlying connection was closed: An unexpected error occurred on a receive.” A possible reason is the SSL/TLS versions set in PowerShell.

To check the SSL/TLS versions running, run the following command:

[Net.ServicePointManager]::SecurityProtocol

If the output from the above command is:

Ssl3, Tls

Then use the below command to change the allowed protocols and retry the installation:

[Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]'Ssl3,Tls,Tls11,Tls12'

Agent operations

Q: I have installed the agents successfully, but I don’t see them on the UI Sensor Monitoring page.

A: An agent must register with the backend server running within the cluster before it can start operating. When an agent is not shown on the UI page, it is most likely because the registration has failed. There are a few things to check to see why a registration failed:

  • Check if the connection between the agent and the backend server is working properly

  • Check if the curl request could be sent to backend server properly

  • Check HAProxy access and backend server logs to see if the registration request made it to the server

  • Check the error return from curl request in the log file

Q: The agent is installed and I can find it on the UI page. However, the “SW Ver” column shows “initializing” instead of a version string.

A: After the agent is first installed and registered with the backend server, it takes another 30 minutes for the agent to report its version.

Q: The agent is upgraded properly, but the “SW Ver” field still shows the old version after a long time (like several hours).

A: After the agent is upgraded successfully, it sends a curl request to report its currently running version and to check for a new version in the same request. It is possible that the request did not reach the backend, for one of the following reasons:

  • The request timed out and the response was not received in time

  • The network has a problem and the agent could not connect to the backend servers

Q: I have an agent running on RHEL/CentOS-6.x and it is working properly. I am planning to upgrade the OS to RHEL/CentOS-7.x. Would the agent still work after the upgrade?

A: Currently, we do not support the scenario in which the OS has been upgraded, especially across major releases. For the agent to work after an OS upgrade, do the following steps:

  • Uninstall the existing agent software

  • Clean up all files, including certs

  • Go to UI, delete the agent entry

  • Upgrade the OS to the desired version

  • Install the agent software on the new OS

Q: I have an agent running on RHEL/CentOS-6.x and it is working properly. I am planning to rename the host. Would the agent still work after rename/reboot?

A: An agent identity is calculated based on the host’s uniqueness, including hostname and bios-uuid. Changing the hostname changes the host’s identity. It is recommended to do the following:

  • Uninstall the existing agent software

  • Clean up all files, including certs

  • Go to UI, delete the old agent entry

  • Rename the host and reboot

  • Install the agent software again

Q: On Windows host, firewall deviation was caused by adding/deleting/modifying a rule. How do I find the rule?

A: On deviation detection, the agent logs the last 15 seconds of firewall events to “C:\Windows\System32\config\systemprofile\AppData\Roaming\tet\firewall_events”. The rule that caused the deviation can be found in the latest file created as policy_dev_<policy id>_<timestamp>.txt
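Finding the latest deviation file can be scripted. The following Python sketch picks the most recently modified file that matches the policy_dev_ naming pattern in a given directory. Using the file modification time as the meaning of "latest" is an assumption, and the function name is illustrative.

```python
import glob
import os


def latest_deviation_file(events_dir):
    """Return the most recently modified policy_dev_*.txt in events_dir, or None."""
    candidates = glob.glob(os.path.join(events_dir, "policy_dev_*.txt"))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)
```

Call it with the firewall_events directory path given above, then open the returned file to find the rule that caused the deviation.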

Q: I have installed the agent on a Windows host successfully. Why do I not see any reported flows from the sensor?

A: Npcap is required to collect flows on a Windows host. Ten seconds after the agent is installed successfully, it installs Npcap. If the sensor does not report flows after several minutes, check whether the agent and the backend server are connected and whether Npcap is installed properly; see Npcap Issues.

Q: I have installed the agent on Windows host, 2008 R2, successfully. Why does the system clock drift when tetsensor service is running?

A: This is a known problem with Go and Windows 2008 R2. For more information, see Golang and Win2008 R2.

The process, tet-main.exe, running as a part of tetsensor service, is built using Go Version 1.15. That is why the system clock drifts when the tetsensor service is running.

This issue occurs when Windows 2008 R2 workload is configured to use the external NTP server or Domain Controller as NTP server.

Possible workarounds:

  1. Periodically force NTP to sync the clock: w32tm /resync /force

  2. Disable tet-main.exe manually.

    • Run cmd.exe with “administrator” privilege.

    • Run regedit.exe

    • Go to “HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Services\TetSensor”

    • Double click on “ImagePath”

    • Edit value, remove tet-main.exe

      before “C:\Program Files\Cisco Tetration\TetSenEngine.exe” TetSensor TetSen.exe “-f sensor_config” tet-main.exe ” ” TetUpdate.exe

      after “C:\Program Files\Cisco Tetration\TetSenEngine.exe” TetSensor TetSen.exe “-f sensor_config” TetUpdate.exe

    • Restart tetsensor service


      Note


      Disable tet-main.exe again after every agent upgrade.


  3. Remove external NTP server configuration:

    • Run command : w32tm /config /update /manualpeerlist: /syncfromflags:manual /reliable:yes

    • Restart Windows Time Service, W32Time
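For workaround 2, the edit to the ImagePath value amounts to removing the tet-main.exe entry together with the quoted argument that follows it. The following Python sketch performs only that string transformation and does not touch the registry. It assumes the value uses straight double quotes, and the function name is illustrative.

```python
import re


def remove_tet_main(image_path):
    """Drop the tet-main.exe entry and its quoted argument from an ImagePath value."""
    # Matches: tet-main.exe followed by an optional quoted argument string.
    return re.sub(r'\s*tet-main\.exe(\s+"[^"]*")?', "", image_path)
```

Applying it to the "before" value shown in workaround 2 gives the "after" value; a value that does not contain tet-main.exe is returned unchanged.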

Q: Why does agent rehoming fail when the internal DNS does not resolve public domains or when hosts use a proxy for external domain resolution?

A: Agent rehoming fails because of a Web Services Security (WSS) name resolution validation failure when the internal DNS does not resolve public domains or when hosts use a proxy for external domain resolution.

To resolve this issue, perform one of the following:

  • During the agent rehoming configuration, use the Sensor Virtual IP (VIP) instead of the Sensor VIP FQDN.

  • Update the hosts file on the workloads to include the Sensor VIP and FQDN. The hosts file can be located at the following paths:

    • Windows: C:\Windows\System32\drivers\etc\hosts

    • Linux/Unix: /etc/hosts

  • Add a temporary DNS record for the Sensor VIP FQDN in the internal DNS server.
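For the hosts file option, the entry is a single line that maps the Sensor VIP address to its FQDN. The following Python sketch builds that line and appends it to hosts-file text only if the FQDN is not already mapped. It works on strings and does not write to the system hosts file; the function names are illustrative, and the address and host name in the test are placeholders.

```python
def hosts_entry(sensor_vip, fqdn):
    """Build one hosts-file line mapping the Sensor VIP FQDN to its IP address."""
    return "%s\t%s" % (sensor_vip, fqdn)


def add_hosts_entry(hosts_text, sensor_vip, fqdn):
    """Return hosts_text with the mapping appended, unless the FQDN is already mapped."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        if len(fields) >= 2 and fqdn in fields[1:]:
            return hosts_text
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + hosts_entry(sensor_vip, fqdn) + "\n"
```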

Agent Troubleshooting Tool

The Agent troubleshooting tool allows you to troubleshoot common issues with your Agents in a Windows environment. The tool is a PowerShell script with several parameters, each of which troubleshoots a different aspect of your Agents:
  • -agentHealth: This option checks the health of your Agents and reports any issues that need to be addressed.

  • -agentRegistration: This option allows you to check for any known issues with agent registration.

  • -agentUpgrade: This option allows you to check for any known issues with agent upgrades.

  • -enforcementHealth: This option checks the overall enforcement health and ensures the latest policies are programmed.

  • -collectLogs: This option collects debug logs, which can be analyzed for further troubleshooting.

To run the Agent troubleshooting tool script, follow these steps:

Procedure


Step 1

Open Windows (PowerShell) as an Administrator.

Step 2

Navigate to the CSW installation directory (The default location of this directory is: "C:\Program Files\Cisco Tetration").

Step 3

Run the script using the following command:

.\AgentTroubleshooting.ps1

For example, to check the health of your Agents, run the script with the -agentHealth parameter:

.\AgentTroubleshooting.ps1 -agentHealth