Data Handling and Privacy for Cognitive Collaboration White Paper

Updated: January 17, 2020

Purpose

The purpose of this document is to explain the various aspects of data handling, privacy, and processing associated with Artificial Intelligence (AI) and Machine Learning (ML) features used within the Cisco Webex® portfolio.

Cisco® Cognitive Collaboration capabilities and products covered by this document include:

     Webex Assistant for conversational AI in Webex Devices

     Facial recognition for name labelling in Webex Meetings and Devices

     Machine learning-based noise detection and suppression in Webex Meetings and Devices

     People Insights in Webex Meetings and Webex Teams

     Meeting Transcription in Webex Meetings

This document offers a transparent view of data handling for the cognitive collaboration capabilities within Cisco Webex. Note that this document is not intended to be an overview of all aspects of AI features and focuses on data privacy and handling only. The content in this document is subject to change.

Guiding design principles for data privacy

For features and products using AI and ML, the overarching design principle is to never retain unnecessary Personally Identifiable Information (PII) from customer data. Because these ML-based features rely on data, some customer data may be required for a feature to operate. In that case, data is retained only where needed and for the shortest possible timeframe.

Read more information about Cisco and data privacy:

     Cisco online privacy statement

     Cisco data protection and privacy

Webex Assistant

Webex Assistant is a voice-controlled digital assistant that helps with many collaboration tasks. Webex Assistant is available on cloud-connected Webex Room Series Devices.

Enabling Webex Assistant

Webex Assistant is disabled by default and can be enabled in Webex Control Hub by checking the “Enable Webex Assistant” check box (Figure 1). This initiates a series of onboarding workflows that import user data from existing Webex microservices. User data is processed to model organizational and interaction distance for users. The imported raw data is processed only to generate these models and is not retained.

Figure 1. Enabling Webex Assistant

Webex Assistant uses the Google Speech engine for speech-to-text and text-to-speech. Once the text is generated, Webex Assistant uses the MindMeld conversational AI platform, which was recently open-sourced by Cisco, to process the user’s intent. During onboarding, the device endpoints are notified that Webex Assistant is enabled. The devices then fetch a speech engine token to authenticate with the Google Speech engine. At this point Webex Assistant is operational. Speech engine key rotation occurs on a weekly basis.

Runtime operation

The high-level data flow for Webex Assistant is shown in Figure 2. Webex Assistant is activated by the wake word, “OK, Webex.” The wake word is detected by the local active microphone on the device, and the detection process runs locally on the device. Once the wake word is detected, speech is streamed to the cloud for speech-to-text transcription. Because wake word processing is local to the device, no audio data is streamed to the cloud until the wake word is detected. This is reflected on-screen with state transitions, and the resulting real-time speech-to-text transcription is displayed to the user. Note that the wake word respects device features: for example, if mute is active on the device, the wake word cannot be detected.

Figure 2. Webex Assistant data flow and architecture
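The gating behavior described above can be illustrated with a short sketch. This is not Cisco's implementation: the `WakeWordGate` class, the use of text strings in place of audio frames, and the `is_wake_word` detector are all hypothetical stand-ins, and the choice not to stream the wake-word frame itself is an assumption. The sketch only shows the principle that nothing leaves the device until a local detector fires, and that mute prevents detection altogether.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WakeWordGate:
    """Illustrative gate: audio stays on the device until the wake word is heard."""

    # Hypothetical local detector; returns True when a frame contains the wake word.
    detector: Callable[[str], bool]
    muted: bool = False
    listening: bool = False
    # Stands in for the cloud uplink; only frames appended here "leave" the device.
    streamed: List[str] = field(default_factory=list)

    def on_frame(self, frame: str) -> None:
        if self.muted:
            # Mute is respected: no detection and no streaming.
            return
        if not self.listening:
            # Detection runs locally; this frame is never streamed (assumption).
            self.listening = self.detector(frame)
            return
        self.streamed.append(frame)

    def end_utterance(self) -> None:
        # Return to local-only listening after the request completes.
        self.listening = False


def is_wake_word(frame: str) -> bool:
    # Hypothetical text-based stand-in for an on-device audio model.
    return frame.strip().lower() == "ok, webex"


gate = WakeWordGate(detector=is_wake_word)
for frame in ["background chatter", "OK, Webex", "join my meeting"]:
    gate.on_frame(frame)
print(gate.streamed)  # ['join my meeting']
```

In this sketch the frames before and including the wake word never reach `streamed`, which mirrors the statement that no audio is sent to the cloud until the wake word is detected.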

The resulting text from the speech engine is returned to the Webex Assistant client on the endpoint device. Manual HTTP basic-authentication proxy is supported. The client securely manages functional interactions with Google Speech over Transport Layer Security (TLS) to and from the following endpoints:

     speech.googleapis.com:443

     texttospeech.googleapis.com:443

On-device token rotation for the speech engine occurs on a less-than-hourly basis. Speech audio data sent to the Google Speech engine is only processed. It is not retained by Google and is not used to train or improve Google’s speech accuracy as per the enterprise license agreement.
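As a rough illustration of on-device token rotation, the sketch below refreshes a cached token once it is older than a fixed interval. The `RotatingToken` class, the `fetch` callback, and the 50-minute interval are assumptions made for the example; the paper states only that rotation is "less than hourly" and does not describe the mechanism.

```python
import time
from typing import Callable, Optional

# Assumed interval; the paper says only that rotation is less than hourly.
ROTATION_SECONDS = 50 * 60


class RotatingToken:
    """Caches a token and replaces it once it is older than ROTATION_SECONDS."""

    def __init__(
        self,
        fetch: Callable[[], str],                      # hypothetical token source
        clock: Callable[[], float] = time.monotonic,   # injectable for testing
    ) -> None:
        self._fetch = fetch
        self._clock = clock
        self._token: Optional[str] = None
        self._issued_at = 0.0

    def get(self) -> str:
        now = self._clock()
        expired = now - self._issued_at >= ROTATION_SECONDS
        if self._token is None or expired:
            # Fetch a fresh token and restart the rotation timer.
            self._token = self._fetch()
            self._issued_at = now
        return self._token
```

Passing the clock in as a parameter keeps the rotation logic testable without waiting for real time to elapse.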

Webex Assistant can be turned on or off from the device using the Webex Touch 10 controller by selecting the device name and settings, under advanced settings. If an individual user leaves the organization, that user will be removed from the index, but data that has been anonymized may be retained. The retained data exists in a way that it is not associated with any individual user. If the entire service is disabled in Control Hub (opted out) for a given organization, it results in the deletion of all organization-specific machine learning data and models. No machine learning data relating to that organization is retained by Webex Assistant.

Face recognition

Facial recognition is used to recognize individuals and display their name labels in meetings. Organizations and individual end users must opt in to use facial recognition.

Service enrollment

As consent of both the organization and end user is desired, facial recognition is disabled by default and must be enabled by an organization’s administrator.

The Webex Control Hub screen in Figure 3 shows how the service is enabled and opted into by an organization admin. As the feature is in field trial currently, the functionality represented in Figure 3 is subject to change. If an organization admin disables facial recognition it will result in all data for all users being deleted.

Figure 3. Enabling face recognition

Figure 3a. Invite users to opt-in to facial recognition

User enrollment

Users must enroll for facial recognition. This serves two purposes:

1.     The user takes an intentional step to opt in to the feature

2.     The user provides an image of their face for the system to generate a feature vector

The enrollment step is shown in Figures 4 through 8. Figure 4 shows the feature description and Figure 5 shows the data privacy policy, both of which explain to the user how data is handled for use with facial recognition.

Figure 4. Face recognition enrollment page

Figure 5. Opt-in data notice

When the user begins the enrollment process, their browser will request consent to use the camera in order to capture the face image, as shown in Figure 6.

Figure 6. Browser request to use the camera

Once the camera is available, the user interface will ask the user to position their face in an area to capture the enrollment face image, as shown in Figure 7.

Figure 7. Image enrollment

Once the image is successfully captured, the user is informed that they are successfully enrolled and where they can go to access and control their enrollment data, as shown in Figure 8.

Figure 8. Enrollment completion

When a user returns to their settings page, they can toggle the name label feature on or off, contribute new photos, and delete their data. This is illustrated in Figure 9.

Figure 9. User data controls

If a user decides to delete their data, they are informed of how the data is handled, as shown in Figure 10.

Figure 10. Deletion information

Enrolled images are stored securely, encrypted in a Webex microservice data store, for as long as the user chooses to contribute them. If the user leaves the organization, their data is automatically deleted.

Once a user is enrolled, the images they contribute are processed to generate a feature vector for the user’s face. The enrolled image is stored in the Cisco Webex cloud so that if an optimized neural network model is deployed, the resulting improvements will occur without the user having to re-enroll in the service. The neural network models are pre-trained by Cisco and do not contain customer data. The enrolled image is used only in the context of the user’s organization. In the future, Cisco may introduce an opt-in feature provided to customers under admin control for enrolled users to provide feedback on the quality and accuracy of the facial recognition feature.

In operation, when a face is detected, a feature vector for the face is calculated by inferencing on the device. The feature vector is matched against known vectors from the enrollment process. When a match is found, the corresponding name data is retrieved and a name label is displayed for the user. This process is outlined in Figure 11.

Figure 11. Face recognition relative to enrollment

It is important to note that outside of enrollment, images of the users are not stored in the Cisco cloud. Once a vector match is detected, the image is deleted and does not persist after a given call. Facial recognition was designed this way, specifically with data privacy in mind.
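The matching step can be sketched as a nearest-neighbor lookup over enrolled feature vectors. The paper does not say which similarity measure or threshold is used, so cosine similarity, the `MATCH_THRESHOLD` value, and the function names below are assumptions chosen purely for illustration.

```python
import math
from typing import Dict, Optional, Sequence

# Assumed threshold; the actual matching criterion is not described in the paper.
MATCH_THRESHOLD = 0.8


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_name(
    probe: Sequence[float],
    enrolled: Dict[str, Sequence[float]],
    threshold: float = MATCH_THRESHOLD,
) -> Optional[str]:
    """Return the enrolled name whose vector best matches the probe, if any."""
    best_name, best_score = None, threshold
    for name, vector in enrolled.items():
        score = cosine_similarity(probe, vector)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy three-dimensional vectors; real feature vectors are far larger.
enrolled = {"Alice": [1.0, 0.0, 0.0], "Bob": [0.0, 1.0, 0.0]}
print(match_name([0.9, 0.1, 0.0], enrolled))  # Alice
print(match_name([0.5, 0.5, 0.7], enrolled))  # None
```

Only vectors and names take part in the lookup, which is consistent with the statement that images are not stored outside enrollment. An unmatched face simply yields no label.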

Background noise detection

Noise detection in Webex Meetings clients and Room Series endpoints applies pre-trained supervised machine learning models locally on the client or device to identify specific background noises in the media path. If a noise is detected for a Webex Meetings user on a personal device, like a laptop or PC, the user is notified of the background noise with the suggestion to mute. When a noise is detected on a Webex room device, the audio is automatically suppressed until someone in the room begins speaking again. Detection and suppression run locally on the client or device and not as a cloud service. No data goes to the cloud for noise detection or suppression. A trained model based on samples of known background noises is deployed directly on the client or device, as shown in Figure 12.

Figure 12. Noise detection operation
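The two behaviors described above (a mute suggestion on personal devices, automatic suppression on room devices) can be sketched as follows. The frame labels `"speech"`, `"noise"`, and `"silence"` stand in for the output of a hypothetical on-device classifier; the enum and class names are likewise illustrative and not taken from any Webex API.

```python
from enum import Enum


class DeviceKind(Enum):
    PERSONAL = "personal"   # laptop or PC running a Webex Meetings client
    ROOM = "room"           # Webex room device


class Action(Enum):
    NONE = "none"
    SUGGEST_MUTE = "suggest_mute"
    SUPPRESS_AUDIO = "suppress_audio"


def action_for(label: str, device: DeviceKind) -> Action:
    """Map a locally produced frame label to the action for that device type."""
    if label != "noise":
        return Action.NONE
    if device is DeviceKind.PERSONAL:
        return Action.SUGGEST_MUTE
    return Action.SUPPRESS_AUDIO


class RoomSuppressor:
    """Keeps suppressing after noise until speech is heard again."""

    def __init__(self) -> None:
        self.suppressing = False

    def update(self, label: str) -> bool:
        if label == "noise":
            self.suppressing = True
        elif label == "speech":
            self.suppressing = False
        # Any other label (for example "silence") leaves the state unchanged.
        return self.suppressing
```

Everything in the sketch operates on local labels only; nothing is sent anywhere, matching the statement that no data goes to the cloud for this feature.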

People insights

People Insights provides detailed profile and company information in Webex Meetings and Webex Teams. It is important that users have control and edit capabilities for their profiles. The levels of control granularity and edit features are highlighted in the next section.

Data sources

People Insights profiles are created from two primary categories of data: 1) public data gathered from across the web; and 2) corporate directory data (e.g., Active Directory).

Public data

Public data is gathered by crawling billions of pages across the web, discovering pages containing professional information, and applying artificial intelligence algorithms to extract, label, and structure that data. All such collected data is run through a clustering process to determine which data points belong together as part of the same person’s profile. For example, if there are 10,000 pieces of data labeled “John Smith,” People Insights algorithms determine which groups of data points belong to an individual, such as “John Smith, the accountant at company XYZ,” to create a People Insights profile that is distinct from all the other John Smiths in the world. Clustered data points are continually combined, and any conflicting information is reconciled, to produce the end result: a rich personal and professional overview that combines disparate data sources from across the web.
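To make the idea of clustering same-name data points more concrete, the sketch below groups records that share a name and at least one other attribute value, using a union-find structure. This is a deliberately simplified stand-in: the paper does not describe the actual People Insights algorithm, and the record layout and linking rule here are assumptions.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

# A record is (name, set of attribute values such as employer or city).
Record = Tuple[str, FrozenSet[str]]


def cluster(records: List[Record]) -> List[List[int]]:
    """Group record indices that share a name and at least one attribute value."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen: Dict[Tuple[str, str], int] = {}
    for idx, (name, attrs) in enumerate(records):
        for attr in attrs:
            key = (name, attr)
            if key in first_seen:
                # Same name and a shared attribute: merge the two groups.
                parent[find(idx)] = find(first_seen[key])
            else:
                first_seen[key] = idx

    groups: Dict[int, List[int]] = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


records = [
    ("John Smith", frozenset({"XYZ", "accountant"})),
    ("John Smith", frozenset({"XYZ", "Chicago"})),
    ("John Smith", frozenset({"ABC", "painter"})),
    ("John Smith", frozenset({"Chicago"})),
]
print(cluster(records))  # [[0, 1, 3], [2]]
```

Records 0, 1, and 3 end up in one profile because they are linked through shared values, while record 2 forms a separate "John Smith". A production system would also need to weigh and reconcile conflicting evidence, which this sketch does not attempt.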

It is important to note that all webpages used for data collection are fully publicly available. No LinkedIn data or any data that sits behind security barriers such as logins or paywalls is used in collating the public data.

When users actively interact with the People Insights feature (for example, by loading their profile in people.webex.com or viewing People Insights data in Webex Meetings), we initiate a targeted discovery process to specifically enrich their profile and ensure we have discovered and ingested the most up-to-date public sources of data for that user.

Note, however, that the user must actively interact with their People Insights profile to trigger this process. Merely belonging to an organization in which the feature is enabled or joining a meeting which is equipped with People Insights will not trigger this process.

Corporate directory data

In order to activate People Insights for an organization, customers are required to synchronize their corporate directory data. When this is done, public and user-entered data will integrate with the customer’s directory and present an enriched profile that provides both the publicly sourced data and directory information. If enabled, this functionality allows users to see internal titles, internal contact information, and reporting structures only for colleagues at their own organization. If a user joins a Webex Meeting or is active on Webex Teams but is not part of the organization that has enabled People Insights (i.e., an external user), they will not see any directory data.

External users will still be able to see an individual’s profile that has been populated with publicly available information, or information that the individual has updated on people.webex.com, as long as that individual has not hidden their profile. Data for user profiles is sourced from the user organization’s corporate directory and from data gathered from the web using the People Insights engine and algorithms to extract, label, and structure the data into profiles. Only publicly available data sources are used. Corporate directory information is used where an organization has synced its corporate directory to add to user profiles. Profile data typically can be changed in these ways:

     A user updates their profile

     Corporate directory data is updated

     A company engine makes updates from publicly available information (e.g., news)

Data is stored and encrypted in virtual private cloud datastores and is also encrypted in transit (AES 256 encryption for storage and TLS for transit).

Data management

Cisco understands the importance of privacy and security in any product that handles and displays data on people. To that end, Cisco has taken measures to ensure that all data is stored and processed securely. We also emphasize that the end user should be in control of their data.

Data storage and security

All data, including public and corporate directory data, is encrypted, both in transit and at rest. The public and corporate directory data are stored in separate databases in separate Virtual Private Clouds (VPCs) to ensure that there can be no unintentional overlap or integration of the data sources.

Data encryption keys are managed through the Amazon Web Services (AWS) Key Management System (KMS). The KMS configuration is managed by a restricted set of Cisco Webex engineers. The public and private data sources have separate keys to further ensure secure data separation.

Data is end-to-end encrypted from server to browser.

Directory data will only be shown to other members of the same organization, maintaining data privacy.
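The visibility rules stated in this document (directory data only for members of the same organization, public profile data for anyone unless the profile is hidden) can be expressed as a small function. The `Profile` type and field names are hypothetical, and the behavior for a colleague viewing a hidden profile is an assumption, since the paper does not spell it out.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Profile:
    org_id: str
    public: Dict[str, str] = field(default_factory=dict)     # web-sourced or user-edited
    directory: Dict[str, str] = field(default_factory=dict)  # synced corporate directory
    hidden: bool = False


def visible_fields(profile: Profile, viewer_org_id: str) -> Dict[str, str]:
    """Return the fields a viewer from the given organization may see."""
    view: Dict[str, str] = {}
    if not profile.hidden:
        # Public profile data is shown unless the owner has hidden the profile.
        view.update(profile.public)
    if viewer_org_id == profile.org_id:
        # Directory data never leaves the owner's organization.
        # Assumption: colleagues still see it when the public profile is hidden.
        view.update(profile.directory)
    return view
```

For example, a viewer from another organization calling `visible_fields` on a visible profile receives only the `public` entries, and receives an empty result once the profile is hidden.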

User data encryption

Table 1 outlines sources of personal data and how personal data is encrypted.

Table 2 details the retention policy.

Table 1. Personal data processing and encryption for People Insights

Data processed by: Public data; Directory data; User-generated information

Type of encryption: TLS encryption for transit, AES 256 for storage; keys managed through AWS KMS

Table 2. Data retention policy for People Insights

Type of personal data: Publicly available business and professional data

Retention period:
     Obtained from public websites: indefinite
     Obtained through third-party APIs: in accordance with contractual requirements

Criteria for the retention: Publicly available business and professional data is derived from public sources. It is retained indefinitely by default. Upon request, publication and links to source data can be suppressed and restricted from processing.

As publicly available data originates from outside of Cisco, any permanent changes or deletions must be addressed and requested with the primary source.

At the request of users, the data can be archived so that it does not appear. This allows the data to remain permanently hidden rather than reappearing with a new search after being purged previously.

Type of personal data: Directory data

Retention period:
     Active subscriptions: at the customer’s discretion
     Deactivated accounts: deleted within 30 days

Criteria for the retention: Directory data from People Insights will be hard deleted in the case of deactivation.

Administrators can deactivate People Insights by toggling off the People Insights feature from Control Hub. Disabling the Active Directory integration will also result in the deactivation of People Insights. Public, non-directory data will remain in the People Insights database, but the People Insights feature will no longer appear in your Webex applications.

Non-directory data will remain, with the exception of name and email for users who had only directory data in their profile before deactivation.

Type of personal data: User-generated information

Retention period:
     Active subscriptions: at the customer’s or user’s discretion
     Deactivated accounts: deleted within 30 days

Criteria for the retention: Users can delete user-generated information (e.g., edits to their employment information) from their profile at any time. Once deleted, it will be fully purged from the system.
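The retention periods in Table 2 can be read as a simple rule per data type. The sketch below encodes that reading; the function name and the data-type strings are hypothetical, and data obtained through third-party APIs is left out because its retention depends on contract terms that the paper does not list.

```python
from datetime import date, timedelta
from typing import Optional

# From Table 2: deactivated accounts are deleted within 30 days.
DEACTIVATION_WINDOW = timedelta(days=30)


def latest_purge_date(data_type: str, deactivated_on: Optional[date]) -> Optional[date]:
    """Return the latest date by which data must be deleted, or None if no deadline applies."""
    if data_type == "public_web":
        # Retained indefinitely by default.
        return None
    if data_type in ("directory", "user_generated"):
        if deactivated_on is None:
            # Active subscription: retention is at the customer's or user's discretion.
            return None
        return deactivated_on + DEACTIVATION_WINDOW
    raise ValueError(f"unknown data type: {data_type}")


print(latest_purge_date("directory", date(2020, 1, 17)))  # 2020-02-16
```

The returned date is an upper bound ("within 30 days"), not a scheduled deletion time.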

Profile, edit, and deletion controls

A user can view their public profile from https://people.webex.com/, as shown in Figure 13.

Figure 13. Viewing a user profile

Figures 14 and 15 show the user’s directory profile. As this information comes from the organization’s corporate directory, a user may not edit this information. This profile information is available to users within a given organization only.

Figure 14. User directory data

Figure 15. User directory data continued

Figures 16 and 17 show supplementary profile and company information for the user. This information may be edited and controlled by the user, shown in more detail in the next section.

Figure 16. User role and education information

Figure 17. News

User profile control and edit

A user can control visibility to profile information using the “Hide” button for their entire profile or for portions of their profile. They can also edit their profile name, title, and biographical information, as shown in Figure 18.

Figure 18. Editing a user’s profile

User role details, such as position, title, and timing, can be edited or hidden, as shown in Figure 19.

Figure 19. Editing position information

Similarly, a user can edit, add, and control visibility of their education details, as shown in Figure 20.

Figure 20. Editing education details

Additional profile and social media links can be edited and controlled, as shown in Figure 21.

Figure 21. Editing links and social media details

Finally, visibility to related news can be controlled using the setting shown in Figure 22.

Figure 22. News controls

Editing a profile on people.webex.com

In addition to editing a profile within the Webex Meetings client, you can also edit profile information using the https://people.webex.com portal. A user can see their profile as it would be viewed internally by a co-worker and externally as a public view. On this page a user can also see the data and privacy statement and choose to hide or edit their profile.


If a user chooses to edit their profile, they are taken to the edit page where they can edit the following aspects of their profile:

     Profile picture

     Name

     Headline

     Biography

     Current and past positions

     Education

     Links

These profile elements, except for name, can be individually hidden. Additionally, the company news element, which is not editable, can be hidden if desired.
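The edit and hide rules for profile elements can be summarized in a few lines. The element identifiers below are hypothetical names for the items listed above; the rules themselves (every listed element is editable, name cannot be hidden, company news can be hidden but not edited) come from the text.

```python
from typing import Iterable, List

# Hypothetical identifiers for the profile elements listed in this section.
EDITABLE = frozenset({
    "profile_picture", "name", "headline", "biography",
    "positions", "education", "links",
})
VIEW_ONLY = frozenset({"company_news"})  # can be hidden but not edited
NEVER_HIDDEN = frozenset({"name"})       # the one element that cannot be hidden


def can_edit(element: str) -> bool:
    return element in EDITABLE


def can_hide(element: str) -> bool:
    return element in (EDITABLE | VIEW_ONLY) - NEVER_HIDDEN


def visible_elements(hidden: Iterable[str]) -> List[str]:
    """Return the elements still shown after the user hides the given ones."""
    hidden_set = set(hidden)
    not_allowed = sorted(e for e in hidden_set if not can_hide(e))
    if not_allowed:
        raise ValueError(f"cannot hide: {not_allowed}")
    return sorted((EDITABLE | VIEW_ONLY) - hidden_set)
```

Calling `visible_elements({"biography", "company_news"})` returns every element except those two, while an attempt to hide `"name"` raises an error.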

Webex Teams client

People Insights profiles can also be accessed and edited from Webex Teams via the user’s profile.


The edit controls from the Webex Teams client will allow the user to edit their profile in people.webex.com, as shown in the previous section.


People Insights summary

Providing this level of user editing and control is important to allow users to interact with their profile data. In addition to these control settings, a user may also choose to hide their entire profile or may request full deletion of their profile by opening a case with the Cisco Technical Assistance Center (TAC) or by sending an email message to privacy@cisco.com.

Meeting transcription

Meeting transcription available in Webex Meetings provides a full, searchable transcription of recorded meetings with the ability to jump to any location in the video recording from the transcript. Meeting transcription is provided by Voicea, which was acquired by Cisco. The data handling aspects described in this section refer to the pre-acquisition integration between Webex Meetings and Voicea. Additional information will be added to this section in a future release of this paper to cover the integrated offering and future roadmap for meeting transcription.


Meeting transcription data flow

The transcription data flow is shown in Figure 23. When a meeting recording is being processed, the audio is sent securely to Voicea for transcription. The resulting transcript is returned and merged with the meeting recording. On return, a deletion command is issued. No meeting content data is retained by Voicea. The only resulting artifact is the combined recording and transcript in Webex.

Data flow between Webex and Voicea is encrypted in transit and at rest. Over-the-wire encryption uses 2048-bit RSA keys. At rest, Voicea encrypts files using 256-bit Advanced Encryption Standard (AES-256).

Figure 23. Meeting transcription data flow
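The send, return, merge, and delete sequence described above can be sketched as follows. `TranscriptionService` is a hypothetical in-memory stand-in for the external transcription provider, not a real client library, and the placeholder transcript text is invented for the example. Issuing the deletion in a `finally` block (so that it also runs if transcription fails) is an assumption that goes slightly beyond what the paper states.

```python
from typing import Dict


class TranscriptionService:
    """Hypothetical stand-in for the external transcription provider."""

    def __init__(self) -> None:
        self._jobs: Dict[str, bytes] = {}

    def submit(self, job_id: str, audio: bytes) -> None:
        self._jobs[job_id] = audio

    def transcript(self, job_id: str) -> str:
        # Placeholder result; a real service would return recognized speech.
        return f"<transcript of {len(self._jobs[job_id])} audio bytes>"

    def delete(self, job_id: str) -> None:
        self._jobs.pop(job_id, None)

    def holds(self, job_id: str) -> bool:
        return job_id in self._jobs


def transcribe_recording(recording: dict, service: TranscriptionService) -> dict:
    """Send audio out, merge the returned transcript, then delete the remote copy."""
    job_id = recording["id"]
    service.submit(job_id, recording["audio"])
    try:
        text = service.transcript(job_id)
    finally:
        # Deletion command issued on return (and, by assumption, on failure too).
        service.delete(job_id)
    # The only remaining artifact is the recording combined with its transcript.
    return {**recording, "transcript": text}
```

After `transcribe_recording` returns, `service.holds(job_id)` is false, which corresponds to the statement that no meeting content is retained by the transcription provider.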

Site admin controls

Meeting transcription is disabled by default. Controls for meeting transcription are offered at the site and user level, as shown in Figure 24. Transcription only occurs on recorded meetings.

Figure 24. Site admin controls
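The conditions under which a meeting is transcribed can be written as a single predicate. The function and parameter names are hypothetical, and requiring both the site-level and the user-level control to be on is an assumption about how the two levels combine. The defaults reflect that transcription is disabled by default and applies only to recorded meetings.

```python
def should_transcribe(
    site_enabled: bool = False,      # site-level control, off by default
    user_enabled: bool = False,      # user-level control, off by default
    meeting_recorded: bool = False,  # transcription applies only to recordings
) -> bool:
    """Return True only when both controls are on and the meeting was recorded."""
    return site_enabled and user_enabled and meeting_recorded


print(should_transcribe())                    # False
print(should_transcribe(True, True, True))    # True
print(should_transcribe(True, True, False))   # False
```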

Note: Data handling and privacy are a key design focus post-acquisition, resulting in an even more favorable data privacy posture for customers.

Telemetry and metrics

Log and event telemetry data for diagnostic and metrics purposes are collected from the clients and microservices described in this paper. This is standard operation for cloud-connected services. All telemetry and metrics are stored in the Webex cloud for internal product quality purposes on a per-organization basis. Access to internal telemetry and analytics tools is authentication-controlled. This telemetry data is also processed in order to create analytics and troubleshooting features for customers (administrators), which are accessed via Control Hub.

Examples of internal aggregated metrics for Webex Assistant are shown in Figure 25.

Figure 25. Internal aggregated metrics examples

Summary

While machine learning features are data-driven in nature, it should be clear that Cisco’s Cognitive Collaboration features make every attempt to minimize interaction with customer data. Where supervised machine learning is used, the initial datasets used to create features have been sourced and generated by Cisco. As deeper integration with customer environments becomes required in the future, this is likely to change. Advances in technology, such as unsupervised machine learning, will also drive the need for change. This paper will be updated to reflect the impact of such technology advances and will discuss expanded product coverage, including customer journey and transcription. For now, with features and products using artificial intelligence and machine learning, the overarching design principle is to not retain customer data for feature accuracy purposes. It is critical that data privacy policy evolves to keep pace with advances in artificial intelligence and machine learning, as additional data may be required for feature operation in the future. Should that become the case, data would be retained only where needed and for the shortest possible time.
