Cisco MediaSense Design Guide, Release 10.5
Characteristics and Features

Compliance Recording

In compliance recording, calls are configured to always be recorded.

For Unified Communication Manager controlled recording, all calls received by or initiated by designated phones are recorded. Individual lines on individual phones are enabled for recording by configuring them with an appropriate recording profile in Unified Communications Manager. Each line can also be configured as either Network-Based Recording (NBR) preferred or Built-in-Bridge (BiB) recording preferred.

For Unified Border Element dial peer recording, all calls passing through the Unified Border Element that match particular dial peers (typically selected by dialed number pattern) are recorded. MediaSense itself does not control which calls are recorded (except to the limited extent described in Incoming Call Handling Rules).

Compliance recording differs from selective recording because in selective recording, the recording server determines which calls it will record. MediaSense itself does not support selective recording, but the effect can be achieved by deploying MediaSense in combination with certain partner applications.

Recording is accomplished by media forking: the phone or Unified Border Element sends a copy of the incoming and outgoing media streams to the MediaSense recording server. When a call originates or terminates at a recording-enabled phone, Unified Communications Manager sends a pair of SIP invitations to both the phone and the recording server, and the recording server prepares to receive a pair of Real-time Transport Protocol (RTP) streams from the phone. Similarly, when a call passes through a recording-enabled Unified Border Element, the Unified Border Element device sends a SIP invitation to the recording server and the recording server prepares to receive a pair of RTP streams from the Unified Border Element. Finally, under NBR, Unified Communications Manager sends a pair of SIP invites to the recording server and a special message to the Unified Border Element, which then sends a pair of RTP streams to the recording server.

This procedure has several consequences:

  • Each recording session consists of two media streams (one for media flowing in each direction). These two streams are captured separately on the recorder, though both streams (or tracks) end up on the same MediaSense recording server.
  • Most Cisco IP phones support media forking. The IP phones that do not support media forking cannot be used for phone-based recording.
  • Though the phones can fork copies of media, they cannot transcode. This means that whatever codec is negotiated by the phone during its initial call setup is the codec used in recording. MediaSense supports a limited set of codecs; if the phone negotiates a codec that is not supported by MediaSense, the call will not be recorded. The same is true for Unified Border Element recordings.
  • The recording streams are set up only after the phone's primary conversation is fully established, which could take some time to complete. Therefore, there is a possibility of clipping at the beginning of each call. Clipping is typically limited to less than two seconds, but it can be affected by overall Unified Border Element, Unified Communications Manager, and MediaSense load; as well as by network performance characteristics along the signaling link between Unified Border Element or Unified Communications Manager and MediaSense. MediaSense carefully monitors this latency and raises alarms if it exceeds certain thresholds.

MediaSense does not initiate compliance recording. It only receives SIP invitations from Unified Communications Manager or Unified Border Element and is not involved in deciding which calls do or do not get recorded. The IP phone configuration and the Unified Border Element dial peer configuration determine whether media should be recorded. In some cases, calls may be recorded more than once, with neither Unified Border Element, Unified Communications Manager, nor MediaSense being aware that it is happening.

The above scenario might occur if all contact center agent IP phones are configured for recording and one agent calls another agent. It might also occur if a call passes through a Unified Border Element dial peer that is configured for recording and lands at a phone that is also configured for recording. The Unified Border Element could end up creating two recordings of its own. However, MediaSense stores enough metadata that a client can invoke a query to locate duplicate calls and selectively delete the extra copy.
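As a sketch of how a client might locate such duplicates, the following Python fragment groups sessions whose participant extensions and start times match. The field names follow the sample .json metadata shown later in this chapter; the grouping heuristic itself is an illustrative assumption, not part of the MediaSense API.

```python
from collections import defaultdict

def find_duplicate_sessions(sessions):
    """Group sessions that appear to record the same conversation.

    Heuristic: sessions whose participant extensions (deviceRef) are
    identical and whose start times fall in the same 5-second bucket
    are treated as copies of one call. A real client would fetch the
    session metadata through the MediaSense query API first.
    """
    groups = defaultdict(list)
    for s in sessions:
        extensions = frozenset(
            p["deviceRef"] for t in s["tracks"] for p in t["participants"]
        )
        bucket = s["sessionStartDate"] // 5000  # 5-second window, epoch ms
        groups[(extensions, bucket)].append(s["sessionId"])
    # Any group with more than one session holds candidate duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]
```

A client could keep the first session of each group and delete the extra copies through the API.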

At this time, only audio streams can be forked under Unified Communications Manager control, either by BiB or NBR. Unified Border Element dial peer recording can be configured to fork both audio and video, or to fork only the audio tracks in a video call. Videos can also be recorded using the Direct Inbound or Outbound mechanisms of MediaSense.

MediaSense can record calls of up to eight hours in duration.

Conferences and Transfers

MediaSense recordings are made up of one or more sessions where each media forking session contains two media streams: one for incoming and one for outgoing data. A simple call consisting of a straightforward two-party conversation is represented entirely by a single session. MediaSense uses metadata to track which participants are recorded in which track of the session, as well as when they entered and exited the conversation. MediaSense cannot always track this data when conferences are involved.

When sessions include transfer and conference activities, MediaSense tries to retain the related information in its metadata. If a recording is divided into multiple sessions, metadata is also available to help client applications correlate those sessions.


A multi-party conference is also represented by a single session with one stream in each direction, with the conference bridge combining all but one of the parties into a single MediaSense participant. There is metadata to identify that one of the streams represents a conference bridge, but MediaSense does not receive the full list of parties on the conference bridge.


Transfers function differently depending on whether the call is forked from a Unified Communications Manager phone or from a Unified Border Element.

With Unified Communications Manager 10.0 and later, any transfer drops the current session and starts a new one. In earlier versions of Unified Communications Manager, if the forking phone is not the one transferring, then the session remains intact.

With Unified Border Element forking, the situation is more symmetric. Unified Border Element is an intermediary network element and neither party is an anchor. Transfers on either side of the device are usually accommodated within the same recording session. (For more information, see Solution-Level Deployment Models.)

Hold and Pause

Hold and pause are two concepts that sound similar, but they are not the same.

  • Hold (and resume) takes place as a result of a user pressing a key on his or her phone. MediaSense is a passive observer.
  • Pause (and resume) takes place as a result of a client application issuing a MediaSense API request to temporarily stop recording while the conversation continues.

The Hold operation differs depending on which device is in control of the forking. In Unified Communications Manager deployments (BiB or NBR recording), one party places the call on hold, blocking all media to or from that party's phone while the other phone typically receives music on hold (MOH). If the forking phone is the one that invokes the hold operation, Unified Communications Manager terminates the recording session and creates a new recording session once the call is resumed. Metadata fields allow client applications to gather together all of the sessions in a given conversation.

If the forking phone is not the one that invokes the hold operation, the recording session continues without a break and even includes the music on hold, if it is unicast (multicast MOH does not get recorded).

For deployments where Unified Communications Manager phones are configured for selective recording, there must be a CTI (TAPI or JTAPI) client that proactively requests Unified Communications Manager to begin recording any given call. The CTI client does not need to retrigger recording in the case of a hold and resume.

For Unified Border Element dial peer deployments, hold and resume are conveyed directly through SIP, but the SIP protocol has no explicit concept of hold and resume. Instead, these operations are represented as media stream inactivity events. MediaSense captures these events in its metadata and makes them available to application clients, but the recording session continues uninterrupted.


The Pause feature allows applications such as Customer Relationship Management (CRM) systems or VoiceXML-driven IVR systems to automatically suppress recording of sensitive information based on the caller's position in a menu or scripted interaction. Pause is invoked by a MediaSense API client to temporarily stop recording, and the subsequent playback skips over the paused segment. MediaSense does store the information in its metadata and makes it available to application clients.

Pause functions identically for Unified Border Element and Unified Communications Manager recording.

Direct Inbound Recording

In addition to compliance recording controlled by a Unified Border Element or a Unified Communications Manager recording profile, recordings can be initiated by directly dialing a number associated with a MediaSense server configured for automatic recording. These recordings are not carried out through media forking technology and therefore are not limited to Unified Border Element or Cisco IP phones, nor are they limited to audio media. This mechanism can be used, for example, for video blogging.

Direct Outbound Recording

Using the MediaSense API, a client requests MediaSense to call a phone number. When the recipient answers, the call is recorded similarly to the way it is recorded when a user dials the recording server in a direct inbound call. The client can be any device capable of issuing an HTTP request to MediaSense, such as a call me button on a web page. Any phone, even a non-IP phone (such as a home phone), can be recorded if its media is converted to IP using a supported codec. Supported IP video phones can also be recorded in this way.

Direct outbound recording is only supported if MediaSense can reach the target phone number through a Unified Communications Manager system. In Unified Border Element-only deployments where Unified Communications Manager is not used for call handling, direct outbound recording is not supported.


Monitoring

While a recording is in progress, the session can be monitored using a third-party streaming-media player or the built-in media player in MediaSense.

To monitor a call from a third-party streaming-media player, a client must open a Real Time Streaming Protocol (RTSP) URI, supplying HTTP-BASIC credentials and handling a 302 redirect. The client can obtain the URI either by querying the metadata or by capturing session events.

MediaSense offers an HTTP query API that allows suitably authenticated clients to search for recorded sessions based on many criteria, including whether the recording is active. Alternatively, a client may subscribe for session events and receive MediaSense Symmetric Web Service (SWS) events whenever a recording is started (among other conditions). In either case, the body passed to the client includes a large amount of metadata about the recording, including the RTSP URI to be used for streaming.
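As an illustrative sketch, a client receiving such a body might pull out the RTSP URI and prepare the HTTP-BASIC credentials as shown below. The field names follow the sample .json metadata shown in the Archiving section; the host name in the example body is invented.

```python
import base64
import json

def extract_rtsp_uri(body: str) -> str:
    """Read the streaming URI from a session event or query result.

    Assumes a JSON body whose "urls" object carries "rtspUrl", as in
    the sample session metadata shown in the Archiving section.
    """
    return json.loads(body)["urls"]["rtspUrl"]

def basic_auth_header(user: str, password: str) -> str:
    """HTTP-BASIC credentials to present when opening the RTSP URI."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"

# Example body with a hypothetical host name:
body = '{"urls": {"rtspUrl": "rtsp://ms-node.example.com/archive/214ef2fbc3b41"}}'
print(extract_rtsp_uri(body))  # rtsp://ms-node.example.com/archive/214ef2fbc3b41
```

The same credential handling applies to the .mp4/.wav and raw-download URLs described later in this chapter.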

The third-party streaming-media players that Cisco has tested for MediaSense are VLC and RealPlayer. Each of these players has advantages and disadvantages that should be taken into account when selecting which one to use.

Recording sessions are usually made up of two audio tracks. MediaSense receives and stores them that way and does not currently support real-time mixing.

VLC can only play one track at a time. The user can alternate between tracks but cannot hear both simultaneously. VLC is open source and is easy to embed into a browser page.

RealPlayer can play the two streams as stereo (one stream in each ear), but its buffering algorithms for slow connections sometimes result in misleading periods of silence for the listener. People are more or less used to such delays when playing recorded music or podcasts, but call monitoring is expected to be real time, and significant buffering delays are inappropriate for that purpose.

Neither of these players can render AAC-LD, g.729, or g.722 audio. A custom application must be created to monitor or play streams in those formats.

MediaSense's built-in media player is accessed by a built-in Search and Play application. This player covers more codecs and can play both streams simultaneously, but it does not support the AAC-LD audio codec, or in some cases, the g.729 codec. These features apply to both playback of recorded calls and monitoring of active calls.

Only calls that are being recorded are available to be monitored. Customers who require live monitoring of unrecorded calls, or who cannot accept these other restrictions, may want to consider Unified Communications Manager's Silent Monitoring capability instead.


Playback

Once a recording session has completed, it can be played back on a third-party streaming-media player or through the built-in media player in the Search and Play application. Playing it back through a third-party streaming-media player is similar to monitoring: an RTSP URI must first be obtained either through a query or an event.

Silence Suppression

While recording a call, it is possible to create one or more segments of silence within the recording (for example, by invoking the pauseRecording API). Upon playback, there are various ways to represent that silence. The requesting client uses a set of custom header parameters on the RTSP PLAY command to specify one of the following:

  • The RTP stream pauses for the full silent period, then continues with a subsequent packet whose mark bit is set and whose timestamp reflects the elapsed silent period.

  • The RTP stream does not pause. The timestamp reflects the fact that there was no pause, but the RTP packets contain "TIME" padding which includes the absolute UTC time at which the packet was recorded.

  • The RTP stream compresses the silent period to roughly half a second; in all other respects it acts exactly like bullet 1. This is the default behavior and is how the built-in media player works.

In all cases, the file duration returned by the RTSP DESCRIBE command reflects the original record time duration. It is the time the last packet ended minus the time the first packet began.

The session duration returned by the MediaSense API and session events may differ because these are based on SIP activity rather than on media streaming activity.

Commercial media players such as VLC and RealPlayer elicit the default behavior described in bullet 3. However, these players are designed to play music and podcasts, not media streams that include silence, so they may hang, disconnect, or fail to seek backward and forward in the stream.

Conversion and Download

Completed recording sessions can be converted on demand to .mp4 or .wav format by using an HTTP request. Files converted in this way carry two audio tracks not as a mixed stream, but as stereo. Alternatively, .mp4 files can also carry one audio and one video track.

After conversion, .mp4 and .wav files are stored for a period of time in MediaSense along with their raw counterparts and are accessible using their own URLs. (The files eventually get cleaned up automatically, but are recreated on demand the next time they are requested.) As with streaming, browser or server-based clients can get the URIs to these files by either querying the metadata or monitoring recording events. The URI is invoked by the client to play or download the file.

As with RTSP streaming, the client must provide HTTP-BASIC credentials and be prepared to handle a 302 redirect. In this way, conversion to .mp4 or .wav format provides a secure, convenient, and standards-compliant way to package and export recorded sessions.

However, large-scale conversion to .mp4 or .wav takes considerable processing power on the recording server and may impact performance and scalability. To meet the archiving needs of some organizations, as well as to serve speech analytics vendors who prefer to download recordings rather than stream them in real time, MediaSense offers a "low overhead" download capability.

This capability allows clients using specific URIs to download unmixed and unpackaged individual tracks in their raw g.722, g.711, or g.729 format. The transport is HTTP 1.1 chunked, which leaves it up to the client (and the developer's programming expertise) to reconstitute and package the media into whatever format best meets its requirements. As with the other retrieval methods, the client must provide HTTP-BASIC credentials and be prepared to handle a 302 redirect. Note that video streams and AAC-LD encoded audio streams cannot currently be downloaded in this way.
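To illustrate the "reconstitute and package" step, the sketch below builds a minimal WAV header for a raw G.711 µ-law track. It assumes 8 kHz mono µ-law audio (format tag 7); the actual sample format depends on the codec of the recorded track.

```python
import struct

def mulaw_wav_header(data_len: int, sample_rate: int = 8000) -> bytes:
    """Build a minimal WAV (RIFF) header for raw G.711 mu-law audio.

    Format tag 7 = ITU-T G.711 mu-law, mono, 8 bits per sample. A client
    that downloads a raw track over the HTTP 1.1 chunked interface can
    prepend this header to make the bytes playable as a .wav file.
    """
    byte_rate = sample_rate * 1  # mono, 1 byte per sample
    fmt = struct.pack("<4sIHHIIHH", b"fmt ", 18, 7, 1,
                      sample_rate, byte_rate, 1, 8)
    fmt += struct.pack("<H", 0)  # cbSize, required for non-PCM formats
    data = struct.pack("<4sI", b"data", data_len)
    riff = struct.pack("<4sI4s", b"RIFF",
                       4 + len(fmt) + len(data) + data_len, b"WAVE")
    return riff + fmt + data
```

Prepending this header to the downloaded µ-law bytes yields a file most desktop players accept; a g.729 track would need different handling.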

Embedded Search and Play Application

MediaSense provides a web-based tool used to search, download, and play back recordings. This Search and Play application is accessed using the API user credentials.

The tool searches both active and past recordings based on metadata characteristics such as time frame and participant extension. Recordings can also be selected using call identifiers such as the Cisco-GUID or the Unified Communications Manager call leg identifier. Once recordings are selected, they may be individually downloaded in .mp4 or .wav format or played using the application's built-in media player.

The Search and Play tool is built using the MediaSense REST-based API. Customers and partners interested in building similar custom applications can access this API through Cisco DevNet (formerly the Cisco Developer Network).

Support for the Search and Play application is limited to clusters with a maximum of 400,000 sessions in the database. Automatic pruning provides the capability to adjust the retention period, using the following formula, to ensure that this limit is respected:

Retention Setting in Days = 400,000 / (avg # agents * avg # calls per hour * avg # hours per day)

For example, if you have 100 agents taking 4 calls per hour, 8 hours per day every day, you can retain these sessions for 125 days before exceeding the 400,000 session limit. This is acceptable for most customers, but if you have 1000 agents taking 30 calls per hour, 24 hours per day every day, your retention period is about half a day. The Search and Play application cannot be used in this kind of environment.
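The formula can be captured directly in code; this small helper reproduces both of the examples above:

```python
def retention_days(agents, calls_per_hour, hours_per_day, session_limit=400_000):
    """Retention period in days that keeps the cluster at or under the
    Search and Play session limit (400,000 sessions by default)."""
    sessions_per_day = agents * calls_per_hour * hours_per_day
    return session_limit / sessions_per_day

print(retention_days(100, 4, 8))     # 125.0 days
print(retention_days(1000, 30, 24))  # ~0.56 days: too short for Search and Play
```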


Additional reasons for limiting the retention period are described in Scalability and Sizing.

Embedded Streaming Media Player

Telephone recording uses a different set of codecs than those typically used for music and podcasts. As a result, most off-the-shelf media players are not well suited to playing the kind of media that MediaSense records. This is why partner applications generally provide their own media players, and why MediaSense has the built-in Search and Play application.

The embedded player supports the g.729, g.711, and g.722 codecs, for both playback of recorded calls and monitoring of active calls. However, g.729 is not supported for Microsoft Windows-based 64-bit Java installations.

The embedded media player can be accessed through the Search and Play application, or it can be used by a third-party client application. Such an application can present a clickable link that loads the media player for the selected recording session into the user's browser. The link allows partners who do not have sophisticated user interface requirements to avoid the complexity of either developing their own media player or incorporating an off-the-shelf media player into their applications.

Uploaded Videos to Support ViQ, VoD and VoH Features

MediaSense supports the Cisco Contact Center Video in Queue, Video on Demand, and Video on Hold features by enabling administrators to upload .mp4 video files for playback on demand.

To use these features, users must perform the following steps:

  1. Produce an .mp4 video that meets the technical specifications outlined below.

  2. Upload the .mp4 video to the MediaSense Primary node. The video is automatically converted into a form that can be played back to a supported video endpoint and distributed to all other nodes. Playback is automatically load balanced across the cluster.

  3. Create an "incoming call handling rule" that maps a particular incoming dialed number to the uploaded video. You may also specify whether this video should be played once or repeated continuously.

Administrative user interfaces are provided for uploading the file to MediaSense and creating the incoming call-handling rule. These functions are not available through the MediaSense API.

An .mp4 file is a container that can contain many different content formats. MediaSense requires that the file content meet the following specifications:

  • The file must contain one audio track and one video track.
  • The video must be encoded using H.264.
  • The audio must be encoded using AAC-LC.
  • The audio must be monaural.
  • The entire .mp4 file size must not exceed 2GB.

The preceding information is known as the Studio Specification. It must be provided to any professional studio that is producing video content for this purpose. Most commonly available consumer video software products can also produce this format.
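A producer could pre-check a candidate file against the Studio Specification before upload. The sketch below validates properties that some media inspection tool has already extracted; the dict keys here are illustrative assumptions, and MediaSense performs its own validation on upload.

```python
def check_studio_spec(info: dict) -> list:
    """Check probed .mp4 properties against the Studio Specification.

    `info` is a hypothetical dict of values obtained from a media
    inspection tool. Returns a list of problems (empty = compliant).
    """
    problems = []
    if info.get("audio_tracks") != 1 or info.get("video_tracks") != 1:
        problems.append("file must contain exactly one audio and one video track")
    if info.get("video_codec") != "h264":
        problems.append("video must be encoded using H.264")
    if info.get("audio_codec") != "aac_lc":
        problems.append("audio must be encoded using AAC-LC")
    if info.get("audio_channels") != 1:
        problems.append("audio must be monaural")
    if info.get("file_size", 0) > 2 * 1024**3:
        problems.append("file size must not exceed 2 GB")
    return problems
```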


Video resolution and aspect ratio are not enforced by MediaSense. MediaSense plays back whatever resolution it finds in an uploaded file, so it is important to use a resolution that looks good on all the endpoints on which you expect the video to be played. Many endpoints are capable of up- or down-scaling videos as needed, but some (such as the Cisco 9971) are not. For the best compatibility with all supported endpoints, use standard VGA resolution (640x480).

Cisco endpoints do not support AAC-LC audio (which is the standard for .mp4), so MediaSense automatically converts the audio to AAC-LD, g.711 µ-law, and g.722 (note that g.711 A-law is not supported for ViQ/VoH). MediaSense automatically negotiates with the endpoint to determine which audio codec is most suitable. If MediaSense is asked to play an uploaded video to an endpoint that supports only audio, then only the audio track is played.

A Cisco IOS 3945E gateway with 4 GB RAM is required to run ViQ load (120 calls in a queue). The gateway will crash if RAM is less than 4 GB.

Video playback capability is supported on all supported MediaSense platforms, but there are varying capacity limits on some configurations. See Hardware Profiles for details.

MediaSense comes with a sample video preloaded and preconfigured for use directly out of the box. After successful installation or upgrade, dial the SIP URL sip:SampleVideo@<mediasense-hostname> from any supported endpoint or from Cisco Jabber Video to see the sample video.

Integration with Cisco Unity Connection for Video Voice-Mail

Beginning with Cisco Unity Connection release 10.0(1), configured subscribers have the option to record video greetings in addition to audio greetings. Subscribers who are configured to record video greetings and who are calling from a video-capable IP endpoint are presented with additional prompts to record their video greeting. These recordings (both the audio and video tracks) are stored and played back from MediaSense. A separate audio-only copy of the recording remains on Cisco Unity Connection as well.

If for any reason Cisco Unity Connection cannot play a video greeting from MediaSense, it reverts to its locally stored audio greeting.

This is an introductory implementation and has some limitations. More information about the Cisco Unity Connection integration, including its limitations and deployment and configuration instructions, can be found in the Unity Connection documentation.

Integration with Finesse and Unified CCX

MediaSense is integrated with Cisco Finesse and Unified Contact Center Express (Unified CCX). The integration is both at the desktop level and at the MediaSense API level.

At the desktop level, MediaSense's Search and Play application has been adapted to work as an OpenSocial gadget that can be placed on a Finesse supervisor's desktop. In this configuration, MediaSense can be configured to authenticate against Finesse rather than against Unified Communications Manager. Therefore, any Finesse user who has been assigned a supervisor role can search and play recordings from MediaSense directly from his or her Finesse desktop. (A special automatic sign-on has been implemented so that when the supervisor signs in to Finesse, he or she is also automatically signed into the MediaSense Search and Play application.) Other than this sign-in requirement, there are currently no constraints on access to recordings. Any Finesse supervisor has access to any and all recordings.

At the API level, Unified CCX subscribes for MediaSense recording events and matches the participant information it receives with the known agent extensions. It then immediately tags those recordings in MediaSense with the agentId, teamId, and if it was an ICD call, with the contact service queue identifier (CSQId) of the call. This subscription allows the supervisor, through the Search and Play application, to find recordings that are associated with particular agents, teams, or CSQs without having to know the agent extensions.

This integration uses BiB or NBR forking, selectively invoked through JTAPI by Unified CCX. Because Unified CCX is in charge of starting recordings, it is also in charge of managing and enforcing Unified CCX agent recording licenses. However, other network recording sources (such as unmanaged BiB forking phones or Unified Border Element dial peer forking sources) could still be configured to direct their media streams to the same MediaSense cluster, which could negatively impact Unified CCX's license counting.

For example, Unified CCX might think it has 84 recording licenses to allocate to agent phones as it sees fit, but it may find that MediaSense is unable to accept 84 simultaneous recordings because other recording sources are also using MediaSense resources. This management also applies to playback and download activities—any activity that impacts MediaSense capacity. If you are planning to allow MediaSense to record other calls besides those that are managed by Unified CCX, then it is very important to size your MediaSense servers accordingly.

More information about this integration, including deployment and configuration instructions, can be found in the Unified CCX documentation.

Integration with Unified Communications Manager for Video on Hold and Native Queuing

Starting with Unified Communications Manager Release 10.0, customers can configure a Video on Hold source for video callers, similar to a Music on Hold source that is used for audio callers. The same facility is used to provide pre-recorded video to callers who are waiting for a member of a hunt group to answer. This is known as "Unified Communications Manager native queuing."

MediaSense can be used as the video media server for both purposes. To use MediaSense in this way, administrators make use of the product's generic ability to assign incoming dialed numbers to various uploaded videos, which are then played back when an invitation arrives on those dialed numbers. Unified Communications Manager causes one of these videos to play by temporarily transferring the call to the corresponding dialed number on MediaSense.

See Uploaded Videos to Support ViQ, VoD and VoH Features for more information.

For instructions on configuring these features in Unified Communications Manager, see the relevant Unified Communications Manager documentation available at http://c/en/us/support/unified-communications/unified-communications-manager-callmanager/tsd-products-support-series-home.html.

Integration with Cisco Remote Expert

MediaSense integrates with the Cisco Remote Expert product in two areas:

  • It can act as a video media server for ViQ, VoH, and Video IVR.

  • It can record the audio portion of the video call, or the entire video call.

MediaSense's video media server capabilities satisfy Remote Expert's needs for ViQ, VoH, and Video IVR. See Uploaded Videos to Support ViQ, VoD and VoH Features for more information.

Calls that are to be recorded must be routed through a Unified Border Element device that is configured to fork its media streams to MediaSense (because most of the endpoints used for Remote Expert are not able to fork media themselves). All the codecs listed in Codecs Supported are supported. Consult the Compatibility Matrix to verify that your Unified Border Element is running a supported version of Cisco IOS that incorporates several bug fixes in this area.

Remote Expert provides its own user interface portal for finding and managing recordings, and for playing them back. For AAC-LD audio calls (most common when using EX-series endpoints), there are no known RTSP-based AAC-LD streaming media players, so those calls can only be converted to .mp4 and downloaded for playback. Live monitoring of such calls is not possible.

For more information about this integration, including deployment and configuration instructions, see the Remote Expert documentation.

Incoming Call Handling Rules

When MediaSense receives a call, it needs to know what action to take. MediaSense offers configuration options that determine the action it takes for each type of call. The following actions are available:

  • Record audio of the incoming calls

  • Record audio and video of the incoming calls

  • Play an outgoing media file once

  • Play an outgoing media file continuously

  • Reject incoming calls

If your application is to record calls forked by a Unified Border Element dial peer, then the dialed number in question is configured as the "destination-pattern" setting in the dial peer which points to MediaSense. If your application is to record calls forked by a Unified Communications Manager phone or NBR, then the dialed number in question is configured as the recording profile's route pattern.

Call Association

MediaSense generates multiple sessions for a recorded call when the call involves hold/resume or transfer, which makes it difficult for users (such as supervisors) to identify all of the recording sessions that belong to a single call. MediaSense 10.5 adds an Expand Call icon to the Search and Play application to view, play, and download all of the associated sessions of a call (both active and recent) in the Associated Sessions box. Currently, MediaSense groups only strongly associated calls, which have at least one common xRefCi value.

Note: MediaSense 10.5 supports call association for Built-in-Bridge recordings only.
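The strong-association rule can be modeled as connected components over shared xRefCi values. The sketch below groups session IDs accordingly; the field names follow the sample .json metadata shown in the Archiving section, while the union-find implementation is just one way to compute the grouping.

```python
from collections import defaultdict

def associate_sessions(sessions):
    """Group sessions into calls: two sessions are associated when they
    share at least one xRefCi value (directly or transitively), mirroring
    the strong-association rule used by the Expand Call feature."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Link each session node to the xRefCi nodes it contains.
    for s in sessions:
        sid = ("session", s["sessionId"])
        for t in s["tracks"]:
            for p in t["participants"]:
                union(sid, ("xrefci", p["xRefCi"]))

    groups = defaultdict(set)
    for s in sessions:
        groups[find(("session", s["sessionId"]))].add(s["sessionId"])
    return [sorted(g) for g in groups.values()]
```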


Archiving

Using MediaSense, you can archive audio recordings, video recordings, and video greeting recordings to an offline location. To archive recordings, specify the archive configuration settings on the MediaSense Archive Configuration window (Cisco MediaSense Administration > Administration > Archive Configuration). This lets you save recordings for a long duration and prevents them from being pruned automatically. For more information on archive configuration settings, see the MediaSense User Guide at http://c/en/us/support/customer-collaboration/mediasense/products-user-guide-list.html.

Figure 1. MediaSense Archive Configuration

Archival is performed in two steps:

  1. Converting the session to an .mp4 file

  2. Copying the .mp4 and metadata files to the specified SFTP server

Each recording session produces two files: an .mp4 file and a .json file. The MP4 rendition of the recording is named <SessionId>.mp4 and can contain audio, video, or a video greeting. The JSON rendition of the metadata stored in the MediaSense database is named <SessionId>.json. Because the .json file is plain text, standard text search tools can be used to search these files.


{
    "callControllerIP": "",
    "callControllerType": "Cisco-CUCM",
    "sessionDuration": 12319,
    "sessionId": "214ef2fbc3b41",
    "sessionStartDate": 1438595662982,
    "sessionState": "CLOSED_NORMAL",
    "tracks": [
        {
            "codec": "PCMA",
            "downloadUrl": "",
            "participants": [
                {
                    "deviceId": "SEPF0292958FA6D",
                    "deviceRef": "1013",
                    "isConference": false,
                    "participantDuration": 12319,
                    "participantStartDate": 1438595662982,
                    "xRefCi": "21528379"
                }
            ],
            "trackDuration": 12319,
            "trackMediaType": "AUDIO",
            "trackNumber": 1,
            "trackStartDate": 1438595662982
        },
        {
            "codec": "PCMA",
            "downloadUrl": "",
            "participants": [
                {
                    "deviceId": "SEP0021CCCEEE2F",
                    "deviceRef": "1438",
                    "isConference": false,
                    "participantDuration": 12319,
                    "participantStartDate": 1438595662982,
                    "xRefCi": "21528380"
                }
            ],
            "trackDuration": 12319,
            "trackMediaType": "AUDIO",
            "trackNumber": 0,
            "trackStartDate": 1438595662982
        }
    ],
    "urls": {
        "httpUrl": "",
        "mp4Url": "",
        "rtspUrl": "rtsp://",
        "wavUrl": ""
    }
}
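Because the metadata is plain JSON, extracting fields such as the per-participant xRefCi values is straightforward. The sketch below assumes a file laid out like the sample above; it is an illustration, not a MediaSense utility:

```python
import json

def xrefci_by_device(metadata_path):
    """Map each participant's deviceRef to its xRefCi, reading a
    <SessionId>.json metadata file laid out like the sample above."""
    with open(metadata_path, encoding="utf-8") as fh:
        meta = json.load(fh)
    result = {}
    for track in meta.get("tracks", []):
        for participant in track.get("participants", []):
            result[participant["deviceRef"]] = participant["xRefCi"]
    return result
```

For the sample session above, this returns {"1013": "21528379", "1438": "21528380"}.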

Archival Directory Structure

The MediaSense archive directory structure is based on the date of the recording (in <yyyymmdd> format). MediaSense creates a directory for each day and archives the recordings in chronological order.

Directory: /home/sftp/<hostname>/<yyyymmdd>

Contents: Directory 20150612 is created with two files for each recording.

-rw-rw-r-- 75860 Jun 12 03:56 314de63e83471.mp4

-rw-rw-r-- 1209 Jun 12 03:56 314de63e83471.json

File date and time show when the session was archived.

"314de63e83471" is the SessionId, a system-generated identifier for a session that is unique across all MediaSense servers.
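Given the layout above, locating a session's archived files is a matter of scanning the date directories for files named after the SessionId. This is an illustrative sketch, not a MediaSense tool; the archive root path is whatever you configured on the SFTP server:

```python
from pathlib import Path

def find_archived_session(archive_root, session_id):
    """Return the .mp4/.json paths for a SessionId under an archive root
    laid out as <root>/<yyyymmdd>/<SessionId>.{mp4,json}."""
    hits = {}
    for day_dir in sorted(Path(archive_root).iterdir()):
        # Only descend into yyyymmdd-style directories.
        if not (day_dir.is_dir() and day_dir.name.isdigit()
                and len(day_dir.name) == 8):
            continue
        for ext in ("mp4", "json"):
            candidate = day_dir / f"{session_id}.{ext}"
            if candidate.is_file():
                hits[ext] = candidate
    return hits
```

For the example above, looking up "314de63e83471" would return both files under the 20150612 directory.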

Unified Communications Manager Network-Based Recording

With Unified Communications Manager Network-Based Recording (NBR), you can use a gateway to record calls. NBR allows Unified Communications Manager to route recording calls regardless of device, location, or geography. With NBR, call recording media can be sourced from either the IP phone or from a gateway that is connected to Unified Communications Manager over a SIP trunk. Unified Communications Manager dynamically selects the right media source based on the call flow and call participants.

MediaSense supports Unified Communications Manager NBR for IP to IP media forking using a Unified Border Element.


Note: MediaSense does not support NBR for TDM-to-IP media forking or for calls treated by Unified CVP.

Unified Communications Manager NBR configuration provides fallback capability: a phone is configured with a preferred media source for call recording (either NBR or BiB), and Unified Communications Manager attempts to follow that preference but automatically falls back to the alternative if the preferred recording is not possible. This makes it simple to configure a phone to record both the caller's and the agent's perspective. With NBR-preferred recording, all calls are forked from the router using NBR; agent-to-agent consult calls, however, are recorded by the BiB. All of these call segments can be associated together because both NBR and BiB use the xRefCi style of recording session identification.

NBR is the recommended forking feature, which provides these benefits:

  • NBR offers both network-based Unified Border Element recording and simple BiB forking.
  • NBR falls back to BiB automatically when the Integrated Services Routers (ISRs) are unavailable; no separate recording configuration is required. This is useful when customers want to include agent-to-agent consult calls in their recording policies: because Unified Border Element cannot record consult calls, BiB would otherwise need to be enabled separately.
  • Both NBR and BiB calls can be correlated using xRefCi, which is available from Unified Communications Manager JTAPI; CISCO-GUID is not needed, which means neither CTI Server nor CTIOS connections are required.
  • Because there is a single correlation identifier, correlation across components is stronger and can be done in a uniform way independent of the call flow.
  • Using NBR, TDM gateway recording is automatically used without splitting the capacity of the router.

    Note: TDM gateway recording is not supported with MediaSense 10.5(1).
  • Using NBR, directly-dialed as well as dialer-initiated outbound calls can be correlated with their appearance in other solution components.
Table 1 Differences Between NBR, BiB, and Unified Border Element Dial Peer Forking

Media Forking

  • NBR Forking: Sends the media streams from an ISR to MediaSense.
  • BiB Forking: Sends the media streams directly from the phone to MediaSense (significant for network bandwidth requirements).
  • Unified Border Element Dial Peer Forking: Sends the media streams from an ISR to MediaSense.

SIP Signaling

  • NBR Forking: Unified Communications Manager to MediaSense.
  • BiB Forking: Unified Communications Manager to MediaSense.
  • Unified Border Element Dial Peer Forking: ISR to MediaSense.

Media Types

  • NBR Forking: Forks audio media only.
  • BiB Forking: Forks audio media only.
  • Unified Border Element Dial Peer Forking: Forks both audio and video media (video forking requires MediaSense 10.5).

Record IVR Interaction

  • NBR Forking: Records calls that reach a Unified Communications Manager phone.
  • BiB Forking: Records calls that reach a Unified Communications Manager phone.
  • Unified Border Element Dial Peer Forking: Records calls, including the IVR interaction, even if the call never reaches a phone.

Recording Perspective

  • NBR Forking: Records calls from the caller's perspective.
  • BiB Forking: Records calls from the forking phone's perspective.
  • Unified Border Element Dial Peer Forking: Records calls from the caller's perspective.

Recording a Call as a Single Session or Multiple Sessions (Hold/Resume, Transfer)

  • NBR Forking: New sessions are triggered on hold/resume and on transfer. Beginning with Unified Communications Manager 10.0, a new session is also triggered when the call is transferred away from the far-end phone (the phone that is not forking the media).
  • BiB Forking: Same behavior as NBR forking.
  • Unified Border Element Dial Peer Forking: The entire call is always recorded as a single session, except when the codec changes during the life of the call.

Call Correlation

  • NBR Forking: Look in MediaSense for sessions whose xRefCi value matches the Unified Communications Manager Call ID for the various segments of the call. This value is available through Unified Communications Manager JTAPI and Unified Communications Manager CDR records.
  • BiB Forking: Same as NBR forking.
  • Unified Border Element Dial Peer Forking: Look in MediaSense for sessions whose CCID value matches the Cisco-GUID of the call.
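The two correlation rules above can be sketched as simple lookups over session metadata. The record shape, and in particular the `ccId` field name used here for the dial-peer CCID, is a hypothetical simplification rather than the actual MediaSense API schema:

```python
def sessions_for_cucm_call(sessions, call_ids):
    """NBR/BiB correlation: match sessions whose participants carry an
    xRefCi equal to a Unified CM Call ID for one of the call's segments.
    `sessions` is a list of dicts shaped like the archived JSON metadata."""
    call_ids = set(call_ids)
    return [s["sessionId"] for s in sessions
            if any(p.get("xRefCi") in call_ids
                   for t in s.get("tracks", [])
                   for p in t.get("participants", []))]

def sessions_for_cube_call(sessions, cisco_guid):
    """Dial-peer correlation: match sessions whose CCID equals the call's
    Cisco-GUID. The 'ccId' key is an assumed field name for illustration."""
    return [s["sessionId"] for s in sessions if s.get("ccId") == cisco_guid]
```

In practice the call IDs come from Unified Communications Manager JTAPI or CDR records, and the Cisco-GUID from the Unified Border Element side of the call.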

NBR does not resolve the following issues:

  • Unwanted duplicate recordings in agent-agent consult calls when both agents have recording enabled.
  • A call still splits into multiple recording sessions; the number of sessions per call increases beginning with Unified Communications Manager 10.0.

NBR is not recommended in these cases:

  • Recording the IVR activity (requires Unified Border Element Dial Peer forking)
  • Recording the forked video (requires Unified Border Element Dial Peer forking)
  • Customer does not want to upgrade to Unified Communications Manager 10.0
  • Customer uses some other call controller instead of Unified Communications Manager