Cisco MediaSense is
a SIP-based, network-level service that provides voice and video media
recording capabilities for other network devices. Fully integrated into Cisco's
Unified Communications architecture, MediaSense automatically captures and
stores every Voice over IP (VoIP) conversation that passes through
appropriately configured Unified Communications Manager IP phones or Cisco
Unified Border Element devices. In addition, an IP phone user or SIP endpoint
device can call the MediaSense system directly in order to leave a recording
consisting of media generated only by that user. These recordings can include
video as well as audio, which offers a simple method for recording
video blogs and podcasts.
Recording is accomplished by
media forking, in which the phone or Unified Border
Element sends a copy of the incoming and outgoing media streams to the
MediaSense recording server. When a call originates or terminates at a
recording-enabled phone, Unified Communications Manager sends a pair of SIP
invitations to both the phone and the recording server. The recording server
prepares to receive a pair of real-time transport protocol (RTP) streams from
the phone. Similarly, when a call passes through a recording-enabled Unified
Border Element, the Unified Border Element device sends a SIP invitation to the
recording server and the recording server prepares to receive a pair of RTP
streams from the Unified Border Element. Finally, under Network-Based
Recording (NBR), Unified Communications Manager sends a pair of SIP
invitations to the recording server and a special message to the Unified
Border Element, which then sends a pair of RTP streams to the recording server.
This procedure has several consequences:
Each recording session
consists of two media streams (one for media flowing in each direction). These
two streams are captured separately on the recorder, though both streams (or
tracks) end up on the same MediaSense recording server.
Most Cisco IP phones
support media forking. The IP phones that do not support media forking cannot
be used for phone-based recording.
Though the phones can fork
copies of media, they cannot transcode. This means that whatever codec is
negotiated by the phone during its initial call setup is the codec used in
recording. MediaSense supports a limited set of codecs; if the phone negotiates
a codec that is not supported by MediaSense, the call will not be recorded. The
same is true for Unified Border Element recordings.
The recording streams are
set up only after the phone's primary conversation is fully established, which
could take some time to complete. Therefore, there is a possibility of clipping
at the beginning of each call. Clipping is typically limited to less than two
seconds, but it can be affected by overall Unified Border Element, Unified
Communications Manager, and MediaSense load, as well as by network performance
characteristics along the signaling link between Unified Border Element or
Unified Communications Manager and MediaSense. MediaSense carefully monitors
this latency and raises alarms if it exceeds certain thresholds.
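The two-track and codec consequences above can be sketched in a few lines of Python. This is an illustrative model only, not MediaSense code: the names SUPPORTED_CODECS, Track, RecordingSession, and start_recording are hypothetical, and the codec labels are placeholders rather than the product's actual supported list.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional

# Assumption: placeholder labels only. Check the MediaSense documentation
# for the codecs that your release actually supports.
SUPPORTED_CODECS: FrozenSet[str] = frozenset({"CODEC_A", "CODEC_B"})


@dataclass
class Track:
    direction: str  # "inbound" or "outbound", relative to the forking device
    codec: str


@dataclass
class RecordingSession:
    forked_by: str  # "phone" or "border_element"
    tracks: List[Track] = field(default_factory=list)


def start_recording(
    forked_by: str,
    negotiated_codec: str,
    supported: FrozenSet[str] = SUPPORTED_CODECS,
) -> Optional[RecordingSession]:
    """Return a two-track session, or None when the codec is unsupported.

    The forking device cannot transcode, so the codec negotiated for the
    primary call is the only one the recorder will ever see.
    """
    if negotiated_codec not in supported:
        return None  # the call is not recorded
    return RecordingSession(
        forked_by=forked_by,
        tracks=[
            Track("inbound", negotiated_codec),
            Track("outbound", negotiated_codec),
        ],
    )
```

In this sketch a successful session always holds exactly two tracks, one per direction, which mirrors the way the two streams are captured separately but stored on the same recording server.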
In addition to its primary media recording functionality, MediaSense
offers two other capabilities.
Can play back specific video media files on demand on video phones
or supported players.
This capability supports Video in Queue (ViQ), Video on Demand
(VoD), or Video on Hold (VoH) use cases in which a separate call controller
invites MediaSense into an existing video call in order to play a previously
designated recording. An administrator can upload studio-recorded videos in MP4
format and then configure individual incoming dialed numbers to automatically
play those uploaded videos. The call controller plays the video by sending a
SIP invitation to MediaSense at the dialed number.
Can integrate with Cisco Unity Connection to provide video voice-mail
greetings.
Videos are recorded on MediaSense directly by Unity Connection
subscribers and are then played back to their video-capable callers before they
leave their messages.
Because forked media
can be recorded from either a Cisco IP phone or a Unified Border Element
device, MediaSense allows you to record a conversation from different
perspectives. Recordings forked by an IP phone are treated from the perspective
of the phone itself—any media flowing to or from that phone gets recorded. If
the call gets transferred to another phone, however, the remainder of the
conversation does not get recorded (unless the target phone has recording
enabled as well). This perspective may work well for contact center supervisors
whose focus is on a particular agent.
Recordings forked by
Unified Border Element are treated from the perspective of the caller. All
media flowing to or from the caller gets recorded, no matter how many times the
call gets transferred inside the enterprise. Even interactions between the
caller and an Interactive Voice Response (IVR) system where no actual phone is
involved are recorded. The only part of the call that is not recorded is a
consult call from one IP phone to another, for example, as part of a consult
transfer. (That can be recorded if Unified Communications Manager is configured
to route IP phone to IP phone calls through a Unified Border Element.) This
perspective works well for dispute resolution or regulatory compliance purposes
where the focus is on the caller.
With Cisco Unified
Communications Manager 10.0 and later, a third option is also available:
Unified Communications Manager Network-Based Recording (NBR). With this option,
the media is forked from a Unified Border Element, and the forking is managed
by Unified Communications Manager. Unified Communications Manager also offers a
fallback feature, by which it falls back to IP phone forking
if Unified Border Element forking is unavailable. By pairing these
features, both the caller's and the agent's perspectives can be captured with
the exception of any part of the call that precedes delivery to a Unified
Communications Manager phone.
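The difference between the phone and Unified Border Element perspectives can be illustrated with a small model. The sketch below is a simplification under stated assumptions: a call is represented as a list of hypothetical Segment objects, and the via_border_element flag stands in for whether that leg's media actually traverses a Unified Border Element. None of these names come from MediaSense itself.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Segment:
    label: str
    parties: FrozenSet[str]   # endpoints exchanging media on this leg
    via_border_element: bool  # does the media pass through the border element?


def recorded_by_phone(segments: List[Segment], phone: str) -> List[str]:
    """Phone forking: any media flowing to or from that phone is recorded."""
    return [s.label for s in segments if phone in s.parties]


def recorded_by_border_element(segments: List[Segment]) -> List[str]:
    """Border element forking: every leg that traverses it is recorded."""
    return [s.label for s in segments if s.via_border_element]


# A caller reaches an IVR, then agent A, who consults agent B and transfers.
call = [
    Segment("caller-ivr", frozenset({"caller", "ivr"}), True),
    Segment("caller-agentA", frozenset({"caller", "agentA"}), True),
    Segment("agentA-agentB-consult", frozenset({"agentA", "agentB"}), False),
    Segment("caller-agentB", frozenset({"caller", "agentB"}), True),
]

phone_view = recorded_by_phone(call, "agentA")
caller_view = recorded_by_border_element(call)
```

Here phone_view holds the two legs that involve agent A's phone, while caller_view holds every leg except the internal consult call, which matches the agent-focused and caller-focused perspectives described above.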
Ways to Access Recordings
No matter how they
are captured, recordings can be accessed in several ways. While a recording is
still in progress, it can be streamed live (monitored) through a computer that
is equipped with a media player such as VLC or RealPlayer, or one provided by a
partner or third party. Once completed, recordings may be played back in the
same way, or downloaded in raw form by using HTTP. They also may be converted
into .mp4 or .wav files and downloaded in that format. All access to
recordings, either in progress or completed, is through URIs. MediaSense also
offers a web-based Search and Play application with a built-in media player.
The application allows authorized users to select individual calls to monitor,
play back, or download directly from a supported web browser.
Media recordings occupy a large amount of disk space, so space
management is a significant concern. MediaSense offers two modes of operation
for space management: retention priority and recording priority. These modes
address two opposing and incompatible use cases: one where all recording
sessions must be retained until explicitly deleted (even if it means new
recording sessions cannot be captured) and one where older recording sessions
can be deleted if necessary to make room for new ones. A sophisticated set of
events and APIs is provided for client software to automatically control and
manage disk space.
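The contrast between the two modes can be sketched as a single admission function. This is a deliberately simplified model, not the product's pruning algorithm: the function name admit_recording, the size-based capacity check, and the mode strings "retention" and "recording" are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List, Tuple

Stored = Deque[Tuple[str, int]]  # (session_id, size_mb), oldest first


def admit_recording(
    stored: Stored,
    capacity_mb: int,
    new_id: str,
    new_size_mb: int,
    mode: str,
) -> Tuple[bool, List[str]]:
    """Try to store a new session; return (accepted, pruned_session_ids).

    mode == "retention": never delete existing sessions; reject if full.
    mode == "recording": delete the oldest sessions until the new one fits.
    """
    if mode not in ("retention", "recording"):
        raise ValueError(f"unknown mode: {mode!r}")
    if new_size_mb > capacity_mb:
        return False, []  # can never fit, so do not prune anything

    used = sum(size for _, size in stored)
    pruned: List[str] = []
    if mode == "recording":
        while stored and used + new_size_mb > capacity_mb:
            old_id, old_size = stored.popleft()
            used -= old_size
            pruned.append(old_id)

    if used + new_size_mb > capacity_mb:
        return False, pruned  # retention priority: new session is not captured
    stored.append((new_id, new_size_mb))
    return True, pruned
```

With 80 MB already stored out of 100 MB, a 30 MB session is rejected under "retention" and accepted under "recording" at the cost of the oldest stored session.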
MediaSense maintains a metadata database that stores information about all
recordings. A comprehensive Web 2.0 API is provided that allows client
equipment to query and search the metadata in various ways, to control
recordings that are in progress, to stream or download recordings, to
bulk-delete recordings that meet certain criteria, and to apply custom tags to
individual recording sessions. A Symmetric Web Services (SWS) eventing
capability enables server-based clients to be notified when recordings start
and stop, when disk space usage exceeds thresholds, and when meta-information
about individual recording sessions is updated. Clients can use these events to
keep track of system activities and to trigger their own actions.
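As a rough illustration of how a client might react to such events, the sketch below routes incoming events to registered handlers. It does not use the real SWS event schema: the EventRouter class, the dictionary layout, and the event type strings "session_started" and "session_stopped" are hypothetical stand-ins.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List, Set

Event = Dict[str, Any]
Handler = Callable[[Event], None]


class EventRouter:
    """Dispatch events to the handlers registered for their type."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def dispatch(self, event: Event) -> int:
        """Run every handler for the event's type; return how many ran."""
        handlers = self._handlers.get(event["type"], [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# Example: keep a live set of sessions that are currently being recorded.
active_sessions: Set[str] = set()
router = EventRouter()
router.on("session_started", lambda e: active_sessions.add(e["session_id"]))
router.on("session_stopped", lambda e: active_sessions.discard(e["session_id"]))
```

A client built along these lines could add further handlers, for example one that starts a cleanup job when a disk-usage threshold event arrives.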
Basic Use Cases
The MediaSense capabilities described above target four basic use cases:
Recording of conversations for regulatory compliance purposes.
Capturing or forwarding media for transcription and speech analytics.
Capturing of individual recordings for podcasting and blogging
purposes (video blogging).
Playing back previously uploaded videos for ViQ, VoD, VoH, or
video voice-mail greeting purposes.
Compliance recording may be required in any enterprise, but is of
particular value in contact centers where all conversations conducted on
designated agent phones or all calls from customers must be captured and
retained, and where supervisors need an easy way to find, monitor, and play
conversations for auditing, training, or dispute resolution purposes. Speech
analytics engines are served by the fact that MediaSense maintains the two
sides of a conversation as separate tracks and provides access to each track
individually, which simplifies the analytics engine's need to identify who is