Cisco® Live is Cisco's annual premier education and training event for IT, networking, and communications professionals. Cisco Live 2010, Barcelona was held at the Centre Convencions Internacional Barcelona from January 26 to January 28 with a total attendance of more than 3,000 people.
Unlike a traditional corporate network, the Cisco Live network is built up and fully operational in a matter of days. The Cisco Live network offers many advanced services including Cisco TelePresence™, video surveillance, IPv6, and voice over IP. The network provides connectivity for the various Cisco and partner technology demonstrations, labs, streaming video of the technical sessions, and wireless access for all the attendees. With so many activities dependent on the network, the availability, reliability, and performance of the network are crucial to the success of the event. Hence, from a network operations perspective, visibility into network usage and performance is critical.
A Cisco® Network Analysis Module (NAM) SVC-NAM-2 service module running version 4.1 software was installed in the core Cisco Catalyst® 6500 Series Switch to deliver granular traffic analysis, rich application performance measurements, comprehensive voice quality monitoring, and deep packet captures to help monitor and troubleshoot network performance.
Cisco NAM Setup
The Cisco Live network had 42 VLANs configured. There were different VLANs for users, partners, demonstrations, labs, voice, wireless, management, and so on. Traffic from all 42 VLANs, totaling about 300 Mbps, was directed to the NAM for analysis using Switched Port Analyzer (SPAN). The SPAN session was set up through the integrated data source configuration menu available in the NAM web-based graphical user interface (GUI). Various monitoring capabilities, such as core monitoring, voice and RTP stream monitoring, response time monitoring, Differentiated Services (DiffServ) monitoring, URL monitoring, and chassis parameters (switch health and port statistics) monitoring, were enabled on the NAM. All this setup took less than 10 minutes and made the NAM ready to begin monitoring the network.
Traffic Analysis with Cisco NAM
The NAM overview screen provided a real-time view into who was using the network, which applications they were using, and how much of the network's resources were being consumed. An initial look at the NAM traffic overview screen indicated that RTP, HTTP, and RTSP traffic was consuming the most bandwidth in the network core. Additionally, the most active hosts in the network were identified as belonging to the 10.31.x.x and 10.32.x.x subnets, which included the servers hosting the Cisco Live content (Figure 1).
Figure 1. NAM Traffic Overview
Apart from looking at the traffic mix and top talkers in real time, predefined top-N historical reports revealed the network usage pattern through the course of the event. A peak usage of about 250 Mbps was observed (Figure 2).
Figure 2. Top-N Applications Over Time
To get a deeper look at the HTTP traffic, Monitor > Apps > URLs was selected and the URLs were sorted by maximum hits. As expected, the Cisco Live content and registration servers had the most hits because people were checking into the event, searching for sessions, and viewing online content (Figure 3). To track this usage more accurately, URL-based applications were created for the Cisco Live and Cisco Live Registration URLs (Figure 4). The next most popular websites were Facebook and the BBC, the latter for football scores. A URL-based application was created for Facebook as well to track its bandwidth utilization.
Figure 3. Cisco NAM URL Monitoring
Figure 4. Cisco NAM URL-Based Applications
To gain visibility into traffic volume per VLAN, Monitor > VLAN was selected and the VLANs were sorted by bits/s. VLANs 23 and 34 were the most heavily used VLANs (Figure 5). VLAN 34 was the Cisco Live registration VLAN and VLAN 23 was a demonstration VLAN. To understand the traffic mix for VLAN 23, it was added as a separate data source. The traffic mix revealed that most of the traffic in VLAN 23 was RTP (Figure 6). Looking at the details, the hosts originating the RTP traffic were identified as the video servers streaming in high definition (HD) mode. Thresholds were set in the NAM to alert the network operations center for the event when RTP traffic consumed more than 100 Mbps of bandwidth, in which case the operators could ask the demonstrators to reduce streaming resolution. In the event of a threshold violation, the NAM was set up to generate syslog alerts and Simple Network Management Protocol (SNMP) traps to notify CiscoWorks LAN Management Solution (LMS), which was acting as the centralized fault management system (Figure 7).
Figure 5. VLAN Monitoring
Figure 6. Traffic Analysis per VLAN
Figure 7. NAM Alert Integration with CiscoWorks LMS
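The threshold behavior described above can be illustrated with a minimal Python sketch. It is not the NAM's implementation: the `RateSample` structure, the function name, and the sample rates are hypothetical, and the sketch assumes a simple rising threshold that fires once per upward crossing of the 100 Mbps limit rather than once per sample.

```python
from dataclasses import dataclass

RTP_THRESHOLD_BPS = 100_000_000  # 100 Mbps, the limit mentioned in the text


@dataclass
class RateSample:
    """One traffic-rate observation for an application (hypothetical structure)."""
    application: str
    bits_per_second: float


def rising_threshold_alerts(samples, application, threshold_bps):
    """Return the indices of samples where the rate crosses above the threshold.

    An alert fires only on the upward crossing, so a sustained overload
    produces one alert instead of one per sample.
    """
    alerts = []
    above = False
    for index, sample in enumerate(samples):
        if sample.application != application:
            continue  # ignore other applications entirely
        is_above = sample.bits_per_second > threshold_bps
        if is_above and not above:
            alerts.append(index)
        above = is_above
    return alerts


# Made-up rates for illustration only.
samples = [
    RateSample("rtp", 80e6),
    RateSample("rtp", 120e6),   # first crossing
    RateSample("rtp", 130e6),   # still above, no new alert
    RateSample("http", 150e6),  # different application, ignored
    RateSample("rtp", 90e6),    # back below the threshold
    RateSample("rtp", 101e6),   # second crossing
]
alerts = rising_threshold_alerts(samples, "rtp", RTP_THRESHOLD_BPS)
print(alerts)
```

In a real deployment each alert index would correspond to a syslog message or SNMP trap sent to the fault management system; the sketch only identifies when such a notification would be raised.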
IPv6 NetFlow Monitoring
Although the core of the network at Cisco Live Barcelona ran IPv4, part of the network used IPv6 for demonstrating specific functionality. The remote NetFlow monitoring capability of the NAM was used to gain insight into the IPv6 traffic. The remote router was configured to export NetFlow version 9 data to the NAM, so that the NAM could monitor the IPv6 traffic flows (Figure 8).
Figure 8. NAM Monitoring IPv6 NetFlow Version 9 Traffic Records
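For readers unfamiliar with the export format, the following Python sketch shows how a collector might read the fixed 20-byte packet header that begins every NetFlow version 9 export datagram (version, record count, system uptime, UNIX seconds, sequence number, and source ID, as laid out in RFC 3954). It is only an illustration: the function and class names are hypothetical, the sample datagram is fabricated for the example, and the template and data FlowSets that carry the IPv6 fields are not decoded here.

```python
import struct
from typing import NamedTuple

# Network byte order: two 16-bit fields followed by four 32-bit fields.
HEADER_FORMAT = "!HHIIII"
HEADER_LENGTH = struct.calcsize(HEADER_FORMAT)  # 20 bytes


class NetFlowV9Header(NamedTuple):
    version: int        # always 9 for NetFlow version 9
    count: int          # number of records in this datagram
    sys_uptime_ms: int  # milliseconds since the exporter booted
    unix_secs: int      # export time, seconds since the UNIX epoch
    sequence: int       # export packet sequence counter
    source_id: int      # identifies the exporter's observation domain


def parse_header(datagram: bytes) -> NetFlowV9Header:
    """Decode the packet header of a NetFlow v9 export datagram."""
    if len(datagram) < HEADER_LENGTH:
        raise ValueError("datagram shorter than a NetFlow v9 header")
    header = NetFlowV9Header(*struct.unpack(HEADER_FORMAT, datagram[:HEADER_LENGTH]))
    if header.version != 9:
        raise ValueError(f"unexpected NetFlow version {header.version}")
    return header


# A fabricated header, packed only to exercise the parser.
sample = struct.pack(HEADER_FORMAT, 9, 2, 123456, 1264500000, 42, 1)
header = parse_header(sample)
print(header)
```

Because version 9 is template based, a collector must also cache the templates sent by the exporter before it can interpret the data records that follow this header.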
Application Response Time Monitoring
Cisco NAM can look at TCP client/server messages and determine more than 40 transaction-based statistics, such as application server delay, network delay, transaction time, retransmission delay, and so on, that provide valuable information for monitoring the performance of TCP-based applications. Through traffic analysis, HTTP had been identified as the most heavily used TCP-based application. Through URL monitoring, the Cisco Live content hosting servers were identified as receiving the most hits. A look at Monitor > Response Time, sorted by number of clients, further verified this information (Figure 9).
Figure 9. Server Response Time Monitoring
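To make the kinds of statistics named above more concrete, the sketch below derives a few of them from packet timestamps observed at a monitoring point near the server. The definitions are common passive-monitoring approximations chosen for illustration and are assumptions, not the NAM's documented formulas; the `TransactionTimestamps` structure, the function name, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TransactionTimestamps:
    """Packet arrival times at the monitoring point, in microseconds (hypothetical)."""
    syn: int             # client SYN seen
    syn_ack: int         # server SYN-ACK seen
    ack: int             # client ACK completing the handshake seen
    request: int         # client request seen
    first_response: int  # first server response packet seen
    last_response: int   # last server response packet seen


def response_time_metrics(t: TransactionTimestamps) -> dict:
    """Estimate delay metrics for one TCP transaction (illustrative definitions)."""
    # Round trip between the monitor and the server, taken from the handshake.
    server_network_delay = t.syn_ack - t.syn
    # Round trip between the monitor and the client, taken from the handshake.
    client_network_delay = t.ack - t.syn_ack
    network_delay = server_network_delay + client_network_delay
    # Server think time: response latency with the monitor-to-server round trip removed.
    application_delay = max(0, (t.first_response - t.request) - server_network_delay)
    # From the request until the last response packet.
    transaction_time = t.last_response - t.request
    return {
        "server_network_delay_us": server_network_delay,
        "client_network_delay_us": client_network_delay,
        "network_delay_us": network_delay,
        "application_delay_us": application_delay,
        "transaction_time_us": transaction_time,
    }


# Made-up timestamps for a single transaction.
example = TransactionTimestamps(
    syn=0, syn_ack=2_000, ack=10_000,
    request=10_500, first_response=42_500, last_response=60_500,
)
metrics = response_time_metrics(example)
print(metrics)
```

Separating network delay from application delay in this way is what lets an operator decide whether a slow transaction should be escalated to the network team or to the server owners.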
Since the server 10.32.128.14 had the largest number of clients, it required more careful monitoring to observe response time trends and catch any performance issues before they started affecting user experience. Historical trending reports for average application delay, average client network delay, and average server network delay were created (Figure 10).
Figure 10. Server Response Time Trending Report
As the trending report shows, toward the end of January 25 a network issue significantly affected the response time of the server. The response time, which had been averaging around 30 msec, shot up to 160 msec. The timing of this spike corresponded to a power outage on location. A detailed look at the various transaction-based statistics for this server indicated significant packet loss in the network, based on the bytes retransmitted metric (Figure 11).
Figure 11. Detailed Server Response Time Metrics
To monitor the server response time more proactively, thresholds were set and alerts were sent to CiscoWorks LMS.
Voice Quality Monitoring
Cisco NAM provides visibility into the quality of voice calls based on voice signaling protocols as well as RTP stream monitoring. The metrics are calculated every 3 seconds and averaged over a minute for reporting. The metrics include Mean Opinion Score (MOS), packet loss, jitter, seconds of severe concealment, and so on. These metrics are also exported to Cisco Unified Service Monitor for integration into the Cisco Unified Communications Management Suite.
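The reporting cadence described above (a value every 3 seconds, averaged over a minute) amounts to averaging groups of 20 consecutive samples. The following Python sketch illustrates that arithmetic only; the function name and the MOS values are hypothetical, and dropping a trailing partial minute is an assumption made for the example rather than documented NAM behavior.

```python
from statistics import mean

INTERVAL_SECONDS = 3   # one metric value every 3 seconds
REPORT_SECONDS = 60    # values are averaged over one minute
SAMPLES_PER_REPORT = REPORT_SECONDS // INTERVAL_SECONDS  # 20 samples per minute


def per_minute_averages(samples):
    """Average consecutive groups of 20 three-second samples.

    A trailing partial minute is dropped (an assumption for this sketch).
    """
    full = len(samples) - len(samples) % SAMPLES_PER_REPORT
    return [
        mean(samples[start:start + SAMPLES_PER_REPORT])
        for start in range(0, full, SAMPLES_PER_REPORT)
    ]


# Made-up MOS samples: one steady minute, one minute that degrades halfway
# through, and 15 seconds of an incomplete third minute.
mos_samples = [4.0] * 20 + [4.0] * 10 + [2.0] * 10 + [3.0] * 5
minute_averages = per_minute_averages(mos_samples)
print(minute_averages)
```

Averaging smooths out brief dips, which is why a per-minute MOS can look acceptable even when a few 3-second intervals within it were poor; metrics such as seconds of severe concealment help expose those short events.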
At Cisco Live Barcelona, a number of IP phones were set up in the lobby to help attendees stay connected. The Cisco NAM monitored the voice VLANs and provided real-time visibility into the quality of voice calls (Figure 12).
Figure 12. Voice Quality Monitoring for Active Voice over IP Calls
The NAM also enables a more detailed look at the worst phone calls, including the various metrics and the start and end times of calls, to help troubleshoot voice quality issues (Figure 13). The report also provides visibility into Skype calls.
Figure 13. Worst N Phone Calls
The Cisco NAM phones report keeps track of the phones in the network and provides visibility into the last N phone calls made from each phone to provide insight into issues with specific equipment (Figure 14).
Figure 14. Phones Report
The call quality distribution report provides visibility into the overall call quality in the network. As seen in Figure 15, about 64 percent of the calls were of excellent quality and 34 percent were of good quality.
Figure 15. Call Quality Distribution Report
Voice quality alerting was provided by Cisco Unified Operations Manager based on the data feed from the NAM (Figure 16). To further troubleshoot RTP stream issues, navigation back into the NAM from Cisco Unified Operations Manager was set up (Figure 17).
Figure 16. Call Quality Alert in Cisco Unified Operations Manager
Figure 17. RTP Stream Monitoring for Voice and Video Streams
Cisco NAM can examine the DiffServ and type of service (TOS) bits within IP packets and classify the packets based on DiffServ profiles. Each category can be examined for traffic volume and applications and hosts sending traffic with specific markings, which helps in verifying quality of service (QoS) planning assumptions.
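As background for this classification, the DiffServ code point (DSCP) occupies the upper six bits of the IP TOS byte, and the lower two bits carry Explicit Congestion Notification (ECN). The Python sketch below extracts the DSCP and maps it to a few standard per-hop behavior names. The name table is deliberately partial, and the function names are hypothetical; nothing here reflects how the NAM stores its DiffServ profiles.

```python
# Standard DSCP values for a few well-known per-hop behaviors (partial table).
DSCP_NAMES = {
    0: "BE",     # best effort (default)
    24: "CS3",   # class selector 3, often used for call signaling
    26: "AF31",  # assured forwarding class 3, low drop precedence
    46: "EF",    # expedited forwarding, often used for voice media
}


def dscp_from_tos(tos: int) -> int:
    """Return the 6-bit DSCP carried in the upper bits of the TOS byte."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS byte must be in the range 0..255")
    return tos >> 2


def ecn_from_tos(tos: int) -> int:
    """Return the 2-bit ECN field carried in the lower bits of the TOS byte."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS byte must be in the range 0..255")
    return tos & 0b11


def classify(tos: int) -> str:
    """Map a TOS byte to a per-hop behavior name, or a generic DSCP label."""
    dscp = dscp_from_tos(tos)
    return DSCP_NAMES.get(dscp, f"DSCP {dscp}")


# Example TOS bytes as they would appear in an IP header.
labels = [classify(tos) for tos in (0xB8, 0x60, 0x68, 0x00, 0x20)]
print(labels)
```

Counting packets and bytes per label, and then breaking each label down by application and host, gives the kind of per-category view that can be compared against QoS planning assumptions.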
At Cisco Live Barcelona, a DiffServ profile was created for voice RTP and voice signaling as well as for the Cisco Live registration server (Figure 18). However, as the figure shows, most of the traffic in this network was best effort, which worked well because bandwidth was abundant.
Figure 18. DiffServ Profile
At Cisco Live Barcelona, since the NAM was placed in the core Catalyst 6500 Series Switch, the NAM was able to provide visibility into the health of the switch including CPU and memory utilization (Figure 19), as well as port and error statistics.
Figure 19. Switch Health Monitoring
Managing the Cisco NAM with CiscoWorks LMS
CiscoWorks LMS was set up to manage all the devices at Cisco Live Barcelona, including the Cisco NAM. CiscoWorks LMS managed the inventory and configuration of the NAM, consolidated the syslogs and alerts received from the NAM (Figure 20), and provided visibility into the NAM through the centralized portal (Figure 21) by using NAM's web publishing feature.
Figure 20. CiscoWorks LMS Managing Cisco NAM
Figure 21. Cisco NAM Portal in CiscoWorks LMS
Cisco NAM provided real-time monitoring for the network at Cisco Live 2010, Barcelona. Cisco NAM helped ensure exceptional network performance by providing visibility into all data, voice, and video traffic, as well as into key performance indicators. NAM's click-of-a-button troubleshooting capabilities provided the necessary tools to improve Mean Time to Repair (MTTR) for any network issues. Cisco NAM was integrated with Cisco Unified Service Monitor and CiscoWorks LMS for end-to-end manageability of the entire network.