Decision-Making Criteria Using Cisco MATE Collector and Cisco MATE Design and Their Impact on Backbone Design
What You Will Learn
The combination of over-the-top (OTT) traffic with locally sourced content is presenting new challenges in backbone design and maintenance. When customers legitimately receive content from competitors, you must consider many business rules that fall outside the scope of policy rules in traditional routing protocols, such as Border Gateway Protocol (BGP). Delivering your own content as well as content that originates outside your network increases the complexity of planning the backbone. This combination also presents a greater need to consider the dynamics of shifting traffic patterns under worst-case and failure conditions.
Business relationships and their impact on backbone design are leading to a heightened interest in business intelligence (BI) because you need a clear view of the cost of peering, beyond just ingress and egress traffic. In addition, you need to be able to model potential traffic and derive traffic engineering solutions when unexpected traffic enters your network. These requirements can be met only after you understand the traffic matrix behavior, topology, and infrastructure. This essential information is uniquely available with the Cisco™ MATE™ Collector. Supported by an accurate traffic matrix and a set of actionable information, Cisco MATE Design's infrastructure analysis can help identify misplaced traffic, and accurate optimization tools can be used to develop traffic engineering solutions when needed.
Network Modeling Workflow
Modeling a network so that the model can help resolve business and technical issues requires a holistic network view that addresses the needs of operators, network engineers, and business stakeholders. The elements that must be gathered to create this view include an up-to-date traffic matrix, automated infrastructure discovery, and accurate topology information. Taking action based on this abstracted network requires a business model that ties together the traffic, infrastructure, and commercial frameworks.
The process of network discovery using Cisco MATE Collector software includes gathering topology information from interior gateway protocols, such as Intermediate System-to-Intermediate System (IS-IS) and Open Shortest Path First (OSPF), as well as other protocol information derived from BGP and multicast databases. Traffic data from Simple Network Management Protocol (SNMP)-polled interface statistics, Flow Collection information, and other management databases is also gathered to provide a current snapshot of the infrastructure.
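As an illustration of how SNMP-polled interface statistics become traffic figures, the sketch below converts two polls of a 64-bit octet counter (such as the standard IF-MIB ifHCInOctets object) into an average bit rate and a utilization percentage. This is a minimal sketch, not the Cisco MATE Collector implementation; the function names, the sample counter values, and the 300-second polling interval are assumptions made for illustration.

```python
COUNTER64_MODULUS = 2 ** 64  # ifHCInOctets and ifHCOutOctets are 64-bit counters


def octets_to_bps(prev_octets, curr_octets, interval_seconds,
                  modulus=COUNTER64_MODULUS):
    """Average bit rate between two polls of an octet counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Modular subtraction tolerates a single counter wrap between polls.
    delta_octets = (curr_octets - prev_octets) % modulus
    return delta_octets * 8 / interval_seconds


def utilization_percent(bps, capacity_bps):
    """Interface utilization as a percentage of its capacity."""
    return 100.0 * bps / capacity_bps


# Hypothetical samples: two polls taken 300 seconds apart on a 10-Gbps interface.
rate_bps = octets_to_bps(1_000_000_000, 38_500_000_000, 300)
util = utilization_percent(rate_bps, 10_000_000_000)
print(f"{rate_bps:.0f} bps, {util:.1f} percent utilized")
```

Interface rates derived this way describe load per interface only; they do not by themselves say where traffic enters or leaves the network, which is why a traffic matrix is also needed.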
Figure 1 illustrates a three-step continual process that includes information gathering and simulation activities in capacity planning. First, Cisco MATE Collector gathers the traffic demands, the physical and logical topologies, and the network configuration. This information is then run through a routing model in Cisco MATE Design. When the network is reprovisioned, the process starts again.¹
Figure 1. Modeling the Core Network for Capacity Planning²
With an accurate traffic matrix and predictable failure rules, the combination of continual simulation, rapid simulation analysis, optimization tools, and an archive of past results for comparison gives you an arsenal to correct problems when they occur.
Importance of the Traffic Matrix
A traffic matrix is a list of demands, where a single demand represents a potential flow of traffic from a source through to a destination. Generating and maintaining an accurate traffic matrix is a key part of planning, designing, and monitoring efficient and effective IP/Multiprotocol Label Switching (MPLS) networks. However, generating a traffic matrix is also difficult for a variety of reasons.
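The definition above can be made concrete with a small data structure. The following sketch represents a traffic matrix as a list of demands and derives two simple views from it. The Demand class, the helper names, and the traffic values are hypothetical and are not taken from any Cisco MATE file format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Demand:
    """One potential flow of traffic from a source to a destination."""
    source: str
    destination: str
    traffic_mbps: float


# Hypothetical demands; the values are illustrative only.
demands = [
    Demand("Istanbul", "Northeast", 400.0),
    Demand("Istanbul", "Southeast", 350.0),
    Demand("Madrid", "Southwest", 500.0),
    Demand("Stockholm", "Northwest", 300.0),
]


def to_matrix(demand_list):
    """Collapse a demand list into {(source, destination): Mbps}."""
    matrix = defaultdict(float)
    for d in demand_list:
        matrix[(d.source, d.destination)] += d.traffic_mbps
    return dict(matrix)


def ingress_totals(demand_list):
    """Total traffic entering the network at each source."""
    totals = defaultdict(float)
    for d in demand_list:
        totals[d.source] += d.traffic_mbps
    return dict(totals)


matrix = to_matrix(demands)
totals = ingress_totals(demands)
print(matrix[("Istanbul", "Northeast")], totals["Istanbul"])
```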
Traditional means of creating traffic matrices can often be inaccurate in the following circumstances:
• If BGP neighbors do not follow conventional routing rules (longest/shortest exit).
• If nodes are trimmed so the traffic you have better matches the network you are modeling. This strategy can affect the accuracy of the model, because the traffic you have collected is for a larger and different network than the one you are modeling.
• If an autonomous system pushes ingress traffic on your network in unexpected locations.
Most traffic collectors have a steep learning curve and are complex to administer and expensive to scale.
How Content Delivery Networks Affect Traffic
Traffic patterns are changing because of large-scale content delivery network (CDN) aggregation. Traditional methods of resolving peering arrangements in these situations, such as longest exit (honoring BGP Multi-Exit Discriminators [MEDs]), are typically not enough.
Other localization methods are often brought into the mix, such as BGP communities, IP address geo-location, DNS resolution, and static mappings. With this battery of tools, steady-state operations are usually fine, but surprises can still occur under failure.
OTT video consumes an unpredictable amount of external bandwidth, and it is not uncommon for port utilization to increase from 50 to 90 percent in a short period of time. As the popularity of OTT services increases, the bandwidth they consume will increase correspondingly. Service providers carrying content, especially a combination of their own and OTT, need to be wary of what might happen under a failure state. In addition, they must be prepared if a CDN chooses not to localize content.
Because most major networks now have CDNs, delivery of this traffic becomes a key factor for service providers, in terms of both the CDN cost and the network cost. Effective network planning involves considering the effects of failure analysis and creating a network that can absorb short-term growth. The increase in these hybrid models of local and elsewhere-originated content is making forecasting more difficult during initial buildouts. Planning and design difficulties become even more severe as product offerings in the form of cloud-based applications evolve more rapidly. It is hard for the network augmentation cycle to keep pace. And if traffic peaks from multiple content sources coincide, as they might with second-screen viewing technology, predictability is that much tougher.
MATE Collector Flow Collection
Among traffic, topology, and infrastructure, traffic has been the most difficult to characterize. To address this challenge, Cisco MATE Collector introduces the Flow Collection option, which builds a demand traffic matrix for IP and MPLS network planning and traffic engineering. Flow Collection maps a flow's source and destination IP addresses to ingress and egress routes in the network.
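One plausible way to picture the mapping step is a longest-prefix match of each flow's addresses against a table of known prefixes, followed by aggregation into demands. The sketch below uses that approach as an assumption; it is not a description of how Flow Collection works internally. The prefix table, the flow records, and the function names are all hypothetical, and the addresses come from the documentation ranges reserved for examples.

```python
import ipaddress
from collections import defaultdict

# Hypothetical prefix table: prefix -> node where that prefix attaches.
PREFIX_TO_NODE = {
    "198.51.100.0/24": "Istanbul",   # CDN peering location
    "203.0.113.0/24": "Madrid",      # CDN peering location
    "192.0.2.0/24": "Southeast",     # regional network
    "192.0.2.128/25": "Southwest",   # more specific prefix wins
}
TABLE = [(ipaddress.ip_network(p), node) for p, node in PREFIX_TO_NODE.items()]


def lookup(address):
    """Return the node for the longest matching prefix, or None."""
    ip = ipaddress.ip_address(address)
    matches = [(net.prefixlen, node) for net, node in TABLE if ip in net]
    return max(matches, key=lambda m: m[0])[1] if matches else None


def flows_to_demands(flows):
    """Aggregate (src_ip, dst_ip, byte_count) records into {(ingress, egress): bytes}."""
    demands = defaultdict(int)
    for src_ip, dst_ip, byte_count in flows:
        ingress, egress = lookup(src_ip), lookup(dst_ip)
        if ingress is None or egress is None:
            continue  # flow cannot be placed on the modeled network
        demands[(ingress, egress)] += byte_count
    return dict(demands)


# Hypothetical flow records.
flows = [
    ("198.51.100.10", "192.0.2.20", 1000),
    ("198.51.100.11", "192.0.2.200", 500),
    ("203.0.113.5", "192.0.2.130", 700),
    ("198.51.100.12", "192.0.2.21", 250),
    ("10.0.0.1", "192.0.2.1", 99),  # source prefix unknown, so skipped
]
flow_demands = flows_to_demands(flows)
print(flow_demands)
```

Flows whose source or destination cannot be mapped are skipped here for simplicity; a production collector would more likely report them so that the prefix table can be corrected.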
Figure 2 shows the workflow for collecting flow data from the external network and the Flow Collection server.
Figure 2. Flow Collection in MATE Collector
The Cisco MATE Collector tools integrate with an external configuration table and a state-recording snapshot process. The end result is a Cisco MATE plan file containing flow-based demands.³
Use Case: Balancing Content from Multiple Networks
Planning backbone networks can be incredibly difficult when ingress traffic is unpredictable. You can use Cisco MATE Design and MATE Collector to identify when, where, and in what quantities ingress traffic originates from a CDN.
For example, consider a Europe-based service provider whose network consists of a backbone network and four regional delivery networks: Northwest, Northeast, Southwest, and Southeast. An external CDN peers with this network from locations in Istanbul, Madrid, Moscow, and Stockholm. Each peering location is supposed to source the nearest regional network. This network and the peering locations are shown in Figure 3.
Figure 3. View of Service Provider Backbone Network (Left) and CDN Peering Locations (Right)
Handling Internal Failures That Affect Multiple Networks
A fiber cut between Istanbul and Cairo has occurred. Because fiber cuts can take several days or more to repair, traffic engineering is needed in the meantime. The backbone network has been designed to withstand a fiber cut, and simulation of this failure has determined that the network can continue to uphold all service requirements. Some interfaces are running warm, but no core interface is more than 82 percent utilized (Figure 4).
Figure 4. Service Provider Fiber Cut Between Istanbul and Cairo
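The kind of failure simulation described above can be sketched in a few lines: route every demand over the lowest-metric path, sum the load on each interface, remove the failed circuit, and repeat. The example below is a simplified stand-in for a routing model, not Cisco MATE Design itself. It uses plain Dijkstra routing without equal-cost multipath, and the three-node topology, metrics, capacities, and demand values are hypothetical numbers chosen only so that the busiest interface reaches 82 percent after the cut.

```python
import heapq
from collections import defaultdict


def shortest_path(links, src, dst):
    """Dijkstra over directed links {(a, b): metric}; returns a list of hops or None."""
    adjacency = defaultdict(list)
    for (a, b), metric in links.items():
        adjacency[a].append((b, metric))
    best, previous, queue = {src: 0}, {}, [(0, src)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == dst:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, metric in adjacency[node]:
            new_cost = cost + metric
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if dst not in best:
        return None
    path, node = [], dst
    while node != src:
        path.append((previous[node], node))
        node = previous[node]
    return path[::-1]


def simulate(circuits, demands, failed=frozenset()):
    """Return {(a, b): utilization percent} for every surviving interface."""
    links, capacity = {}, {}
    for (a, b), (metric, cap_mbps) in circuits.items():
        if frozenset((a, b)) in failed:
            continue  # a failed circuit removes both directions
        for iface in ((a, b), (b, a)):
            links[iface], capacity[iface] = metric, cap_mbps
    load = defaultdict(float)
    for (src, dst), mbps in demands.items():
        path = shortest_path(links, src, dst)
        if path is None:
            raise ValueError(f"demand {src}->{dst} cannot be routed")
        for iface in path:
            load[iface] += mbps
    return {iface: 100.0 * load[iface] / capacity[iface] for iface in capacity}


# Hypothetical circuits: (end A, end B) -> (IGP metric, capacity in Mbps).
circuits = {
    ("Istanbul", "Cairo"): (10, 1000),
    ("Istanbul", "Rome"): (10, 1000),
    ("Rome", "Cairo"): (10, 1000),
}
# Hypothetical demands in Mbps.
demands = {("Istanbul", "Cairo"): 600, ("Istanbul", "Rome"): 220}

normal = simulate(circuits, demands)
after_cut = simulate(circuits, demands, failed={frozenset(("Istanbul", "Cairo"))})
print(max(normal.values()), max(after_cut.values()))
```

In this toy case the Istanbul-to-Cairo demand reroutes through Rome after the cut, so the Istanbul-to-Rome interface carries both demands.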
Identifying Peers Not Upholding Agreements
One hour later, the interface between Istanbul and Rome suddenly reaches a utilization level past 100 percent, indicating probable packet loss (Figure 5).
Figure 5. Fiber Cut Between Istanbul and Cairo with High Interface Utilization
What caused this increased congestion? Using Cisco MATE Design, you can quickly identify the interface from the content provider in Istanbul (Figure 6).
Figure 6. Interface from CDN to Service Provider Backbone Network
Looking at past plan files, you can see that the content provider usually sources the Northeast and Southeast regional networks from Istanbul (Figure 7).
Figure 7. Usual Demands from CDN on Istanbul Interface
However, after viewing the demands on the interface, it is evident that the peer is now sourcing demands to the Southwest regional location as well, probably because it is responding to latency increases caused by the fiber cut (Figure 8).
Figure 8. Actual Demands from CDN on Istanbul Interface
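The comparison between the usual and actual demands amounts to a difference between two demand sets. The sketch below shows that idea with a baseline taken from a past snapshot and a current snapshot. The demand keys, traffic values, function name, and the 25 percent growth threshold are hypothetical, and the code does not read real Cisco MATE plan files.

```python
def compare_demands(baseline, current, growth_threshold=0.25):
    """Return (new, grown) demands relative to a baseline snapshot.

    new:   demands present now but absent from the baseline
    grown: demands whose traffic grew by more than growth_threshold
    """
    new = {key: mbps for key, mbps in current.items() if key not in baseline}
    grown = {
        key: (baseline[key], mbps)
        for key, mbps in current.items()
        if key in baseline
        and baseline[key] > 0
        and (mbps - baseline[key]) / baseline[key] > growth_threshold
    }
    return new, grown


# Hypothetical demands (Mbps) entering on the CDN interface in Istanbul.
usual = {
    ("CDN-Istanbul", "Northeast"): 400.0,
    ("CDN-Istanbul", "Southeast"): 350.0,
}
actual = {
    ("CDN-Istanbul", "Northeast"): 410.0,
    ("CDN-Istanbul", "Southeast"): 350.0,
    ("CDN-Istanbul", "Southwest"): 300.0,  # not in the usual set
}

new_demands, grown_demands = compare_demands(usual, actual)
print(new_demands, grown_demands)
```

Separating brand-new demands from demands that merely grew helps distinguish a peer that has changed where it sources traffic from ordinary traffic growth.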
A network operations team could be confounded by this sudden congestion. Network operations often ends up chasing traffic to develop on-the-spot traffic engineering solutions without addressing the underlying issues of unauthorized ingress traffic.
Traditional methods of solving congestion involve Interior Gateway Protocol (IGP) metrics or Label Switched Path (LSP) optimization. However, by changing metrics or optimizing LSPs, operations might unknowingly place other areas of the network at risk. The real issue for the service provider is one of dialogue with the CDN. If a service provider knew ahead of time where, and in what quantity, the CDN was pouring excess traffic into the network, traffic engineering solutions could be developed much more readily.
Direction of Business Intelligence for Inter-Provider Communication
Figure 9 illustrates the CDN's change in peering locations. The former demand path is shown with a solid blue arrow, and the new demand path is shown with a dashed arrow.
Figure 9. CDN Causing Congestion and Increasing the Distance the Backbone Carries Traffic
Not only has the content provider caused congestion on the service provider's backbone network, it has also caused the backbone network to carry the traffic a much greater distance, as reflected in the Path Metric column in Figure 8. The CDN should be contacted immediately to negotiate a new peering agreement.
After the agreement is renegotiated, the network maintains its service-level agreements (SLAs), even in the event of a fiber cut. Figure 10 shows the network's worst-case circuit analysis, with no interfaces increasing beyond 100 percent utilization.
Figure 10. Backbone Network After Renegotiating the Peering Agreement
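A worst-case analysis of this kind can be thought of as taking, for each interface, the highest utilization seen across all simulated failure scenarios. The sketch below assumes the per-scenario utilizations have already been computed. The scenario names, interface names, and percentages are hypothetical, and the function is an illustration rather than the Cisco MATE Design algorithm.

```python
def worst_case(results):
    """Map each interface to (highest utilization percent, scenario that caused it)."""
    worst = {}
    for scenario, utilization in results.items():
        for iface, pct in utilization.items():
            if iface not in worst or pct > worst[iface][0]:
                worst[iface] = (pct, scenario)
    return worst


# Hypothetical simulation results: scenario -> {interface: utilization percent}.
results = {
    "no failure": {"Istanbul-Rome": 22.0, "Istanbul-Cairo": 60.0, "Rome-Cairo": 0.0},
    "Istanbul-Cairo cut": {"Istanbul-Rome": 82.0, "Rome-Cairo": 60.0},
    "Rome-Cairo cut": {"Istanbul-Rome": 22.0, "Istanbul-Cairo": 60.0},
}

worst = worst_case(results)
over_limit = {iface: info for iface, info in worst.items() if info[0] > 100.0}
print(worst)
print("interfaces beyond 100 percent:", over_limit)
```

An empty over_limit result corresponds to the condition shown in Figure 10, where no interface exceeds 100 percent utilization under any simulated failure.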
In the near future, infrastructure software-defined networking (SDN) solutions will allow a quicker centralized resolution to traffic issues, particularly with the ability to orchestrate changes based on the global network view provided by Cisco MATE Design.
Service providers face several challenges regarding ingress traffic from CDNs. Cisco MATE Collector with Flow Collection and Cisco MATE Design make it easier for you to identify and manage unexpected ingress traffic from CDNs and develop equitable peering arrangements. The Cisco MATE portfolio is the industry leader, with solutions for profitability, planning, collection, and monitoring in today's global networks.
¹ For more information, see the white paper Best Practices in Core Network Capacity Planning at http://www.cisco.com/