by Paul Veitch, Paul Hitchen, and Martin Mitchell, BT Innovate & Design
This article explores the architectural and operational challenges involved in integrating an existing standalone core Border Gateway Protocol (BGP)/Multiprotocol Label Switching (MPLS) VPN network onto a target Next-Generation Network (NGN). The rationale for consolidating and transforming multiple networks is explained, mainly in terms of potential cost savings and operational simplification achieved by the network operator. The article specifically focuses on the MPLS Carrier-supporting-Carrier (CsC) architectural framework, which allows the serving nodes of one MPLS VPN network to be interconnected through the serving nodes of another MPLS VPN network. The required architectural building blocks to implement CsC, the manner in which routing protocols must interact, as well as end-to-end packet flow and label encapsulation are all explained. The main design and operational challenges, including maintaining performance levels for customers, network resiliency, fault-handling, and capacity management, are also addressed in this article.
Network operators are under increasing pressure to deliver exceptional levels of customer experience and service while decreasing the capital and operational cost base of their networks. Many operators have traditionally built multiple network platforms, each of which has been uniquely designed to meet the requirements of specific services targeted at specific customer markets, such as voice, broadband IP, Virtual Private Networks (VPNs), etc.
In a bid to remain competitive and achieve cost reductions and operational simplifications, many operators have built all IP-based NGNs. The principal transformational benefits of an NGN with a single protocol such as IP at its heart include versatility in catering for multiple traffic requirements (for example, by employing IP Quality-of-Service [QoS] techniques), the ability to introduce novel and reusable services and features in a flexible manner, and the potential to maximise vendor interworking due to standards-based technology.
When a network operator builds an NGN, the challenge remains as to how to migrate existing networks and customers onto the new platform. The full commercial benefits of an NGN can be properly realised only after legacy networks are either consolidated or phased out completely. Many important factors must be considered, including the cost benefits, the potential effect on end customers, and the operational approach to carrying out migrations. These concerns must be weighed against the commercial and business risks associated with the alternative approach of sustaining and running multiple standalone platforms indefinitely.
This article focuses on a specific scenario: how to integrate an existing BGP/MPLS VPN network that provides VPN services to a corporate customer base with a "target" NGN. Following a brief overview of MPLS VPN services and networks, the rationale for consolidating multiple MPLS VPN networks is explained, mainly in terms of potential cost savings and operational simplification achieved by the network operator. The article then details the MPLS CsC architectural framework that allows the serving nodes or Points of Presence (POPs) of one MPLS VPN network to be interconnected to the serving nodes of another MPLS VPN network. The way in which routing protocols must interact and the subsequent effect on end-to-end packet forwarding across a CsC-enabled core network are explained. The principal design and operational challenges introduced by integrating core MPLS networks are then outlined, including maintaining performance levels, network resiliency, fault management, and capacity management.
The Business Case for MPLS VPN Network Consolidation
VPNs are an attractive solution to serve the enterprise networking requirements of a wide range of businesses from Small-to-Medium Enterprises (SMEs) to multinational "blue-chip" corporate organisations. Essentially, VPNs provide a transparent network infrastructure that allows multiple customer sites to communicate over a shared backbone network, as though they are using their own private network, regardless of geographical location. Typical applications that run across an organisation's VPN include corporate Intranet, mail services, and Voice-over-IP (VoIP) telephony.
Although distinct categories of VPN networking technology exist, this article focuses exclusively on "Layer 3" BGP/MPLS VPNs, as defined in RFC 4364 and other related Internet Drafts. Such networks have been deployed for more than 10 years and have seen significant growth during that period.
The critical core network elements of a provider-provisioned BGP/MPLS VPN network are Provider Edge (PE) and Provider Core (P) routers, as shown in Figure 1.
PE routers terminate customer access circuits, whereas P routers perform packet forwarding and typically do not have directly connected customer access circuits. PE routers perform label encapsulation and de-encapsulation, P routers perform label switching, and both operate control-plane protocols that build MPLS Label Switched Paths (LSPs) from each PE to every other PE. Many protocols can be used to establish these LSPs; a commonly deployed approach uses the Label Distribution Protocol (LDP) in conjunction with an Interior Gateway Protocol (IGP), such as Open Shortest Path First (OSPF).
When a PE forwards a VPN-addressed packet across the core, it adds an inner MPLS label to identify the VPN of which the packet is a member and then an outer MPLS label to identify the egress PE router. Any intermediate P routers switch the packet to the egress PE using the outer label only. The egress PE uses the inner label to determine which VPN or port to forward the packet to.
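The two-label forwarding behaviour described above can be illustrated with a minimal Python sketch. It is a toy model rather than a router implementation: the label values, the `swap_table` and `vpn_table` structures, and the function names are all hypothetical, and penultimate-hop popping is ignored so that the egress PE removes both labels itself.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    dst_ip: str
    # MPLS label stack; the last element is the top (outermost) label.
    labels: list = field(default_factory=list)


def ingress_pe(packet, vpn_label, transport_label):
    """Push the inner (VPN) label first, then the outer (egress PE) label."""
    packet.labels.append(vpn_label)
    packet.labels.append(transport_label)
    return packet


def p_router(packet, swap_table):
    """A P router looks only at the outer label and swaps it."""
    packet.labels[-1] = swap_table[packet.labels[-1]]
    return packet


def egress_pe(packet, vpn_table):
    """Remove the outer label, then use the inner label to pick the VPN."""
    packet.labels.pop()
    vpn = vpn_table[packet.labels.pop()]
    return vpn, packet
```

Note that the P router never inspects the inner label or the IP header, which is why core routers need no knowledge of customer VPN routes.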
The Customer Edge (CE) router is not considered part of the provider's core network. It acts as a peer of the PE router, but not a peer of other CE routers. Each PE router supports multiple routing and forwarding tables, called Virtual Routing and Forwarding (VRF) tables. VRF tables are logically separate, and they may contain IP prefixes received from the CE router that overlap with addresses in other VRF tables. (For example, in Figure 1, VPN_A, site 1 has the same private routes as VPN_B, site 3.) VPNs are formed by assigning individual customer access circuits to a specific VRF table. Several sites on one PE can be joined either by placing them all in the same VRF table, or by allocating each site its own VRF table and controlling connectivity through selective import and export of the IP routes of each VRF table.
The PE routers use an extended variant of BGP for signaling between themselves and propagating information about the actual routes of each VPN, as well as the inner MPLS label. The extended BGP, referred to as Multiprotocol BGP, carries each VPN route together with two new fields, the Route Distinguisher (RD) and the Route Target (RT), a form of extended BGP Community.
The RD is added to each VPN route to ensure that routes from different customers are unique; BGP treats VPN routes as equal only if both the RD and the IP prefix and mask are equal. BGP uses RTs to indicate a group of routes, thus defining VPN membership information for exchange between PEs.
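As a rough illustration of how the RD and RT work together, the following Python sketch models two customers that use the same private prefix. The RD and RT values, the `VpnRoute` tuple, and the helper names are illustrative assumptions and are not taken from any particular router implementation.

```python
from collections import namedtuple

# A VPN route as carried between PEs: RD + prefix, RTs, and the inner label.
VpnRoute = namedtuple("VpnRoute", "rd prefix route_targets label")


def vpn_key(route):
    """BGP compares VPN routes on the (RD, prefix/mask) pair."""
    return (route.rd, route.prefix)


def import_routes(routes, import_rts):
    """A VRF imports a route if it carries at least one matching RT."""
    wanted = set(import_rts)
    return [r for r in routes if wanted & set(r.route_targets)]


# Two customers advertise the same private prefix (hypothetical values).
routes = [
    VpnRoute("65000:1", "10.1.0.0/16", ("65000:100",), 101),
    VpnRoute("65000:2", "10.1.0.0/16", ("65000:200",), 202),
]
```

Because the RDs differ, the two overlapping prefixes remain distinct routes, and a VRF that imports only `65000:100` sees just the first customer's route.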
Maintenance Costs of BGP/MPLS VPN Networks
As detailed in the previous section, the main core components of a VPN network based on BGP/MPLS technology are the PE and P routers. Although not shown in detail in Figure 1, another critical element of a core VPN network is the Wide-Area Network (WAN) topology that interconnects the P (core) routers residing in specific service nodes, also called POPs. The WAN topology is essentially the way in which transmission links (typically Packet over SONET/SDH [PoS], Gigabit Ethernet, or 10 Gigabit Ethernet) are used to interconnect the POPs.
It follows that maintenance costs associated with a self-contained MPLS VPN network will be incurred for PE and P routers, as well as the interconnecting WAN transmission links. These maintenance costs will split into capital and operational elements.
Capital expenditures are required on an ongoing basis for all IP router infrastructure (PE and P routers), for example, to upgrade hardware to meet increasing capacity demands, replace faulty line cards and processors, or replace end-of-life hardware with newer equipment. Capital expenditures are also needed on WAN links, for example, to replace faulty line cards and optics, as well as to deploy increased capacity transmission links to cater for traffic growth across the core network. Further capital costs accrue from accommodation-related aspects such as power, racking, and air conditioning.
Additional maintenance costs reside in the operational space. For example, if an MPLS VPN network has 40 POP locations, each with a pair of P (core) routers, the 80 core routers will consume a certain amount of operational team resources for critical maintenance, scheduled maintenance activities, and ongoing monitoring and reporting of router status (processors and line cards).
Benefits of Core Integration
If a network operator has deployed an IP-based NGN alongside an existing MPLS VPN network, the question should be asked: can the existing MPLS VPN network be integrated onto the NGN so as to avoid some or all of the previously stated maintenance costs? One approach would be to target the P (core) routers and WAN transmission links for eventual removal (Figure 2) and replacement by suitable connectivity of the MPLS VPN nodes to the NGN network. The VPN PE routers that often terminate large volumes of customer access circuits and host the rich service-related functions for corporate VPN services can essentially be left in situ, minimising the effect on end customers and confining the integration of networks to the inner part of the core infrastructure. The way in which this goal can actually be achieved in practice is detailed in the next section.
The main benefits that can be accrued for the network operator are as follows:
The combination of all these benefits can produce a compelling business case for network operators to consolidate core MPLS-based network platforms.
Carrier-supporting-Carrier (CsC) is a term used to describe a situation where one network, designated the customer carrier, is permitted to use a segment of another network, designated the backbone carrier. Although the term "Carrier of Carriers" is also used to describe the same architectural framework, this article uses Carrier-supporting-Carrier for consistency. In principle, the two "carrier" networks could belong to the same organisation, or could belong to two different organisations. Whatever the case, there is no reason why the backbone carrier cannot support multiple customer carrier networks. Furthermore, the customer carrier network itself can be either a BGP/MPLS VPN network providing Layer 3 VPN services or an Internet Service Provider (ISP) network.
A network operator with an existing BGP/MPLS VPN network infrastructure that has also built an IP-based NGN using BGP/MPLS technology as per RFC 4364 could choose to exploit the CsC architectural framework to merge the two core networks. In such a scenario, the existing BGP/MPLS VPN network that serves the needs of VPN business customers would be viewed as the "customer carrier," whereas the NGN network would be positioned as the "backbone carrier."
Physical Connectivity and CsC VRF Creation
To integrate an existing BGP/MPLS VPN network such as that shown in Figure 2 with an NGN core belonging to the same or a different organisation, the NGN network must be enabled to act as a backbone carrier. Assuming the NGN network is configured to support BGP/MPLS VPNs as per RFC 4364, it comprises PE and P router core infrastructure. The PE routers of the NGN acting as the backbone carrier are denoted "CsC-PEs." The PE routers of the existing BGP/MPLS VPN network, that is, the customer carrier network that is itself being integrated with the NGN core, are denoted "CsC-CEs."
As shown in Figure 3, the NGN backbone carrier network provides MPLS VPN service to the customer carrier network using its own VRF table enabled on the CsC-PE. One important distinction between normal MPLS VPN service and CsC is the fact that traffic passed between the CsC-CE and CsC-PE is labeled rather than native IP [3, 4].
The CsC architecture is designed such that the backbone carrier network—the network provider's NGN network—needs to know only about internal routes within the customer carrier network. This setup allows formation of full "any-to-any" logical connectivity between the customer carrier routers, which in this scenario are the PE routers of the existing BGP/MPLS VPN network providing VPN services to end customers.
Furthermore, the backbone carrier routers themselves do not need to retain route prefix information for the end-customer VPNs connected to the customer carrier network because the end-customer traffic is transported over a second level of VRF tables that bear relevance only to the customer carrier itself, that is, the endpoint CsC-CEs. This nesting of MPLS VPN networks emphasises the inherent scalability of the CsC architecture. The CsC backbone carrier is effectively behaving like "proxy" P routers for the customer carrier network.
Figure 3 also shows the physical connectivity between the customer carrier network and backbone carrier NGN. Because many large-scale BGP/MPLS network deployments comprise large numbers of PE devices in the same service node or POP, there is often a Layer 2 Ethernet switch acting as an "intra-POP" aggregator. It is convenient to allow physical connectivity between the BGP/MPLS VPN service node and the CsC-PE in the NGN network using this aggregation switch. One or more Virtual LANs (VLANs) can be configured across this physical trunk to provide logical Layer 2 connectivity into the CsC-PE on the NGN, and be associated with the CsC VRF on that device. The Layer 2 switch also provides direct intra-POP connectivity between CsC-CEs present on the same VLANs.
Control-Plane Routing Protocols
The previous section described the physical connectivity between BGP/MPLS VPN service nodes and the target NGN, with creation of a specific VRF table on the CsC-PEs. This section addresses the way in which the internal routes of the CsC-CEs (that is, the PE routers belonging to the customer carrier BGP/MPLS VPN network) are advertised into this VRF table.
Routing protocol options include an IGP such as OSPF, or an Exterior Gateway Protocol (EGP) such as BGP. With an IGP like OSPF, the routing protocol itself is used for route exchange between the CsC-CEs and CsC-PEs, and must be used in conjunction with LDP for MPLS label exchange between the CsC-CEs and CsC-PEs.
Separating IP prefix distribution and label allocation between an IGP and LDP can introduce complexity, because the two control planes can diverge. In the extreme case, such divergence can lead to partial or complete loss of forwarding. An EGP such as BGP, however, can implement CsC with a single control-plane protocol for both IP prefix and label allocation between CsC-CE and CsC-PE. Piggybacking MPLS label-mapping information in the BGP update messages helps ensure that an IP prefix and its associated MPLS label are always synchronised in their delivery. The way in which this synchronisation is achieved is documented in RFC 3107. BGP has the benefit of being a mature protocol for use either within the same network organisation or between networks belonging to different operators. Furthermore, BGP employs mechanisms for loop avoidance and control over the number and type of routes advertised and accepted.
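The difference between the two approaches can be sketched with a simplified Python model. It is not a protocol implementation: the prefixes, label values, and function names are hypothetical, and real IGP, LDP, and BGP state is far richer than these dictionaries.

```python
def usable_prefixes_igp_ldp(igp_routes, ldp_bindings):
    """With an IGP plus LDP, a prefix can be label-switched only if
    both protocols have delivered their part of the information."""
    return {p: ldp_bindings[p] for p in igp_routes if p in ldp_bindings}


def blackholed_prefixes(igp_routes, ldp_bindings):
    """Prefixes with a route but no label binding: the divergence case."""
    return sorted(p for p in igp_routes if p not in ldp_bindings)


def usable_prefixes_labeled_bgp(updates):
    """With labelled BGP, each update carries a (prefix, label) pair,
    so a prefix never arrives without its label."""
    return dict(updates)


# Hypothetical state: the IGP knows two CsC-CE loopbacks, LDP only one.
igp = {"192.0.2.1/32", "192.0.2.2/32"}
ldp = {"192.0.2.1/32": 30}
bgp_updates = [("192.0.2.1/32", 30), ("192.0.2.2/32", 31)]
```

In this model the second loopback is reachable at the IP layer but has no label in the IGP plus LDP case, whereas the labelled BGP table always contains both or neither.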
Figure 4 shows an example scenario whereby two BGP peerings are established (for resiliency) between each of the four CsC-CEs (which are actually PE routers of the BGP/MPLS VPN customer carrier network) and a pair of target CsC-PE routers (which are the PE routers of the NGN backbone carrier network).
Label Switching of Customer Packets
As shown in Figure 5, viewing packet flow from left to right, a unicast packet originates as a native IP packet when presented from the end client CE router to the MPLS VPN PE router, which is behaving as a CsC-CE in this context. Upon traversal between CsC-CEs in different MPLS VPN POP locations connected by an NGN backbone carrier using CsC, the packet ultimately undergoes three levels of label encapsulation:
As shown in Figure 5, the last P router in the backbone carrier path has "popped" the outermost label (label "D") using Penultimate-Hop Popping (PHP). The destination CsC-PE uses and removes the middle label (label "C") to indicate the correct outgoing interface, leaving only the innermost label on presentation to the CsC-CE (label "A"). This CsC-CE, which is the PE router in relation to the end VPN services, uses the last remaining label to determine the VRF table and interface on which to send the native IP packet so that it reaches the required client CE router.
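The sequence of label operations can be traced with a short Python sketch. The label names "A", "C", and "D" follow the description of Figure 5; the name "B" for the label used on the link between the ingress CsC-CE and CsC-PE is an assumption made here for illustration, as are all function names. Intermediate P routers that merely swap the outermost label are omitted.

```python
def _expect_top(stack, label):
    """Raise if the top of the label stack is not the expected label."""
    if not stack or stack[-1] != label:
        raise ValueError(f"expected top label {label!r}, got {stack!r}")


def ingress_csc_ce(stack):
    # Customer carrier PE: push the end-customer VPN label "A", then the
    # (assumed) label "B" learned from the CsC-PE for the remote CsC-CE.
    return stack + ["A", "B"]


def ingress_csc_pe(stack):
    # Backbone carrier PE: swap "B" for the middle label "C" and push the
    # outermost backbone label "D".
    _expect_top(stack, "B")
    return stack[:-1] + ["C", "D"]


def penultimate_p(stack):
    # Last P router in the backbone path pops the outermost label "D".
    _expect_top(stack, "D")
    return stack[:-1]


def egress_csc_pe(stack):
    # Destination CsC-PE uses and removes the middle label "C".
    _expect_top(stack, "C")
    return stack[:-1]


def egress_csc_ce(stack):
    # Destination CsC-CE uses label "A" to select the VRF, then sends native IP.
    _expect_top(stack, "A")
    return stack[:-1]


def trace():
    """Return the label stack (top is last) after each hop."""
    hops = [
        ("ingress CsC-CE", ingress_csc_ce),
        ("ingress CsC-PE", ingress_csc_pe),
        ("penultimate P", penultimate_p),
        ("egress CsC-PE", egress_csc_pe),
        ("egress CsC-CE", egress_csc_ce),
    ]
    stack, history = [], []
    for name, step in hops:
        stack = step(stack)
        history.append((name, list(stack)))
    return history
```

Calling `trace()` shows the stack reaching its maximum depth of three labels inside the backbone carrier and shrinking back to a native IP packet at the far CsC-CE.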
Design and Operational Challenges
The previous section outlined the architectural framework of using CsC to integrate one BGP/MPLS core network with another. This section addresses the important design and operational challenges that such a network transformation brings about.
Maintaining Performance Levels
Many existing operators of "carrier-class" BGP/MPLS networks exploit IP QoS mechanisms to allow different IP-based traffic types to be treated in different ways in terms of how the packets are conveyed across the core network. This treatment relates chiefly to prioritisation of delay, jitter, and/or loss-sensitive traffic, against traffic types that are less sensitive to loss or delay. Customers of VPN services supported on such networks generally demand support of a range of traffic types, including corporate intranet, transactional applications, mail services, data backup, video, and VoIP telephony.
To deal with the range of traffic types, BGP/MPLS VPN service providers have developed the means of supporting IP QoS by defining different transport classes with associated service levels. One such scheme might define, for instance, six service classes based on the IETF "Per-Hop Behaviours" defined by the Differentiated Services (DiffServ) working group [8, 9], together with the recommended DiffServ Code Point (DSCP) values for them. The classes in this example could be broadly described as follows:
The DSCP markings dictate the way in which such traffic is placed into queues and conveyed across the core network. At the edge of the MPLS core, the PE maps the incoming DSCP value into the 3-bit MPLS Traffic Class (TC) field (formerly known as the EXP bits).
The details of the mapping relate to the specific implementation and policy of the service provider. Under heavy traffic load and congestion situations, such policies dictate how packets are treated in terms of scheduling, queuing, and discard eligibility.
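As a hedged illustration of one possible policy, the Python sketch below copies the three most significant bits of the 6-bit DSCP into the 3-bit MPLS field (the former EXP bits). This is a common default behaviour, but it is only an assumption here; an operator's actual mapping may differ. The six code points are standard DiffServ values chosen for illustration and are not necessarily the classes used by any particular provider.

```python
# Standard DSCP values for some well-known per-hop behaviours.
DSCP = {"EF": 46, "AF41": 34, "AF31": 26, "AF21": 18, "AF11": 10, "BE": 0}


def dscp_to_mpls_bits(dscp):
    """Map a 6-bit DSCP to a 3-bit MPLS value by keeping the top three bits.

    This mirrors the class-selector portion of the DSCP; it is one
    possible policy, not the only one.
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp >> 3


mapping = {name: dscp_to_mpls_bits(value) for name, value in DSCP.items()}
```

With these inputs, each of the six code points lands on a distinct 3-bit value, so no two classes are merged at the edge of the MPLS core.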
Both the existing BGP/MPLS "customer carrier" and the target NGN "backbone carrier" networks already have their own implementation of QoS classes to allow management and prioritisation of multiple traffic types carried across their respective core infrastructures. A significant design challenge that arises with integrating the networks is that a suitable mapping of the QoS schema present on the PE routers of the customer carrier network (the CsC-CEs in earlier diagrams) to the QoS schema supported on the PE routers of the NGN (the CsC-PEs in earlier diagrams) is necessary.
It is imperative that such a mapping not compromise the existing customer experience for VPN services in terms of packet loss, packet delay, and packet jitter (that is, delay variance). Careful design, mapping of the required service levels, and ultimately end-to-end testing of the QoS mappings is therefore necessary to assure the maintenance of performance levels after the networks are integrated with CsC.
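A simple design-time check of such a mapping might look like the Python sketch below. The mapping table is entirely hypothetical, and the check assumes that a numerically higher class value means higher priority in both networks, which may not hold for a given operator's scheme. The sketch reports which customer carrier classes would share a backbone class and whether relative priority order is preserved; it does not replace end-to-end testing.

```python
def merged_classes(mapping):
    """Group customer carrier classes that land in the same backbone class."""
    groups = {}
    for src, dst in mapping.items():
        groups.setdefault(dst, []).append(src)
    return {dst: sorted(srcs) for dst, srcs in groups.items() if len(srcs) > 1}


def preserves_order(mapping):
    """True if a higher customer carrier class never maps below a lower one."""
    ordered = [mapping[src] for src in sorted(mapping)]
    return all(a <= b for a, b in zip(ordered, ordered[1:]))


# Hypothetical mapping: six customer carrier classes onto four backbone classes.
csc_ce_to_csc_pe = {0: 0, 1: 1, 2: 1, 3: 3, 4: 3, 5: 5}
```

Here classes 1 and 2, and classes 3 and 4, would share queues in the backbone, which is exactly the kind of outcome that needs to be validated against the service levels promised to VPN customers.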
Network Resiliency
As described earlier in the article and shown in Figure 2, an existing standalone BGP/MPLS network platform has interconnected POP locations using underlying core transmission infrastructures such as SONET/SDH/Dense Wavelength-Division Multiplexing (DWDM). The actual number of WAN circuits deployed, the use of transmission-layer protection mechanisms, and the overall topological connectivity between POPs determine overall levels of network resiliency. In turn, this aspect of the network architecture significantly affects the overall level of service availability to end customers of VPN services.
When the standalone BGP/MPLS network has its existing core topology replaced with that of the NGN backbone carrier, it is very important to consider the levels of resiliency delivered with the new integrated core architecture, compared with the existing standalone arrangement. Critical considerations include:
All these aspects should be assessed and incorporated into the actual design process such that there is no detrimental effect on overall levels of service availability to the end customer. Service levels can be verified by reliability modeling of the new network topology, and by comparing the results with the reliability data for the existing topology.
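A very basic form of reliability modelling can be sketched in Python using the standard series and parallel availability formulas. The sketch assumes independent failures, and the availability figures are invented for illustration; real modelling would use measured data and account for shared-risk components.

```python
from math import prod


def series(*availabilities):
    """All elements must work: multiply the individual availabilities."""
    return prod(availabilities)


def parallel(*availabilities):
    """At least one element must work: 1 minus the product of unavailabilities."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical figures: one access path is a link in series with a CsC-PE.
link, csc_pe = 0.999, 0.999
single_homed = series(link, csc_pe)
dual_homed = parallel(series(link, csc_pe), series(link, csc_pe))
```

Comparing `single_homed` with `dual_homed` for the new design, and both with the equivalent figures for the existing topology, gives a first-order view of whether availability is maintained.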
Fault Management
There are many facets of monitoring and managing a core BGP/MPLS network in terms of assurance of service, alarm detection and filtering, customer notification of faults, and so on. In a standalone network environment, it is generally the responsibility of a particular operational team to manage faults on the network and provide service continuity during various types of failure scenarios. As shown in Figure 6, this operational function usually covers all core network elements, including PE and P (core) routers, as well as the WAN topology interconnecting the service nodes or "POPs."
In an integrated core network scenario, however, part of the customer carrier network (the P [core] routers and WAN transmission links, for example) is replaced by the NGN backbone carrier. The NGN backbone carrier has its own operational team with specific processes and systems for carrying out monitoring and management of fault events. A crucial challenge arises in terms of how to realise end-to-end fault management holistically and transparently between customer carrier and backbone carrier networks (Figure 6). Important considerations include:
These topics must be factored in to determine the optimal solution for realising smooth and transparent fault-management procedures in an integrated core BGP/MPLS network environment.
Capacity Management
As shown in Figure 6, in a standalone BGP/MPLS VPN network environment, a particular operational function exists for ongoing core capacity planning to ensure P router and WAN link capacity are suitably dimensioned to cope with current and future traffic demands. When an existing BGP/MPLS VPN network becomes a customer carrier network that is integrated with a target NGN backbone using CsC, there will be a corresponding shift in responsibility for certain aspects of core capacity planning.
VPN service traffic that would have been confined to its own dedicated core network will now be offered onto the NGN backbone carrier core network. As such, the capacity-management function for the NGN backbone carrier must use traffic planning information pertaining to the VPN services in addition to all the other service types supported on the NGN. This aggregated view of traffic demands will then drive the core capacity dimensioning on the NGN backbone carrier network.
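The effect of aggregating demands can be shown with a small Python sketch. The traffic figures, the link capacity, and the 50 percent planning threshold are all hypothetical values chosen for illustration; operators set their own thresholds according to their resiliency and growth policies.

```python
def link_utilisation(demands_mbps, capacity_mbps):
    """Fraction of link capacity consumed by the sum of all service demands."""
    return sum(demands_mbps.values()) / capacity_mbps


def needs_upgrade(demands_mbps, capacity_mbps, threshold=0.5):
    """Flag a link whose planned utilisation exceeds the planning threshold."""
    return link_utilisation(demands_mbps, capacity_mbps) > threshold


# Hypothetical peak demands on one 10-Gbps NGN core link, in Mbps.
ngn_only = {"broadband": 2000, "voice": 1000}
with_vpn = dict(ngn_only, vpn=2500)
```

In this example the link is comfortably within the threshold for the NGN services alone, but crosses it once the VPN traffic from the customer carrier is added, so the upgrade decision depends on the combined forecast.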
Figure 6: Fault-Management and Capacity-Planning Functions
The MPLS-based Carrier-supporting-Carrier (CsC) framework provides network operators with a potential solution for integrating an existing BGP/MPLS VPN network with a target all-IP-based NGN. This solution should enable both capital and operational cost reduction by collapsing multiple core networks into a single NGN core domain. The article emphasised that, beyond the network architectural building blocks required to implement CsC, an integrated core network presents numerous critical design and operational challenges. These challenges include how to maintain service levels and performance metrics for existing VPN customers, resiliency, fault management, and capacity planning. It is important to note, however, that in addition to the broad topic areas covered in this article, many specific additional challenges will present themselves to network operators who have implemented BGP/MPLS VPN networks, NGN networks, or both in their own specific way.