The Internet Protocol Journal, Volume 14, No.1

Transitioning Protocols

Geoff Huston, APNIC

In the previous article, I looked at some common myths associated with the transition to IPv6. In this article I would like to look behind the various opinions and perspectives about this transition, and examine in a little more detail the nature of the technologies being proposed to support the transition to IPv6.

After some time of hearing dire warnings about the imminent exhaustion of the stocks of available IPv4 address space, we have now achieved the first milestone of address exhaustion, the depletion of the central pool of Internet Assigned Numbers Authority (IANA)-managed address space. The last five /8s were handed out from IANA to the Regional Internet Registries (RIRs) on February 3, 2011. After some years of industrywide general inattention and inaction with IPv6, perhaps it is not unexpected to now see a panicked response along the lines of "Maybe we should do something now!"

But what exactly should be done? It is one thing to decide to "support" IPv6 in a network, but quite another to develop a specific plan, complete with specific technologies, timelines, costs, vendors, and a realistic assessment of the incremental risks and opportunities. Although working through some of this detail has the normal levels of uncertainty that you would expect to see in any environment that is undergoing constant change and evolution, an additional level of uncertainty here is a by-product of the technology itself.

There is not just one approach to adding support for IPv6 in your network, but many. And it is not just one major objective you need to address—incremental deployment of IPv6 as a second protocol into your operational network without causing undue disruption to existing services—but two, because the second challenging objective is how to fuel continued growth in your network service platform when the current supply lines of readily available IPv4 addresses are effectively exhausted.

When?

The most common question I have heard recently is: "How long do we have?"

The remaining pools of IPv4 address space continue to be drawn down. At the start of February 2011, the IANA pool was fully depleted, with the final allocation of IPv4 addresses to the RIRs [1].

A model based on monthly address demand now predicts that the next 18 months or so will see the first three RIRs depleted of IPv4 addresses.

The Asia Pacific Network Information Centre (APNIC) was the first RIR to exhaust its available pool of IPv4 addresses in April 2011, with the RIPE Network Coordination Centre (RIPE NCC) predicted to follow in late 2011 and the American Registry for Internet Numbers (ARIN) in early 2012. The Latin American and Caribbean Internet Addresses Registry (LACNIC) is predicted to follow in 2014, and the African Network Information Centre (AFRINIC) in 2016.

The good news is that many people have been busy thinking about these intertwined objectives of extending the useful lifetime of IPv4 in the Internet and simultaneously undertaking the IPv6 transition, and there is a wealth of possible measures you can take, and a broad collection of technologies you can use. Fortunately, we are indeed spoiled for choice here!

The not-so-good news is that there is no simple single path to follow. Each individual network needs to carefully consider the transition and select an approach that matches its particular circumstances. For an industry used to playing "follow the leader" for many years, a variety of choice is not always appreciated. And, unfortunately, we are spoiled for choice here.

Let's look at each of the major transitional technologies that are currently in vogue, and examine their respective strengths and weaknesses and their intended area of applicability. We will look at these technologies first from the perspective of the end user and then from the other side, examining options for Internet Service Providers (ISPs).

The Dual-Stack ISP Client

If your service provider provides a dual-stack service with both IPv6 and IPv4, then your task should be relatively straightforward. If you configure your modem or router with IPv6 in addition to IPv4, you are finished, assuming of course that your local modem or router unit actually supports IPv6—an assumption that may not be valid in many of the older and, unfortunately, many of the currently available devices.

The conventional approach in this form of environment is to use IPv6 Prefix Delegation, where the ISP provides the client with an IPv6 prefix, usually a /48 or a /56 IPv6 address prefix, which is then passed into the client network through an IPv6 Router Advertisement. Local hosts should be constructed to configure their IPv6 stack automatically, and your system should be connected as a dual-protocol system.
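
To make the arithmetic of prefix delegation concrete, here is a minimal sketch in Python, assuming a documentation prefix as the delegated /56; it enumerates the /64 subnets that the client network can assign to its internal links.

    # Sketch: carving /64 subnets out of an assumed delegated /56 prefix.
    # 2001:db8:1234:5600::/56 is a documentation prefix, not a real allocation.
    import ipaddress

    delegated = ipaddress.IPv6Network("2001:db8:1234:5600::/56")

    # A /56 delegation yields 256 /64 subnets for the client's internal links.
    subnets = list(delegated.subnets(new_prefix=64))
    print(len(subnets))     # 256
    print(subnets[0])       # 2001:db8:1234:5600::/64
    print(subnets[1])       # 2001:db8:1234:5601::/64

A /48 delegation would yield 65,536 such /64 subnets in the same manner.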

You probably do, however, need to be aware of some caveats, of which the most important is likely to relate to the probable absence of a Network Address Translation (NAT) [2] function in IPv6. Currently most commercial IPv4 Internet services assign a single IP address to each client.

To allow this address to be shared within the client's network, most IPv4 "edge" devices autoconfigure themselves as NAT devices, permitting outgoing connections using the Transmission Control Protocol (TCP) or User Datagram Protocol (UDP), and allowing some Internet Control Message Protocol (ICMP) message types to traverse the NAT, but not much else. For many clients this NAT configuration becomes the default local security framework, because it permits outbound TCP and UDP connections to be made but does not permit externally initiated incoming sessions. With IPv6 the local network is generally configured with an entire subnet, and instead of sitting behind a NAT, this subnet is directly connected to the Internet.

The local network is then in a mixed situation of being behind a NAT in IPv4, but directly connected to the Internet using IPv6. This asymmetric configuration with respect to IPv4 and IPv6 raises some questions about the security of your local network. You need to think about adding filter rules to the gateway IPv6 configuration that perform the same level of access control for your local site that you have already set up with IPv4 and the NAT, so that the exposure of your internal network to the broader Internet is directly comparable in both protocols.

The IPv4-Only ISP Client

Even today, when the IPv4 pools are rapidly depleting, it is really not very common to have an ISP offering dual-stack IPv4 and IPv6 services. Let's look at the more common situation, when your ISP is still offering only IPv4. As an end user, can you still set up some form of IPv6 access?

The answer is "Yes," but you must use tunnels, and the story can get somewhat ugly.

6to4 Tunnels

If you have public IPv4 addresses on your local network, you may elect to configure your local system to use the 6to4 Tunneling Protocol.

6to4 is an autotunneling protocol coupled with an addressing structure. The IPv6 address of a 6to4-reachable host begins with the IPv6 prefix 2002::/16. The address architecture embeds the 32-bit IPv4 address of the end host into the next 32 bits, so that the IPv6 address carries the "equivalent" IPv4 address within it.
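
A minimal sketch of this embedding, assuming a documentation IPv4 address as the host's public address: the 6to4 /48 prefix is simply 2002::/16 followed by the 32 bits of the IPv4 address.

    # Sketch: deriving the 6to4 /48 prefix from a public IPv4 address.
    # 192.0.2.1 is a documentation address used purely as an example.
    import ipaddress

    def sixtofour_prefix(ipv4: str) -> ipaddress.IPv6Network:
        v4 = int(ipaddress.IPv4Address(ipv4))
        # 2002::/16 in the top 16 bits, the IPv4 address in the next 32 bits
        prefix = (0x2002 << 112) | (v4 << 80)
        return ipaddress.IPv6Network((prefix, 48))

    print(sixtofour_prefix("192.0.2.1"))    # 2002:c000:201::/48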

To send an IPv6 packet, the local host must first tunnel through the local IPv4 network. To perform this tunneling, the local host encapsulates the IPv6 packet in an outer IPv4 packet header. The IP protocol used is neither TCP nor UDP, but protocol 41, an IP protocol number reserved for tunneling IPv6 packets (RFC 2473) [3].

The IPv4 packet is addressed to an IPv4-to-IPv6 relay. To avoid manual configuration of each client, all these relays share the same anycast address, 192.88.99.1. These relays strip the outer IPv4 packet header off the packet and forward the IPv6 packet into the IPv6 network. The IPv6 destination treats the packet normally, and generates a packet in response without any special processing.

The reverse path to a 6to4 host uses an IPv6-to-IPv4 relay. The IPv6 address of the 6to4 local host starts with the 2002::/16 prefix, so the IPv6 packet that is being sent back to this host has a destination address that uses the 2002::/16 6to4 prefix. A route to this prefix is advertised into the IPv6 network by IPv6-to-IPv4 relays, so the packet is delivered to the nearest such relay. When a relay receives a packet destined to a 2002::/16 address, it lifts the IPv4 address from inside the IPv6 address. It then wraps the IPv6 packet in an IPv4 packet header, using this extracted IPv4 address as the destination address and protocol 41 as the IP protocol. The resultant IPv4 packet is then passed to the 6to4 host in the IPv4 network (Figure 1).

Figure 1: 6to4 Tunneling Architecture
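
The reverse operation performed by an IPv6-to-IPv4 relay can be sketched in the same way (addresses are again illustrative): the relay recovers the embedded IPv4 address from the 2002::/16 destination and uses it as the destination of the outer IPv4 header.

    # Sketch: what an IPv6-to-IPv4 relay does with a packet destined to 2002::/16,
    # recovering the IPv4 tunnel endpoint embedded in the IPv6 address.
    import ipaddress

    def embedded_ipv4(dst: str) -> ipaddress.IPv4Address:
        v6 = ipaddress.IPv6Address(dst)
        assert v6 in ipaddress.IPv6Network("2002::/16"), "not a 6to4 address"
        # bits 111..80 of the IPv6 address hold the embedded IPv4 address
        return ipaddress.IPv4Address((int(v6) >> 80) & 0xFFFFFFFF)

    # The relay then emits an IPv4 packet, protocol 41, to this address.
    print(embedded_ipv4("2002:c000:201::1"))    # 192.0.2.1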

If the local network uses public IPv4 addresses, then individual hosts on the local network may use 6to4 directly. Of course the local gateway then needs to be configured to accept incoming IP packets that use protocol 41.

An alternative is to configure the gateway device of the local network as a 6to4 gateway, using the IPv4 address on the ISP side of the gateway to form a common 6to4 prefix for the local network. The gateway then advertises this synthetic 48-bit IPv6 prefix to the interior network with a conventional IPv6 Router Advertisement. The gateway can couple this advertisement with a NAT function and provide native IPv6 to interior hosts that are configured with RFC 1918 [4] local IPv4 addresses.

In general, 6to4 is a relatively poor approach to provisioning IPv6, and you really should avoid it if at all possible. Indeed, your experience will probably be better overall if you continue running IPv4 and avoid accessing IPv6 with 6to4!

The major concern here is that a successful connection relies on the assistance of both an outbound and an inbound 6to4 third-party relay. On the IPv4 side a 6to4 connection relies on the presence of a usable route to an IPv4-to-IPv6 relay, and preferably one that is as close as possible to the IPv4 endpoint. On the IPv6 side a 6to4 connection relies on a usable relay advertising a route to 2002::/16. Again, to avoid extended path overheads, this relay should be as close as possible to the IPv6 endpoint. This path asymmetry can cause connection "black holes," where one party can deliver packets to the other but not the reverse.

Also, such configurations have problems if the IPv4 host is configured with stateful filters that insist that the IPv4 source address in incoming packets match the destination address of outgoing packets, a condition that does not necessarily hold in a 6to4 connection, because the outbound and inbound relays are generally different devices with different IPv4 addresses.

Finally, it seems that many sites operate with firewall filters that disallow incoming packets other than TCP and UDP (and possibly some forms of ICMP). The 6to4 packets use protocol 41, and there appears to be widespread use of filter rules that block such packets.

Tunneling also adds an outer packet header, inflating the packet by 20 bytes when the IPv4 header is attached. This expansion increases the risk of encountering Path Maximum Transmission Unit (MTU) "black holes" on path elements of the network that cannot carry the larger packets.

Teredo Tunnels

If the local network is behind an IPv4 NAT and the NAT gateway does not support 6to4, then all is not lost, because another form of tunneling could possibly be an answer. Teredo is described in RFC 4380 [5].

Teredo, like 6to4, is an autotunneling protocol coupled with an addressing structure. Like 6to4, Teredo uses its own address prefix, and all Teredo addresses share a common IPv6 /32 address prefix, namely 2001:0000::/32. The next 32 bits are the IPv4 address of the Teredo server. The IPv6 interface identifier field is used to support NAT traversal: it encodes a flags field describing the NAT type, the external UDP port number used by the NAT binding for the client, and the external IPv4 address used by the NAT binding for the client, both as seen from the outside.
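
A sketch of this encoding, using made-up server and client values: per RFC 4380 the external port and address are stored in bit-inverted (obfuscated) form, so decoding a Teredo address involves undoing that inversion.

    # Sketch: decoding the fields of a Teredo IPv6 address (RFC 4380).
    # The external (mapped) port and address are stored bit-inverted.
    import ipaddress

    def parse_teredo(addr: str):
        v = int(ipaddress.IPv6Address(addr))
        assert (v >> 96) == 0x20010000, "not a Teredo address"
        server   = ipaddress.IPv4Address((v >> 64) & 0xFFFFFFFF)        # Teredo server
        flags    = (v >> 48) & 0xFFFF                                    # NAT type flags
        ext_port = ((v >> 32) & 0xFFFF) ^ 0xFFFF                         # de-obfuscate port
        ext_addr = ipaddress.IPv4Address((v & 0xFFFFFFFF) ^ 0xFFFFFFFF)  # de-obfuscate address
        return server, flags, ext_port, ext_addr

    # Address built from made-up values: server 192.0.2.45,
    # external NAT binding 203.0.113.9:40000.
    print(parse_teredo("2001:0:c000:22d:0:63bf:34ff:8ef6"))
    # -> (IPv4Address('192.0.2.45'), 0, 40000, IPv4Address('203.0.113.9'))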

Teredo uses what has become a relatively conventional approach to NAT traversal, using a simplified version of the Session Traversal Utilities for NAT (STUN) [6] active probing approach to determine the type of NAT; it uses concepts of "clients," "servers," and "relays."

A Teredo client is a dual-stack host that is located in the IPv4 world, assumed to be located behind a NAT. A Teredo server is an address and reachability broker that is located in the public IPv4 Internet, and a Teredo relay is a Teredo tunnel endpoint that connects Teredo clients to the IPv6 network. The tunneling protocol used by Teredo is not the simple IPv6-in-IPv4 protocol 41 used by 6to4. NAT devices are sensitive to the transport protocol and generally pass only TCP and UDP transport protocols. In the Teredo case the tunneling uses UDP, so all IPv6 Teredo packets are composed of an IPv4 packet header and a UDP transport header, followed by the IPv6 packet as the UDP payload. Teredo also uses ICMPv6 [7] message exchanges to set up a connection; these exchanges are carried in the same IPv4-and-UDP encapsulation.

It should be noted that this reliance on ICMPv6 to complete an initial protocol exchange and confirm that the appropriate NAT bindings have been set up is not a conventional feature of IPv4 or even IPv6, and IPv6 firewalls that routinely discard ICMP messages will disrupt communications with Teredo clients.

Figure 2: Teredo Tunneling

The exact nature of the packet exchange in setting up a Teredo connection depends on the nature of the NAT device that sits in front of the Teredo client. Figure 2 shows an example packet exchange that Teredo uses when the client is behind a Restricted NAT.

Teredo represents a different set of design trade-offs as compared to 6to4. In its desire to be useful in an environment that includes NAT functions in the IPv4 path, Teredo is a per-host connectivity approach, as compared to the 6to4 approach, which can support both individual hosts and entire end sites within the same technology. Also, Teredo is a host-centric multiparty rendezvous application, and Teredo clients require the existence of dual-stack Teredo servers and relays that exist in both the public IPv4 and IPv6 networks. Teredo is more of a connectivity tool than a service solution, and one that is prone to many forms of operational failure.

On the other hand, if you are an isolated IPv6 host behind an IPv4 NAT and you want to access the IPv6 network, then 6to4 is not an option, and you either have to set up static tunnels across the NAT to make it all work or turn on Teredo in your dual-stack host; if everything goes according to theory, you should be able to establish IPv6 connectivity. In practice, however, it is highly likely that the IPv6 Teredo connection will fail in strange ways, and, like 6to4, this is a technology best avoided!

Tunnel Brokers

In contrast to these autotunnel approaches, the simplest form of tunneling IPv6 packets over an IPv4 network is the manually configured IPv6-in-IPv4 tunnel.

Here an IPv6 packet is simply prefixed by a 20-octet IPv4 packet header. In the outer IPv4 packet header, the source address is the IPv4 address of the tunnel ingress, the destination address is the IPv4 address of the tunnel egress, and the IP protocol field uses value 41, indicating that the payload is an IPv6 packet. The packet is passed across the IPv4 network from tunnel ingress to egress using conventional IPv4 packet forwarding, and at the egress point the IPv4 IP packet header is removed and the inner IPv6 packet is routed in an IPv6 network as before. From the IPv6 perspective the transit across the IPv4 network is a single logical hop.
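
As a rough sketch of the encapsulation step (addresses are illustrative and the payload is a dummy IPv6 packet), the tunnel ingress simply prefixes a 20-octet IPv4 header, with the protocol field set to 41, to the IPv6 packet.

    # Sketch: building the outer IPv4 header (protocol 41) that a tunnel ingress
    # prefixes to an IPv6 packet. Addresses are documentation examples.
    import struct, socket

    def ipv4_checksum(header: bytes) -> int:
        total = sum(struct.unpack("!10H", header))
        total = (total & 0xFFFF) + (total >> 16)
        total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF

    def encapsulate(ipv6_packet: bytes, src: str, dst: str) -> bytes:
        header = struct.pack("!BBHHHBBH4s4s",
                             0x45, 0, 20 + len(ipv6_packet),  # version/IHL, TOS, total length
                             0, 0,                             # identification, flags/fragment
                             64, 41, 0,                        # TTL, protocol 41, checksum (0 for now)
                             socket.inet_aton(src),
                             socket.inet_aton(dst))
        checksum = struct.pack("!H", ipv4_checksum(header))
        return header[:10] + checksum + header[12:] + ipv6_packet

    # A 40-byte dummy IPv6 header as payload; the egress strips the first 20 octets.
    outer = encapsulate(b"\x60" + b"\x00" * 39, "192.0.2.1", "198.51.100.1")
    print(len(outer))    # 60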

Alternatively, like Virtual Private Network (VPN) tunnels, the tunnel can be configured using UDP or TCP, and with some care, the tunnel can be configured through NAT functions in the same way as VPN tunnels can be configured through NAT functions.

The advantage of this approach is that the need to manually configure the tunnel endpoints ensures that the tunnel relay function is not provided, intentionally or unintentionally, by third parties through some well-intentioned, but ultimately random, act of goodwill. The need to perform a manual configuration also reduces the chances that the tunnel will be broken through local firewall filters.

Of course the need to perform a manual configuration does not lend itself to a "plug-and-play" environment, nor is this approach a viable one for a larger mass market of consumer devices and services.

Client Conclusions

None of these approaches to offering IPv6 connectivity to end hosts behind an IPv4-only service provider offers the same level of robustness and performance as native IPv4 services. All of these approaches require a significant degree of local expertise to set up and maintain, and they often require a solid understanding of other aspects of the local environment, such as firewall and filter conditions and Path MTU behavior. With the exception of the tunnel broker approach, they also require third-party assistance to support the connection, further adding to the set of potential performance and reliability concerns.

It appears that the most robust and reliable way to provision IPv6 to end hosts is for the service provider to provision IPv6 as an integral part of its service offering, and offer clients a dual-stack service in both IPv4 and IPv6.

IPv6 for Internet Service Providers

Although the "self-help" autotunneling approaches for clients outlined earlier in this article are a possible answer, their utility is appropriately restricted to a very small number of end clients who have the necessary technical expertise and who are willing to debug some rather strange resultant potential problems relating to asymmetric paths, third-party relays, potential MTU mismatches, and interactions with filters. This approach is not a reasonable one for the larger Internet.

From the perspective of the mass market for Internet Services, we cannot assume that clients have the motivation, expertise, and means to bypass their ISP and set up IPv6 access on their own, either through autotunneling or manually configured tunnels. The inference from this observation is that for as long as the mass-market ISPs do not commit to IPv6 services, and for as long as they continue to stall in deploying services supporting dual access for their clients, the entire IPv6 transition story remains effectively stalled.

How can ISPs support IPv6 access for their clients?

The Dual-Stack Service Network

Perhaps it is obvious, but the most direct response here is for the ISP to operate a Dual-Stack Network.

And the most direct way to achieve this operation is for the ISP's infrastructure to also support IPv6 wherever there is IPv4, so that the delivery of services to the ISP's clients in IPv6 faithfully replicates the service offered in IPv4.

This solution implies that the network needs to support IPv6 in the ISP's routing infrastructure, in the network data plane, in the load-management systems, in the operational support infrastructure, in access and accounting, and in peering and in transit. In short, wherever there is IPv4 there needs to be IPv6.

The infrastructure elements that require dual-stack service at the next level include the routing and switching elements, including the internal and external routing protocols. The task includes negotiating peering and transit services in IPv6 to complement those in IPv4. Network infrastructure also includes VPN support and other forms of tunnels, as well as data center front-end units, including load balancers, filters and firewalls, and various virtualized forms of service provision. The task also includes integration of IPv6 into the network management subsystem and the related network measurement and reporting system. A comprehensive audit of the supported Management Information Bases (MIBs) in the active elements of the network, to ensure that the relevant IPv6 MIBs are supported, is also an essential task. A similar task is associated with equipping the server infrastructure with IPv6 support, and at the higher levels of the protocol stack are the various applications, including web services, mail, the Domain Name System (DNS), authentication and accounting, Voice over IP (VoIP) servers, load balancers, cloud servers, and similar applications.

And those are just the common elements of most ISPs' infrastructures. Every ISP also has more specialized elements in its service portfolio, and each one of these elements also requires a comprehensive audit to ensure that there is an IPv6 solution for each of these elements that leads to a comprehensive dual-stack outcome.

As obvious as this approach might appear, it has two significant problems. First, it requires a comprehensive overhaul of every element in the ISP's service network. Even for small-scale ISPs this overhaul is not trivial, and for larger service provider platforms it is an exercise that may take months if not years and make considerable inroads into the operating budgets of the ISPs. Second, it still does not account for the inevitable fact that in the coming months the current supply lines of IPv4 addresses will end, and any continued expansion of the service platform will require some different approaches to the way in which IPv4 addresses are deployed in the service platform.

Although the approach of simply provisioning IPv6 alongside IPv4 in a simple dual-protocol service infrastructure may appear to be the most obvious response to the need to transition to IPv6, it may not necessarily be the most appropriate response for many ISPs to the dual factors of IPv6 transition and IPv4 address exhaustion.

Are there alternative approaches for ISPs? Of course.

Hybrid Approaches

Saying that an ISP must deploy IPv6 across all of its infrastructure and actually doing it are often quite different. The cost of converting all parts of an ISP's operation to run in dual-stack mode can be quite high, and the benefit of running every aspect of an ISP's service offering in dual-stack mode is dubious at best.

Are there middle positions here? Is it possible for an ISP to deliver robust IPv6 services to clients while still operating an IPv4-only internal network? One way to look at an ISP's network is as a transit conduit (Figure 3).

Figure 3: Generic ISP Packet Transit Architecture

The ISP needs to be able to accept packets from an external interface, determine the appropriate egress point for the packet within the context of the local network, and then ensure that the packet is passed out this egress interface. The internal network need not operate in the same protocol context as the protocol of the packets the network is handling. Viewed in terms of the minimal essentials, the network needs some protocol-specific capability at its ingress points in order to determine the appropriate egress point for each incoming packet, and thereafter, during transit across the service provider's network, it needs only to maintain the association between the packet and this preselected egress point. If the network uniformly supports the same protocol as the packet, then the same egress decision can be made at each forwarding point within the network.

Alternatively, the packet can be encapsulated with an outer wrapper that identifies the egress point using the same protocol context as that used by the service provider's internal switching elements, and the packet can be passed through the service provider's transit network using only this temporary wrapper to determine the sequence of forwarding decisions. Multiprotocol Label Switching (MPLS) networks are an excellent example of this form of approach, as are other forms of IP-in-IP encapsulation. The advantage of this approach is that the internal infrastructure of the service provider network need not be altered to support additional carriage protocols: the changes to specifically support IPv6 are required only at the network ingress elements, and a basic encapsulation stripping function is used at all egress points.

With this information in mind, let's look at some of these hybrid approaches to supporting IPv6 in a service provider network.

6RD

6RD, described in RFC 5969 [8], is an interesting refinement of the 6to4 approach. It shares the same basic encapsulation protocol and the same technique of embedding the IPv4 tunnel endpoint into the IPv6 address. However, it removes the concept of third-party relays and the use of the common 2002::/16 IPv6 prefix, and instead uses the provider's own IPv6 prefix. The effect of these changes is to limit the scope of the tunneling mechanism to tunneling across the network infrastructure of a single provider, and the intended function is to tunnel from the Customer Premises Equipment (CPE) to 6RD Border Relays operated by the customer's ISP (Figure 4).

Figure 4: 6RD Tunneling
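
A sketch of the 6RD prefix derivation, using a documentation provider prefix and CE address as assumed inputs: the customer's delegated prefix is the provider's 6RD prefix followed by the (non-shared) bits of the CE's IPv4 address.

    # Sketch: deriving a customer's 6RD delegated prefix (RFC 5969) from the
    # provider's 6RD prefix and the CE's IPv4 address. Values are examples.
    import ipaddress

    def sixrd_delegated_prefix(sp_prefix: str, ce_ipv4: str,
                               ipv4_mask_len: int = 0) -> ipaddress.IPv6Network:
        sp = ipaddress.IPv6Network(sp_prefix)
        v4 = int(ipaddress.IPv4Address(ce_ipv4))
        v4_bits = 32 - ipv4_mask_len                  # how many IPv4 bits are embedded
        suffix = v4 & ((1 << v4_bits) - 1)            # low-order bits of the IPv4 address
        plen = sp.prefixlen + v4_bits
        prefix = int(sp.network_address) | (suffix << (128 - plen))
        return ipaddress.IPv6Network((prefix, plen))

    # Provider 6RD prefix 2001:db8::/32, CE address 192.0.2.1, no shared IPv4 bits:
    print(sixrd_delegated_prefix("2001:db8::/32", "192.0.2.1"))    # 2001:db8:c000:201::/64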

If 6to4 is not recommended for use because of high failure rates of connections and suboptimal performance, then why would 6RD be any better?

The most compelling reason to believe that 6RD will perform more reliably than 6to4 is that 6RD removes the wild-card third-party relay element from the picture. For outbound traffic the CPE provides the tunnel encapsulation, which is, hopefully, under the ISP's operational control. The IPv6-in-IPv4 tunnel is directed to the ISP's own 6RD Border Relay rather than the 6to4 relay anycast address. Because this process is also under the ISP's direct operational control, it eliminates the outbound third-party relay function. For the reverse path, the use of the provider’s own IPv6 prefix in 6RD, instead of the generic 2002::/16 prefix, ensures that the inbound packets are sent through IPv6 directly to the ISP, and the IPv6-in-IPv4 tunnel is again limited to a hop across the ISP's own internal infrastructure.

As long as the ISP effectively manages all CPE devices, and as long as the CPE itself is capable of supporting the configuration of additional functional modules that can deliver unicast IPv6 to the client and 6RD tunnels inward to the ISP, then 6RD is a viable option for the ISP. At the cost of upgrading the CPE set to include 6RD support, and the cost of deployment of 6RD Border Relays that terminate these CPE tunnels, together with IPv6 transit from these Border Relays, the ISP is in a position to provide dual-stack support to its client base from an internal network platform that remains an IPv4 service platform, thereby deferring the process of conversion of its entire network infrastructure base to support IPv6.

For ISPs seeking to defray the internal infrastructure IPv6 conversion costs over a number of years, or for ISPs seeking an incremental path to IPv6 support that allows the existing infrastructure to remain in place temporarily, 6RD can be an interesting and cost-effective alternative to a comprehensive dual-stack deployment, as long as the ISP has some mechanism to load the CPE with IPv6 support and 6RD relay functions.

MPLS and 6PE

The 6RD approach has many similarities to MPLS, in that an additional header is added to incoming packets at the network boundary, and the encapsulation effectively directs the packet to the appropriate network egress point (as identified by ingress), where the encapsulation is stripped and the original packet is passed out.

Rather than using an IPv4 header to direct a packet from ingress to egress, if the network is already using MPLS, why not simply support IPv6 on an existing MPLS network as a set of Provider Edge (PE)-to-PE MPLS paths and bypass the IPv4 step?

Why not, indeed, and RFC 4659 [9] describes how this bypass can be achieved.

If you are running an MPLS network, then the role of the interior routing protocol and label distribution function is to maintain viable paths between all network ingress and egress points. The protocol-specific function in such networks is not the interior network topology management function, but the maintenance of the mapping of egress to protocol-specific destination addresses (Figure 5).

Figure 5: MPLS and 6PE

As with 6RD, if the local problem is some form of prohibitive barrier to the immediate deployment of IPv6 in a dual-stack configuration across the network infrastructure, then this approach allows an IPv4 MPLS network to set up paths across its existing infrastructure from provider edge to provider edge. These paths may be used to tunnel IPv6 packets across the network by associating the IPv6 destination address of the incoming packet with the IPv4 address of the egress router, using the interior Border Gateway Protocol (iBGP) Next-Hop address, for example.
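
A conceptual sketch of the state held at a 6PE ingress router (all names and values are illustrative, not an actual router implementation): IPv6 prefixes learned through iBGP resolve to the IPv4 address of the egress PE, which in turn resolves to an MPLS label, so the IPv4-only core never examines an IPv6 header.

    # Conceptual sketch (illustrative values): a 6PE ingress maps an IPv6
    # destination to the IPv4 iBGP next hop (the egress PE), and that next hop
    # to an MPLS transport label; the IPv4-only core switches on labels.
    import ipaddress

    # iBGP-learned IPv6 routes: destination prefix -> IPv4 next hop (egress PE)
    ipv6_routes = {
        ipaddress.IPv6Network("2001:db8:aaaa::/48"): ipaddress.IPv4Address("192.0.2.11"),
        ipaddress.IPv6Network("2001:db8:bbbb::/48"): ipaddress.IPv4Address("192.0.2.12"),
    }

    # Transport labels toward each egress PE (learned via LDP or RSVP-TE)
    transport_labels = {
        ipaddress.IPv4Address("192.0.2.11"): 3001,
        ipaddress.IPv4Address("192.0.2.12"): 3002,
    }

    def forward(ipv6_dst: str):
        dst = ipaddress.IPv6Address(ipv6_dst)
        for prefix, next_hop in ipv6_routes.items():   # longest-match omitted for brevity
            if dst in prefix:
                return next_hop, transport_labels[next_hop]
        return None

    print(forward("2001:db8:aaaa::1"))    # (IPv4Address('192.0.2.11'), 3001)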

The incremental changes to support IPv6 are constrained to adding IPv6 to the service provider’s iBGP routing infrastructure, and to the provider-edge devices in the MPLS network, while all other parts of the service provider's service platform can continue to operate as an MPLS IPv4 network for now.

IPv4 Address Compression

It is not just the challenge of adding a new protocol to the existing IPv4 network infrastructure that confronts ISPs. The entire reason for this activity is the prospect of exhaustion of supply of IPv4 addresses. When this prospect was first aired, in 1990, it was assumed that the Internet would be supported by industry players that acted rationally in terms of common interests.

One of the more critical assumptions made in the development of transitional tools was that transition activity would be undertaken well in advance of IPv4 address exhaustion. Competitive interest would see each actor making the necessary investments in new technologies to mitigate the risks of attempting to operate a network in an environment of acute general scarcity of addresses. As much fun as the debate as to whom the "last" IPv4 address should be given might be, it was assumed that this event was, in fact, never going to happen. The assumption was that industry actors would anticipate this situation and take the necessary steps to avoid it. The transition to IPv6 would be effectively complete well before the stocks of IPv4 addresses had been exhausted, and IPv4 addresses would be an historical artefact well before we needed to use the last one!

Obviously, this scenario has not happened.

This industry is going to exhaust the available supplies of IPv4 addresses well before the transition to IPv6 is complete—and in some cases well before the transition process has even commenced! This situation creates an additional challenge for ISPs and the Internet, and raises a further question as well. The challenge is to fold into this dual-stack transition the additional factor of having to work with fewer and fewer IPv4 addresses as the transition process continues. This situation implies that the necessary steps that the ISP must take include ones that increase the intensity of use of each IPv4 address and, wherever possible, substitute private-use IPv4 addresses for public IPv4 addresses.

The question that this scenario raises is one of guessing how long this hybrid model of an Internet where a significant proportion of network services and network clients remains entrenched in an IPv4-only world will persist. For as long as such IPv4-only network domains persist, and for as long as these IPv4-only network domains encompass significant service and customer populations, all the other parts of the Internet are forced to maintain residual IPv4 capability and cannot transition their customers and services to an IPv6-only environment. Students of economic game theory may see some rich areas of study in this developing situation.

More practically, for an ISP the question becomes one of attempting to understand how long this hybrid period of attempting to operate a dual-stack network with continuing postexhaustion demand for further IPv4 addresses will last. Will an after-market for the redistribution of addresses emerge? How will the increasing scarcity pressure affect pricing in such a market? How long will demand persist for IPv4 addresses in the face of escalating prices? Will the industry turn to IPv6 in a rapid surge in response to cost escalation for additional IPv4 addresses, or will a dual-stack transition lumber on for many years? In the large, diverse, heterogeneous environment of today's Internet, the one constant factor is that the immediate future of the Internet is clouded with extremely high levels of uncertainty.

The cumulative effect of the individual decisions made by service providers, enterprises, carriers, vendors, policy makers, and consumers has created a somewhat chaotic environment that adds a significant level of uncertainty and associated investment risk into the current planning process for ISPs.

Carrier-Grade NATs

I have often heard it said that address scarcity in IPv4 is nothing new, and it first occurred when the first NAT device that supported port mapping was deployed. At this point the concept of address sharing was introduced to the Internet, and, from the perspective of the NAT industry, we have not looked back since.

In today's world NATs are extremely commonplace. Most clients are provisioned with a single address from their ISP, which they then share across their local network using a NAT. Whether it is well advised or not, NATs typically form part of a client's network security framework, and they often are an integral part of a customer’s multihoming configuration if the client uses multiple providers.

But in this model of NATs as the CPE, the ISP uses one IPv4 address for each client. If the ISP wants to achieve greater levels of address compression, then it is necessary to share a single IPv4 address across multiple customers.

The most direct way to achieve this scenario is for ISPs to operate their own NAT, variously termed a Carrier-Grade NAT (CGN), a Large-Scale NAT (LSN), or NAT444. This approach is the simplest, and, in essence, is a case of "more of the same" (Figure 6).

Figure 6: Carrier-Grade NATs

The Carrier-Grade NAT allows a single public address to be shared across multiple clients, who, in turn, further share this address across the end systems in their local networks.

From behind the CPE in the client edge network not much has changed with the addition of the CGN in terms of application behavior. It still requires an outbound packet to trigger a binding that would allow a return packet through to the internal destination, so nothing has changed there. Other aspects of NAT behavior, notably the NAT binding lifetime and the form of NAT "cone behavior" for UDP, take on the more restrictive of the two NAT functions in sequence. The binding times are potentially problematic in that the two NATs are not synchronized in terms of binding behavior. If the CGN has a shorter binding time, it is possible for the CGN to misdirect packets and cause application-level problems. However, this situation is not overly different from a single-level NAT environment where aggressively short NAT binding times also run the risk of causing application-level problems when the NAT drops the binding for an active session that has been quiet for an extended period of time.

However, one major assumption is broken in this structure, namely that an IP address is associated with a single customer. In the CGN model a single public IP address may be used by many customers at once, albeit on different port numbers. This scenario has obvious implications for some current practices in filters, firewalls, "black" and "white" lists, and some forms of application-level security and credentials where the application makes an inference about the identity and associated level of trust in the remote party based on the remote party's IP address.

This approach is not without its potential operational problems as well. For the service provider, service resiliency becomes a critical concern in so far as moving traffic from one NAT-connected external service to another will cause all the current sessions to be dropped. Another concern is one of resource management in the face of potentially hostile applications. For example, an end host infected with a virus may generate a large amount of probe packets to a large range of addresses. In the case of a single edge NAT, the large volumes of bindings generated by this behavior become a local resource-management problem because the customer's network is the only affected site. In the case where a CGN is deployed, the same behavior will consume port-binding space on the CGN and, potentially, can starve the CGN of external address port bindings. If this problem is seen to be significant, the CGN would need to have some form of external address rationing per internal client in order to ensure that the entire external address pool is not consumed by a single errant customer application.

The other concern here is one of scalability. Whereas the most effective use of the CGN in terms of efficiency of usage of external addresses occurs when the greatest numbers of internal edge NATed clients are connected, there are some real limitations in terms of NAT performance and address availability when a service provider wants to apply this approach to networks where the customer population is in the millions or larger. In this case the service provider must use an IPv4 private address pool to number every client. But if network 10 is already used by each customer as its "internal" network, then what address pool can be used for the service provider’s private address space? One of the few answers that come to mind is to deliberately partition the network into numerous discrete networks, each of which can be privately numbered from 172.16.0.0/12, allowing for some 600,000 or so customers per network partition, and then use a transit network to "glue" together the partitioned elements.

The advantage of the CGN approach is that nothing changes for the customer. There is no need for any customers to upgrade their NAT equipment or change it in any way, and for many service providers this motivation is probably sufficient to choose this path. The disadvantages of this approach lie in the scaling properties when looking at very large deployments, and the concerns of application-level translation, where the NAT attempts to be "helpful" by performing Deep Packet Inspection and rewriting what it thinks are IP addresses found in packet payloads. Having one NAT do this process is bad enough, but loading them up in sequence is a recipe for trouble.

Are there alternatives?

The Address-plus-Port Approach

One NAT in the path is certainly worse than none from the perspective of application agility and functions. And two NAT functions do not make it any better! Inevitably, that second NAT device adds some additional levels of complexity and fragility into the process.

The question is, can these two NAT functions be collapsed back into a single NAT, yet still allow sharing of public IPv4 addresses across multiple end clients? CPE NAT devices currently map connections into the 16-bit port field of the single external address. If the CPE NAT could be coerced into performing this mapping into, say, 15 bits of the port field, then the external address could be shared between two edge CPEs, with the leading bit of the port field denoting which CPE. Obviously, moving the bit marker further across the port field will allow more CPE devices to share the one address, but it will reduce the number of available ports for each CPE in the process.

The theory is again quite simple. The CPE NAT is dynamically configured with an external address, as happens today, and a port range, which is the additional constraint. The CPE NAT performs the same function as before, but it is now limited in terms of the range of external port values it can use in its NAT bindings to those that lie within the provided port range. Other CPE devices are concurrently using the same external IP address, but with a different port range.

For outgoing packets this scenario implies only a minor change to the network architecture, in that the RADIUS exchange to configure the CPE now must also provide a port range to the CPE device. The CPE is then constrained such that as it maps private addresses and TCP or UDP port values to the external address and port values, the mapped port value must fall within the configured range.

The handling of incoming packets is more challenging. Here the service provider must forward the packet based not only on the destination IP address, but also on the port value in the TCP or UDP header, because there are now multiple CPE egress points that share the same IP address. A convenient way to perform forwarding is to take the Dual-Stack Lite approach and use an IPv4-in-IPv6 tunnel between the CPE and the external address-plus-port (A+P) gateway. This address-plus-port gateway needs to be able to associate each address and port range with the IPv6 address of a CPE (which it can learn dynamically as it decapsulates outgoing packets that are similarly tunneled from the CPE to the address-plus-port gateway). Incoming packets are encapsulated in IPv6 using the IPv6 destination address that it has learned previously. In this manner the NAT function is performed just once, at the edge, much as it is today, and the interior device is a more conventional form of tunnel server (Figure 7).

Figure 7: Address-plus-Port Approach
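
A sketch of the forwarding state such an address-plus-port gateway might hold (all values are illustrative): incoming IPv4 packets are matched on both destination address and destination port, and then tunneled in IPv6 to the CPE that owns that port range.

    # Sketch (illustrative values): an A+P gateway forwards incoming IPv4 packets
    # by destination address *and* port, tunneling them in IPv6 to the right CPE.
    import ipaddress

    # (shared public IPv4 address, (first port, last port)) -> CPE IPv6 tunnel endpoint,
    # learned dynamically from each CPE's outbound tunneled traffic.
    ap_bindings = {
        ("203.0.113.10", (1024, 2047)): ipaddress.IPv6Address("2001:db8::101"),
        ("203.0.113.10", (2048, 3071)): ipaddress.IPv6Address("2001:db8::102"),
    }

    def cpe_for(dst_addr: str, dst_port: int):
        for (addr, (lo, hi)), cpe in ap_bindings.items():
            if addr == dst_addr and lo <= dst_port <= hi:
                return cpe       # encapsulate the IPv4 packet in IPv6 toward this CPE
        return None              # no binding: drop, or hand off to a companion CGN

    print(cpe_for("203.0.113.10", 2100))    # 2001:db8::102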

This approach relies on every CPE device being able to operate using a restricted port range, to perform IPv4-in-IPv6 tunnel ingress and egress functions, and act as an IPv6 provisioned endpoint for the service provider network. This set of constraints is perhaps unrealistic for many service provider networks. Further modifications to this model propose the use of an accompanying CGN operated by the service provider to handle those CPE devices that cannot support this address-plus-port function.

This approach has some positive aspects. Pushing the NAT function back to the network edge has some considerable advantage over the approach of moving the NAT to the interior of the network. The packet rates are lower at the edge, allowing for commodity computing to process the NAT functions across the offered packet load without undue stress. Control of NAT behavior through the Internet Gateway Device protocol, part of the Universal Plug and Play (UPnP) framework, will still function in an environment of restricted port ranges. Aside from the initial provisioning process to equip the CPE NAT with a port range, the CPE and the edge environment are largely the same as in today's CPE NAT model.

That is not to say that this approach is without its negative aspects, and it is unclear as to whether the perceived benefits of a "local" NAT function outweigh the problems in this particular model of address sharing. The concept of port "rationing" is a very suboptimal means of address sharing, given that when a CPE is assigned a port range, those port addresses are unusable by any other CPE. The prudent service provider would assign to each CPE a port address pool equal to some estimate of peak demand, so that, for example, each CPE would be assigned some 1024 ports, allowing a single external IP address to be shared across only some 60 such CPE clients. The Carrier-Grade NAT and Dual-Stack Lite approaches do not attempt this form of rationed allocation, allowing the port address pool to be treated as a common resource, with far higher levels of usage efficiency. The leverage obtained in terms of efficiently using these additional 16 bits of address space is reduced by the imposition of a fixed boundary between customer and service provider use. The central NAT model effectively pools the port address range and would result in more efficient sharing of this common pool across a larger client base.

The other consideration here is that this approach means a higher overhead for the service provider, in that the service provider would have to support both "conventional" CPE equipment and address-plus-port equipment. In other words, the service provider will have to deploy a CGN and support customer CPE using a two-level NAT environment in addition to operating the address-plus-port infrastructure. Unless customers would be willing to pay a significant price premium for such address-plus-port service, it is unlikely that this option would be attractive for the service provider as an additional cost above the CGN cost.

Dual-Stack Lite

The concept behind the Dual-Stack Lite approach is that the service provider's network infrastructure will need to support IPv6 running in native mode in any case, so is there a way in which the service provider can continue to support IPv4 customers without running IPv4 internally?

Here the customer NAT is effectively replaced by a tunnel ingress-egress function in the Dual-Stack Lite home gateway. Outgoing IPv4 packets are not translated, but are encapsulated in an IPv6 packet header, which contains a source address of the carrier side of the home gateway unit, and a destination address of the ISP's gateway unit. From the service provider's perspective, each customer is no longer uniquely addressed with an IPv4 address, but instead is addressed with a unique IPv6 address, and provided with the IPv6 address of the provider's combined IPv6 tunnel egress point and IPv4 NAT unit (Figure 8).

Figure 8: Dual-Stack Lite

The service provider's Dual-Stack Lite gateway unit will perform the IPv6 tunnel termination and a NAT translation using an extended local binding table. The NAT "interior" address is now a 4-tuple of the IPv4 source address, protocol ID, and port, plus the IPv6 address of the home gateway unit, while the external address remains the triplet of the public IPv4 address, protocol ID, and port. In this way the NAT binding table contains a mapping between interior "addresses," which consist of an IPv4 address and port plus a tunnel identifier, and public IPv4 exterior addresses. This way the NAT can handle a multitude of overlapping net 10 addresses, because they can be distinguished by their different tunnel identifiers.
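
A sketch of this extended binding table (values are illustrative): because the interior key includes the IPv6 address of the customer's tunnel, two customers can use the same net 10 address and port without ambiguity.

    # Sketch (illustrative values): a Dual-Stack Lite gateway binding table whose
    # interior key includes the customer's tunnel (IPv6) address, so that
    # overlapping net 10 addresses remain distinguishable.

    # (tunnel IPv6 addr, private IPv4 src, protocol, src port) -> (public IPv4, public port)
    bindings = {
        ("2001:db8::a1", "10.0.0.1", "tcp", 5000): ("203.0.113.7", 40001),
        ("2001:db8::b2", "10.0.0.1", "tcp", 5000): ("203.0.113.7", 40002),
    }

    # Two customers using the same private address and port are kept apart by
    # their distinct tunnel identifiers and mapped to distinct public ports.
    print(bindings[("2001:db8::a1", "10.0.0.1", "tcp", 5000)])    # ('203.0.113.7', 40001)
    print(bindings[("2001:db8::b2", "10.0.0.1", "tcp", 5000)])    # ('203.0.113.7', 40002)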

The resultant output packet following the stripping of the IPv6 encapsulation and the application of the NAT function is an IPv4 packet with public source and destination addresses. Incoming IPv4 packets are similarly transformed, where the IPv4 packet header is used to perform a lookup in the Dual-Stack Lite gateway unit, and the resultant 4-tuple is used to create the NAT-translated IPv4 packet header plus the destination address of the IPv6 encapsulation header.

The advantage of this approach is that there now needs to be only a single NAT in the end-to-end path, because the functions of the customer NAT are now subsumed by the carrier NAT. This scenario has some advantages in terms of those messy "value-added" NAT functions that attempt to perform deep packet inspection and rewrite IP addresses found in data payloads. There is also no need to provide each customer with a unique IPv4 address, public or private, so the scaling limitations of the dual-NAT approach are also eliminated. The disadvantages of this approach lie in the need to use a different CPE device—or at least one that is reprogrammed. The device now requires an external IPv6 interface and at the minimum an IPv4/IPv6 tunnel gateway function. The device can also include a NAT if so desired, but it is not required in terms of the basic Dual-Stack Lite architecture.

This approach pushes the translation into the interior of the network, where the greatest benefit can be derived from port multiplexing, but it also creates a critical hotspot for the service itself. If the Dual-Stack Lite NAT fails in any way, the entire customer base is disrupted. It seems somewhat counterintuitive to create a resilient end-to-end network with stateless switching environments and then place a critical stateful unit right in the middle!

Protocol Translation

So far we have looked at two general forms of approach to hybrid networks that are intended to support both IPv6 transition and greater levels of address usage in IPv4, namely address mapping and tunneling. A third approach lies in the area of protocol translation.

RFC 2765 [10] contains the details of a relatively simple protocol-translation mechanism. The approach relies on the basic observation that IPv6 did not make any radical changes to the basic IP architecture of IPv4, and that it was therefore possible to define a stateless mapping algorithm that could translate between certain IPv4 and IPv6 packets. Of course the one major problem here is that there are far more addresses in IPv6 than in IPv4, so the approach used was to map each IPv4 address into the trailing 32 bits of an IPv6 address formed with the prefix ::FFFF:0:0/96. The approach assumed that to the IPv6-only end host the entire IPv4 network was visible within this mapped IPv6 prefix, and that when the IPv6-only end host wished to communicate with a remote host that was addressed using this IPv4-mapped prefix, it would use a source address also drawn from the same IPv4-mapped prefix. In other words, it assumed that all IPv6-only hosts were also assigned a unique IPv4 address.
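
The address mapping itself can be sketched directly (the IPv4 address shown is a documentation example): the IPv4 address occupies the trailing 32 bits of an IPv6 address under the ::FFFF:0:0/96 prefix.

    # Sketch: the stateless translation view of an IPv4 address as the trailing
    # 32 bits of an IPv6 address under ::FFFF:0:0/96. 192.0.2.1 is an example.
    import ipaddress

    def v4_to_mapped_v6(ipv4: str) -> ipaddress.IPv6Address:
        return ipaddress.IPv6Address((0xFFFF << 32) | int(ipaddress.IPv4Address(ipv4)))

    def mapped_v6_to_v4(ipv6: str) -> ipaddress.IPv4Address:
        return ipaddress.IPv4Address(int(ipaddress.IPv6Address(ipv6)) & 0xFFFFFFFF)

    print(v4_to_mapped_v6("192.0.2.1"))         # ::ffff:c000:201, i.e., ::ffff:192.0.2.1
    print(mapped_v6_to_v4("::ffff:192.0.2.1"))  # 192.0.2.1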

The NAT-Protocol Translation (NAT-PT) approach attempted to relax this constraint, allowing IPv6-only hosts to use a dynamic mapping to a public IPv4 address through the NAT-PT function, in the same way as NAT functions work in an all-IPv4 domain (Figure 9). The proposed approach assumed that the local host was located behind a modified DNS environment where the IPv4 "A" record of an IPv4-only remote service is translated by the DNS gateway into a local IPv6 address in which the initial 96 bits identify the internal address of the NAT-PT gateway and the trailing 32 bits are the IPv4 address of the remote service. When the local host then uses this address as an IPv6 destination address, the packet is directed by the local routing environment to the NAT-PT device. This device can construct an "equivalent" IPv4 packet by using the local IPv4 address as the source address and the last 32 bits of the IPv6 address as the destination address, and bind the IPv6 source address and port to this IPv4 source address and a free local port value. This set of transforms can be locally stored as an active NAT binding. Return IPv4 packets can be mapped back into their "equivalent" IPv6 form by using the values in the binding to perform a reverse set of transforms on the IP address and port fields of the packet.

This approach was published as RFC 2766 [11] in February 2000. Some 7 years later, in July 2007, the IETF published RFC 4966 [12], deprecating NAT-PT to "historic" status, with an associated list of applications that would not operate correctly through such a device. This negative judgement of NAT-PT seems rather curious to me, given that conventional CPE NAT functions in IPv4 appear to share most, if not all, of the same shortfalls listed in RFC 4966. Given the extensive set of compromises required in an environment that is partially crippled by IPv4 address exhaustion, it seems rather contradictory to insist upon extremely high levels of function and robustness from these hybrid translation approaches.

Figure 9: NAT Protocol Translation – NAT64

Not surprisingly, NAT-PT is undergoing a revival, this time under the name "NAT64." Not much has changed from the basic approach outlined in NAT-PT. The IPv6-only client performs a DNS lookup through a modified DNS server that is configured with DNS64. If the queried name has only an IPv4 address (an A record), the DNS64 server synthesises an IPv6 response by merging the prefix address of the NAT64 gateway with the IPv4 address. When the client uses this address, the IPv6 packet is directed to the NAT64 gateway, and the same transform as described previously for NAT-PT takes place.
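
A sketch of the DNS64 synthesis step, assuming the gateway is configured with a /96 prefix (the well-known NAT64 prefix 64:ff9b::/96 from RFC 6052 is used here purely as an example): the A record's IPv4 address is placed in the trailing 32 bits of the gateway prefix to form the synthesized AAAA answer.

    # Sketch: DNS64 synthesis of an AAAA record from an A record, assuming the
    # NAT64 gateway uses a /96 prefix (the well-known 64:ff9b::/96 is used here).
    import ipaddress

    NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")

    def synthesize_aaaa(a_record: str) -> ipaddress.IPv6Address:
        v4 = int(ipaddress.IPv4Address(a_record))
        return ipaddress.IPv6Address(int(NAT64_PREFIX.network_address) | v4)

    def extract_ipv4(aaaa: str) -> ipaddress.IPv4Address:
        # The NAT64 gateway recovers the IPv4 destination from the last 32 bits.
        return ipaddress.IPv4Address(int(ipaddress.IPv6Address(aaaa)) & 0xFFFFFFFF)

    synthesized = synthesize_aaaa("192.0.2.1")
    print(synthesized)                      # 64:ff9b::c000:201
    print(extract_ipv4(str(synthesized)))   # 192.0.2.1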

This setup is similar to the CGN model, in so far as the service provider operates a common NAT that shares an IPv4 address pool across a set of end clients.

ISP Conclusions

There really is no single clear path forward from this point. Different ISPs will see some advantages in pursuing different approaches to this dual problem of introducing IPv6 into their service portfolio and at the same time introducing additional measures that allow more efficient use of IPv4 addresses.

However, one common theme is becoming clear. So far ISPs have been able to "externalize" many of these problems by pushing much of the complexity and fragility of NAT functions out to the customer and loading up the CPE with these functions. This approach of externalizing much of the complexity of address compression in NAT functions over to the customer's network cannot be sustained with the IPv6 transition, and no matter which approach is used, whether it is a CGN, NAT64, Dual-Stack Lite, 6RD, or MPLS with 6PE, the ISP now has to actively participate in the delivery of IPv6 and in increasing the efficiency of the use of IPv4.

So for the ISP it is time to start making some technical choices as to how to address the combination of these two rather unique challenges of transition and exhaustion.

References

[1] Daniel Karrenberg, Gerard Ross, Paul Wilson, and Leslie Nobile, "Development of the Regional Internet Registry System," The Internet Protocol Journal, Volume 4, No. 4, December 2001.

[2] Geoff Huston, "Anatomy: A Look inside Network Address Translators," The Internet Protocol Journal, Volume 7, No. 3, September 2004.

[3] Alex Conta and Stephen Deering, "Generic Packet Tunneling in IPv6 Specification," RFC 2473, December 1998.

[4] Yakov Rekhter, Bob Moskowitz, Daniel Karrenberg, Geert Jan de Groot, and Eliot Lear, "Address Allocation for Private Internets," RFC 1918, February 1996.

[5] Christian Huitema, "Teredo: Tunneling IPv6 over UDP through Network Address Translations (NATs)," RFC 4380, February 2006.

[6] Jonathan Rosenberg, Rohan Mahy, Philip Matthews, and Dan Wing, "Session Traversal Utilities for NAT (STUN)," RFC 5389, October 2008.

[7] Alex Conta, Stephen Deering, and Mukesh Gupta, "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification," RFC 4443, March 2006.

[8] Mark Townsley and Ole Troan, “IPv6 Rapid Deployment on IPv4 Infrastructures (6rd) – Protocol Specification,” RFC 5969, August 2010.

[9] Jeremy De Clercq, Dirk Ooms, Marco Carugi, and Francois Le Faucheur, "BGP-MPLS IP Virtual Private Network (VPN) Extension for IPv6 VPN," RFC 4659, September 2006.

[10] Erik Nordmark, "Stateless IP/ICMP Translation Algorithm (SIIT)," RFC 2765, February 2000.

[11] George Tsirtsis and Pyda Srisuresh, "Network Address Translation – Protocol Translation (NAT-PT)," RFC 2766, February 2000.

[12] Cedric Aoun and Elwyn Davies, “Reasons to Move the Network Address Translator – Protocol Translator (NAT-PT) to Historic Status,” RFC 4966, July 2007.

Further Reading

The IETF has been working on the issues related to the transition to IPv6 for the past 18 years, and in the intervening period has generated many hundreds of documents. In selecting the following documents as a helpful reading list, I have tried to select only from the more recent documents and those that are overviews of transition technologies rather than reference specifications for individual technologies.

[1] Jari Arkko and Fred Baker, "Guidelines for Using IPv6 Transition Mechanisms during IPv6 Deployment," Internet Draft, Work in Progress, December 2010.

The document discusses the IPv6 deployment models and migration tools, and considers what appears to be effective in networks to date. This Internet Draft, draft-arkko-ipv6-transition-guidelines-14.txt, is about to be published as an Informational RFC.

[2] Brian Carpenter and Sheng Jiang, "Emerging Service Provider Scenarios for IPv6 Deployment," RFC 6036, October 2010.

This document describes practices and plans that are emerging among Internet Service Providers for the deployment of IPv6 services, using data collected in a survey of numerous ISPs carried out in early 2010.

[3] Reinaldo Penno, Tarun Saxena, Mohamed Boucadair, and Senthil Sivakumar, "Analysis of 64 Translation," Internet Draft, Work in Progress, draft-ietf-behave-64-analysis-01, January 2011.

This paper is a working document of the IETF's BEHAVE Working Group. The document notes that because of specific problems, NAT-PT was deprecated by the IETF as a mechanism to perform IPv6-IPv4 translation. Since then, new efforts have been undertaken within IETF to standardize alternative mechanisms to perform IPv6-IPv4 translation. This document evaluates how the new translation mechanisms avoid the problems that caused the IETF to deprecate NAT-PT.

[4] Fred Baker, Xing Li, and Kevin Yin, "Framework for IPv4/IPv6 Translation," Internet Draft, Work in Progress, August 2010.

It is common in the IETF these days to generate a "framework" document as part of the process of developing technical specifications. This draft is a framework document for the general IPv4/IPv6 translation technology. This Internet Draft, draft-ietf-behave-v6v4-framework-10.txt, will soon be published as an Informational RFC.

[5] Elwyn Davies, Suresh Krishnan, and Pekka Savola, "IPv6 Transition/Coexistence Security Considerations," RFC 4942, September 2007.

The transition into a dual-stack environment, while attempting to preserve the integrity of a single service regime, presents numerous security concerns. This document is a good overview of such concerns.

[6] Dan Wing and Andrew Yourtchenko, "Improving User Experience with IPv6 and SCTP," The Internet Protocol Journal, Volume 13, No. 3, September 2010.

Building efficient applications in a dual-stack world can be very challenging. It is often the case that poor management of a dual-stack system can make the user experience far slower than just continuing in the IPv4 world. One way to redress this problem is to change sequential testing of IPv6 and IPv4 connectivity into a parallel operation, testing both protocols at once. This article explains the concept.

GEOFF HUSTON, B.Sc., M.Sc., is the Chief Scientist at APNIC, the Regional Internet Registry serving the Asia Pacific region. He has been closely involved with the development of the Internet for many years, particularly within Australia, where he was responsible for the initial build of the Internet within the Australian academic and research sector. He is author of numerous Internet-related books, and was a member of the Internet Architecture Board from 1999 until 2005; he served on the Board of Trustees of the Internet Society from 1992 until 2001. E-mail: gih@apnic.net