Stateful Network Address Translation is a Cisco IOS® Software feature allowing two or more network address translators to function as a translation group. A backup NAT provides translation services if the active translator fails. The result is a more resilient IP network.
The goal is to create a more globally resilient IP network. Networked applications are placing increased demands on the core IP network. Users expect continuous access to servers and data, regardless of location. Although the mean time between failure (MTBF) of hardware components has increased, failures can and do occur. Administrative activities can also cause downtime. A resilient IP network offers continuous service, despite failures that may occur.
The concept of a highly resilient IP network is not new; however, this paper introduces an innovative approach. The intelligent systems approach creates a highly optimized, resilient IP network where individual component features interact and share services. The result is a network that is inherently more intelligent and less labor-intensive in terms of design and management. Cisco IOS Software is evolving into a more intelligent, shared function system that helps reduce support costs and increase the benefit and return on investment in network equipment and services.
NAT has been a core Cisco IOS Software feature since its introduction. It has helped to reduce address depletion and promote Internet growth. NAT has been used to permit interconnection of private networks, regardless of their use of independent addressing schemes, even when these schemes use addresses that conflict. NAT has also been used to effectively hide networks from outside the administrative domain while allowing predetermined connections to occur. NAT fulfills an important role and will likely do so even as IPv6 is deployed.
The enhancement described in this paper can make NAT even more resilient and allow application connectivity to continue, unaffected by potential failures of links and routers at the NAT border. Cisco Stateful NAT provides this enhanced capability.
In IP networking, "stateful" is defined as applying a more global context to the task of forwarding a particular datagram. There is consideration of not just where to forward the datagram, but also of the application/connection state with regard to this datagram. With this knowledge, devices can take action so that potential failures will have less impact on the flow and on the application that is transmitting data. Multiple NAT routers that share stateful context can work cooperatively and thus increase service availability.
STATEFUL NAT OVERVIEW
Stateful NAT (SNAT) allows two or more network address translators to function as a translation group. One member of the translation group handles traffic requiring translation of IP address information. Additionally, it informs the backup translator of active flows as they occur. The backup translator can then use information from the active translator to prepare duplicate translation table entries; therefore, if the active translator is hindered by a critical failure, the traffic can rapidly be switched to the backup. The traffic flow continues since the same network address translations are used, and the state of those translations has been previously defined.
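The behavior described above can be sketched as a toy model. This is a minimal illustration, not the Cisco implementation: the class names, the shared address pool, and the inside address 10.0.0.5 are hypothetical, and the global addresses are borrowed from the test network pool used later in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class Translator:
    """Toy translator holding dynamic inside-local -> inside-global mappings."""
    name: str
    table: dict = field(default_factory=dict)

    def learn(self, inside_local: str, inside_global: str) -> None:
        self.table[inside_local] = inside_global


class TranslationGroup:
    """Hypothetical two-member group: one active translator, one backup."""

    def __init__(self, active: Translator, backup: Translator, pool):
        self.active = active
        self.backup = backup
        self._pool = list(pool)  # free global addresses (assumed shared)

    def translate(self, inside_local: str) -> str:
        if inside_local not in self.active.table:
            inside_global = self._pool.pop(0)
            self.active.learn(inside_local, inside_global)
            # The active translator informs the backup of the new flow,
            # so the backup holds a duplicate table entry.
            self.backup.learn(inside_local, inside_global)
        return self.active.table[inside_local]

    def fail_over(self) -> None:
        """Swap roles, as if the active translator had failed."""
        self.active, self.backup = self.backup, self.active


group = TranslationGroup(
    Translator("A"), Translator("B"), ["172.16.201.1", "172.16.201.2"]
)
before = group.translate("10.0.0.5")
group.fail_over()
after = group.translate("10.0.0.5")  # same mapping, now served by "B"
```

Because the backup already holds the duplicate entry, the flow keeps the same translated address after the role swap, which is the property that lets the session continue.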
Only sessions that are already statically defined receive the benefit of redundancy without the need for this feature. In the absence of SNAT, sessions that use dynamic NAT mappings would be severed in the event of a critical failure and would have to be reestablished. SNAT enables the maintenance of continuous service for dynamically mapped NAT sessions. The end result is a more resilient IP network.
Cisco is releasing SNAT in phases. Phase I (Cisco IOS Release 12.2(13)) provides a subset of the intended function. Application-level gateway support is not included in Phase I, so protocols that embed IP address data within the payload of the IP packet will not be able to take advantage of the enhanced redundancy provided by SNAT.
Phase II (Cisco IOS Release 12.3(7)) provides increased application-level gateway and asymmetric routing support in SNAT.
Protocols and applications supported in Phase I are:
• Any TCP/UDP traffic that does not carry source or destination addresses in the payload
• Internet Control Message Protocol (ICMP)
• rcp, rlogin, rsh
Protocols and applications supported in Phase II are:
• Session Initiation Protocol (SIP); both TCP- and UDP-based
• Trivial File Transfer Protocol (TFTP)
Support for additional protocols may be offered in later releases.
There are additional deployment restrictions for SNAT Phase I. It functions properly only when the return traffic path traverses the primary SNAT router. In other words, asymmetric routing should be prevented. To ensure that return traffic follows a single path to the NAT router, the routing path cost must be adjusted or the Border Gateway Protocol (BGP) metric must be set appropriately. Phase II allows asymmetric routing, which removes this restriction.
Phase II includes additional support for the following:
• Support for outside NAT pools, using the configuration command ip nat outside source pool. SNAT Phase I will only permit inside NAT pools.
• Dynamic entries, which are extended out of static definitions.
• Support for ip nat inside destination.
Scalability for Stateful NAT
There is a potential problem for multiple NAT routers that share stateful context: because Phase II SNAT has no control over Hot Standby Router Protocol (HSRP), the NAT databases can fall out of sync between the NAT routers, resulting in connection losses between end applications.
Scalability for SNAT was integrated into Cisco IOS Software Release 12.4(3). When enabled in HSRP mode, the feature allows SNAT to hold back the HSRP state change until the NAT information has been completely exchanged. Cisco IOS Software Release 12.4(10) will enable scalability for SNAT in both HSRP mode and Primary/Backup mode.
Note: To ensure stability and compatibility, it is highly recommended that SNAT peer routers run the same Cisco IOS image and be configured with the same NAT configuration, including the global address pools for dynamic NAT, static NAT, and NAT timeout values.
Scalability for SNAT can disable queuing during asymmetric routing to avoid delays in the data path for the creation of new entries and traffic on special ports (application-layer gateway support).
SNAT will be supported on all platforms running Cisco IOS Software. Platforms that include hardware acceleration for NAT will benefit, since the mechanism for creating NAT table entries is compatible with the hardware acceleration implementation.
Cisco IOS Software is packaged in feature sets that support specific platforms. To get updated information regarding platform support for this feature, please visit Cisco Feature Navigator at http://www.cisco.com/go/fn/. This application dynamically updates the list of supported platforms as new platform support is added for the feature.
STATEFUL NAT PROTOCOL
SNAT using UDP to communicate NAT table updates between the primary and backup NAT routers was introduced in Cisco IOS Software Release 12.4(3) along with TCP. When UDP mode is used, SNAT sends NAT database exchange information over UDP using a proprietary acknowledgement/retransmit mechanism.
Note: SNAT using TCP as the transport mechanism is no longer accepted by Cisco IOS Software Release 12.4(10) and later. TCP configuration will be ignored and replaced by UDP communication by SNAT.
Figure 1 is a SNAT functional diagram. Once configured for SNAT, the UDP session is established between the SNAT peer routers and is used to transmit messages that communicate updates to the NAT tables and maintain session state.
Figure 1. SNAT Functional Diagram
The distributed NAT protocol will ensure that dynamic NAT entries created at the primary or active NAT are duplicated consistently on the backup or standby NAT router. This prepares the backup NAT to take over in the event of a critical failure.
The distributed NAT protocol defines a set of messages that are exchanged between NAT routers:
• Add message: Sent to the peer NAT router whenever traffic flow dictates that a dynamic entry be created locally. The action creates an entry in the recipient's database, based on information in the message. This is also discussed in the Mapping ID section.
• Delete message: Sent to the peer when a dynamic entry is deleted from the local database. The action deletes the corresponding entry in the recipient's database. In SNAT Phase II, the Delete message will be extended to include three types of delete operations:
– Forced-Delete: The recipient will delete the entry.
– Delete-Query: When an entry times out, the Active/Primary router that timed out the entry queries the other router to learn whether it has received packets for the flow more recently than the router that is running the timer on the entry. The query permits adjustment of the timer so that an entry is not deleted prematurely because of an asymmetric traffic flow.
– Delete-Response: This is sent in response to the Delete-Query. A time-to-restart value is included to adjust the timer on the entry at the Active/Primary that is handling the timers for this entry. A value of 0 in the time-to-restart field indicates that the recipient has not received packets for this flow later than the Active/Primary.
• Dump-Request message: Sent whenever the router comes up, asking for a snapshot of the NAT database from the peer NAT router.
• Dump-Reply message: Sent in response to the Dump-Request. The message includes the dynamic entries previously learned from the router that issued the Dump-Request plus the dynamic entries created locally. This is also discussed in the Mapping ID section.
• Update message: Distributes application-specific information (valid only in SNAT Phase II).
• Sync message: Informs the peer of the local SNAT ID number. After the HSRP connection is established between the SNAT peer routers, SNAT starts to send the Sync message. This informs every NAT peer router about the configured SNAT ID number at the peer.
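The Delete-Query and Delete-Response exchange in the list above can be illustrated with a small timing sketch. The message format and the exact timer arithmetic are proprietary and not given in this paper, so everything here is an assumption: the function names are hypothetical, times are in seconds, and the owner is assumed to delete the entry when it receives a time-to-restart of 0.

```python
def time_to_restart(owner_last_seen: float,
                    peer_last_seen: float,
                    idle_timeout: float) -> float:
    """Value the queried peer might place in the time-to-restart field.

    Returns 0 when the peer has not seen packets for the flow later than
    the router that runs the timer (the owner).
    """
    if peer_last_seen <= owner_last_seen:
        return 0.0
    # Moment at which the owner's timer fired and the query was sent.
    expiry = owner_last_seen + idle_timeout
    # Time left until the flow would really be idle for a full timeout.
    return (peer_last_seen + idle_timeout) - expiry


def on_entry_timeout(owner_last_seen: float,
                     peer_last_seen: float,
                     idle_timeout: float):
    """Owner-side decision after receiving the Delete-Response."""
    restart = time_to_restart(owner_last_seen, peer_last_seen, idle_timeout)
    if restart == 0:
        return ("delete", 0.0)      # assumed: entry is removed
    return ("restart", restart)     # timer is re-armed, entry survives


# Asymmetric flow: the peer saw a packet 30 s after the owner's last one.
kept = on_entry_timeout(owner_last_seen=100.0, peer_last_seen=130.0,
                        idle_timeout=60.0)
# Symmetric flow: the peer saw nothing newer, so the entry can go.
dropped = on_entry_timeout(owner_last_seen=100.0, peer_last_seen=90.0,
                           idle_timeout=60.0)
```

In the asymmetric case the owner's timer is restarted instead of the entry being removed while the peer is still forwarding packets for it.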
A consistent set of NAT entries is maintained through the exchange of the aforementioned messages. When a SNAT router fails or reloads, it will request a dump of the current NAT entries from the currently active SNAT router upon restoration, and will assume its role in the SNAT group.
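The restart sequence can be modeled in a few lines. This is a sketch under stated assumptions rather than the actual protocol: entries are keyed by a hypothetical (creating router's SNAT ID, entry ID) pair, the entry contents are placeholder strings, and the function names are invented for illustration.

```python
from typing import Dict, Tuple

EntryKey = Tuple[int, int]        # (creating router's SNAT ID, entry ID)
NatTable = Dict[EntryKey, str]    # value: placeholder description of the entry


def build_dump_reply(survivor_table: NatTable) -> NatTable:
    """Dump-Reply: entries learned earlier from the requester plus the
    entries the surviving router created locally."""
    return dict(survivor_table)


def handle_restart(survivor_table: NatTable) -> NatTable:
    """Model a router that reloads, sends Dump-Request, and applies the reply."""
    restarted: NatTable = {}                      # state lost in the reload
    reply = build_dump_reply(survivor_table)      # answer to the Dump-Request
    restarted.update(reply)
    return restarted


survivor = {
    (1, 10): "10.0.0.5 -> 172.16.201.1",   # learned earlier from router 1
    (2, 11): "10.0.0.6 -> 172.16.201.2",   # created locally on router 2
}
restored = handle_restart(survivor)
```

After the exchange the restarted router again holds the entries it originally created as well as those created by its peer while it was down.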
Clearing dynamic translations on the Standby/Backup router is not recommended; doing so causes the NAT tables to fall out of sync between SNAT peer routers. If translations must be cleared, clear them on the Active/Primary router, which propagates the updates to the Standby/Backup routers automatically.
Mapping ID
The mapping-id command is used to specify whether the local SNAT router will distribute a particular set of locally created entries to a peer SNAT router.
The logic used for distributing the entries created locally to the peer is as follows:
Each dynamically created entry inherits a mapping ID number based on the configuration setting at the point of creation. This comes from the mapping defined on the NAT rule. For example, entries created using rule ip nat inside source route-map rm-101 pool SNATPOOL1 mapping-id 10 overload will have ID 10 associated with them.
For each SNAT router, a mapping list may also be defined using the command mapping-id within the SNAT configuration as shown below:
ip nat Stateful id 1
Multiple mapping ID statements can be used to form a mapping list. The list specifies which of the entries will be forwarded to peers in that group. It provides a way to specify that entries from particular NAT rules should be forwarded.
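The distribution logic amounts to a filter on the mapping ID that each entry inherited from its NAT rule. The following sketch assumes a simplified entry record; the class, the function name, and mapping ID 20 are hypothetical, while mapping ID 10 matches the rule quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NatEntry:
    inside_local: str
    inside_global: str
    mapping_id: int   # inherited from the NAT rule that created the entry


def entries_to_distribute(entries, mapping_list):
    """Return the locally created entries that would be sent to the peer."""
    allowed = set(mapping_list)
    return [e for e in entries if e.mapping_id in allowed]


local_entries = [
    NatEntry("10.0.0.5", "172.16.201.1", mapping_id=10),
    NatEntry("10.0.0.6", "172.16.201.2", mapping_id=20),
]
# A mapping list containing only ID 10 forwards only the first entry.
forwarded = entries_to_distribute(local_entries, mapping_list=[10])
```

Adding 20 to the mapping list would cause entries from the second rule to be forwarded as well.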
Use the command show ip snat distributed verbose to get status information about the SNAT processes. Example 1 (show ip snat distributed verbose) shows a router that is configured for HSRP mode and is currently in STANDBY because the corresponding HSRP group is in the STANDBY state. The tracked interface (FastEthernet0/1) is down, so the HSRP group priority is decreased from 105 to 95, and the peer router (10.88.194.18), which has the higher HSRP group priority of 100, goes active.
Local virtual MAC address is 0000.0c07.ac00 (default)
Hello time 3 sec, hold time 10 sec
Next hello sent in 0.372 secs
Preemption enabled, delay min 20 secs
Active router is 10.88.194.18, priority 100 (expires in 9.796 sec)
Note: The TCP Control Block (TCB) value displayed here for Stateful NAT using the UDP communication mechanism is a dummy value. It keeps the output consistent with that of Stateful NAT using the TCP communication mechanism, which is no longer accepted by Cisco IOS Software Release 12.4(10) and later.
Standby router is local
Priority 95 (configured 105)
Track interface FastEthernet0/1 state Down decrement 10
IP redundancy name is "SNATHSRP" (cfgd)
cheney#show ip int brief
Interface IP-Address OK? Method Status Protocol
FastEthernet0/0 10.88.194.17 YES NVRAM up up
FastEthernet0/1 10.88.161.6 YES NVRAM up down
Loopback0 10.88.194.5 YES NVRAM up up
Virtual-Access1 unassigned YES unset up up
This illustrates how SNAT works together with HSRP to achieve improved redundancy.
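The priority arithmetic behind this output can be checked with a short sketch. It models only interface tracking and a highest-priority comparison with preemption enabled; HSRP tie-breaking and timers are deliberately left out, and the function names and the key "cheney" for the local router are illustrative.

```python
def effective_priority(configured: int, tracked) -> int:
    """Configured priority minus the decrement of every tracked interface
    that is down. `tracked` is an iterable of (is_up, decrement) pairs."""
    return configured - sum(dec for is_up, dec in tracked if not is_up)


def active_router(priorities: dict) -> str:
    """Router with the highest effective priority (ties not modeled)."""
    return max(priorities, key=priorities.get)


# Values taken from the output above: configured 105, FastEthernet0/1 down
# with decrement 10, peer 10.88.194.18 at priority 100.
local = effective_priority(105, [(False, 10)])
winner = active_router({"cheney": local, "10.88.194.18": 100})
```

With the tracked interface up again, the local priority returns to 105 and, because preemption is enabled, the local router would be expected to reclaim the active role after the configured delay.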
The current NAT entries can be displayed using the command show ip nat translations. Additional information is shown when the verbose option is included.
NAT entries have been extended to include information about which of the SNAT routers created them, and which router is responsible for the state and timing of that particular entry. The combination of the entry id-number and the SNAT router id-number makes each entry unique within the group.
In Example 2 (show ip nat translations), SNAT router "cheney" has two entries numbered 1173 and 1174 that have "left" values counting down from 00:00:35. These entries are timing out. The active SNAT router is responsible for timing out the entries. All three entries are duplicated on a standby SNAT router (capefear1) and are flagged "created-by-remote". This indicates that this router is a backup for these entries, as they are not being timed locally.
Example 2 show ip nat translations
cheney#show ip nat translations
Pro Inside global Inside local Outside local Outside global
Configuration for SNAT is the same as regular NAT, with some simple additional commands. The first step in defining SNAT is to determine the method of redundancy. SNAT can be configured to work with HSRP by using the IP Redundancy API built into Cisco IOS Software. When HSRP mode is set, the primary and backup NAT routers are elected according to the HSRP standby state. Alternatively, SNAT can be manually defined as primary or backup.
Stateful NAT Interaction with HSRP
The Active and Standby routers are determined from the IP Redundancy API and do not need to be explicitly defined. An example of configuration using IP Redundancy mode is depicted in Example 3. Merely coding redundancy SNATHSRP causes SNAT to make use of the IP Redundancy API. The name is the same as that used in the command standby name SNATHSRP.
Example 3 HSRP Example
ip nat Stateful id 1
ip nat Stateful id 2
The two routers, CHENEY and CAPEFEAR1, form a NAT group. They are designated members of the group by coding the command:
ip nat stateful id <id-number>
Note: id-number is a unique number given to each router in the stateful translation group. Each SNAT router should have a unique ID number.
Establish HSRP as the method of redundancy by coding the command:
Note: SNAT can only listen to one HSRP group. If it is necessary to listen to multiple groups, you need to tie multiple HSRP groups to one group.
Disable queuing during asymmetric routing in HSRP mode by coding the command:
Note: For most network topologies, asymmetric routing can be handled by proper routing configuration. If the routing configuration already handles asymmetric routing, disabling the asymmetric process is recommended to improve CPU performance. The asymmetric process is enabled by default.
The Mapping ID section offers more information on how the mapping-id command is used.
NAT must be configured identically on SNAT peer routers. If the NAT configuration comprises both dynamic NAT and static NAT, the translated IP addresses in the global address pools for dynamic NAT should not overlap with the translated IP addresses for static NAT. The designated address for the Hot Standby group must also not overlap with the translated IP addresses for either dynamic NAT or static NAT.
Note: For SNAT configuration, the router's interface addresses cannot be used as the translated IP addresses for either dynamic NAT or static NAT.
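These address rules lend themselves to a simple pre-deployment check. The sketch below is a hypothetical helper, not a Cisco tool; it uses Python's standard ipaddress module, and the sample addresses are borrowed from the test network described later in this paper.

```python
import ipaddress


def pool_addresses(first: str, last: str) -> set:
    """Expand an inclusive first-last range into a set of IPv4 addresses."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    return {ipaddress.IPv4Address(n) for n in range(start, end + 1)}


def find_conflicts(pool_range, static_globals, hsrp_virtual_ips, interface_ips):
    """Return addresses that break the overlap rules, as sorted strings."""
    dynamic = pool_addresses(*pool_range)
    static = {ipaddress.IPv4Address(a) for a in static_globals}
    reserved = {ipaddress.IPv4Address(a)
                for a in (*hsrp_virtual_ips, *interface_ips)}
    # Rule 1: dynamic pool and static translations must not overlap.
    # Rule 2: neither may use an HSRP virtual address or interface address.
    conflicts = (dynamic & static) | ((dynamic | static) & reserved)
    return [str(a) for a in sorted(conflicts)]


clean = find_conflicts(
    pool_range=("172.16.201.1", "172.16.201.10"),
    static_globals=["192.168.241.12"],
    hsrp_virtual_ips=["192.168.240.3"],
    interface_ips=["192.168.240.1"],
)
# A static translation placed inside the dynamic pool is reported.
bad = find_conflicts(
    pool_range=("172.16.201.1", "172.16.201.10"),
    static_globals=["172.16.201.5"],
    hsrp_virtual_ips=["192.168.240.3"],
    interface_ips=["192.168.240.1"],
)
```

Running the same check against both peers' configurations is one way to confirm that they are identical and conflict-free before enabling SNAT.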
Primary/Backup mode allows explicit configuration of the primary SNAT router and the backup SNAT router. Each router is defined explicitly, and the IP address of the peer router is specified (Example 4).
Example 4 Primary/Backup Example
ip nat Stateful id 1
ip nat Stateful id 2
The primary command identifies an interface and IP address that the primary SNAT will use as the source for communicating with the backup SNAT router (for sending SNAT protocol messages). Likewise, the backup command does the same for the backup SNAT router. The peer command defines the destination IP address to use for communicating with the peer.
The status of the SNAT configuration can be examined by using the command: show ip snat distributed verbose
Example 5 shows SNAT routers that each have an entry with the correct peer router ID. cheney, the active SNAT/HSRP router, has peer address 10.88.194.18 and peer NAT ID 2, which match the local address 10.88.194.18 and local NAT ID 2 of its standby peer router, capefear1. Communication is established between the SNAT/HSRP routers.
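The consistency rule can be expressed as a symmetric comparison of the values each router reports. The record below is a hypothetical simplification, not a parser for the command output. The peer address and peer NAT ID come from the text above; cheney's local address 10.88.194.17 and local NAT ID 1 are assumptions inferred from the interface listing and the HSRP example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnatStatus:
    local_address: str
    local_nat_id: int
    peer_address: str
    peer_nat_id: int


def peers_agree(a: SnatStatus, b: SnatStatus) -> bool:
    """True when each router's peer values equal the other's local values."""
    return (
        a.peer_address == b.local_address
        and a.peer_nat_id == b.local_nat_id
        and b.peer_address == a.local_address
        and b.peer_nat_id == a.local_nat_id
    )


cheney = SnatStatus("10.88.194.17", 1, "10.88.194.18", 2)
capefear1 = SnatStatus("10.88.194.18", 2, "10.88.194.17", 1)
matched = peers_agree(cheney, capefear1)

# A mistyped NAT ID on one side breaks the pairing.
misconfigured = SnatStatus("10.88.194.18", 3, "10.88.194.17", 1)
mismatched = peers_agree(cheney, misconfigured)
```

A mismatch on either side is a sign that the SNAT IDs or peer addresses were configured inconsistently.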
The network example shown in Figure 2 is from a customer test case for deploying SNAT. It was configured within a test lab. In this case, a shared test FTP server on the Internet provides services to the test client inside the customer's private network. The customer deploys SNAT to ensure the client can receive the Internet service 24x7.
The FTP server is statically NATed to 192.168.241.12 to hide its private subnet, 126.96.36.199, and provide service to clients on the Internet. The FTP clients in private subnet 188.8.131.52 are dynamically PATed to a pool of addresses ranging from 172.16.201.1 to 172.16.201.10 to gain Internet access.
The network diagram in Figure 2 shows two SNAT routers, DNS1-INS and DNS2-INS:
• DNS1-INS is the active router and tracks the FastEthernet0 and FastEthernet1 interface states. When DNS1-INS is the active router, the traffic from the hosts (test client PC) to the test FTP server is routed through DNS1-INS.
• DNS2-INS is the standby router and tracks the FastEthernet0/0 and FastEthernet0/1 interface states.
Two aggregate routers, OUT182 and IN12, are connected to the SNAT routers and provide LAN connectivity.
Figure 2. SNAT Test Network
The configuration examples include commands required for SNAT and HSRP. These are shown in Examples 6-9.
Example 6 DNS1-INS
ip address 192.168.240.1 255.255.255.0
ip nat inside
! --- Configure the delay period before the initialization of HSRP groups
! --- after the router has reloaded. The feature helps prevent HSRP state flapping
standby delay minimum 60 reload 60
! --- Activate HSRP on the interface
! --- Assign a standby group (1 in this case) and the designated IP address for
! --- the Hot Standby group (192.168.240.3 in this case)
standby 1 ip 192.168.240.3
! --- Assign a priority (105 in this case) to the router interface FE0
! --- for a particular group number (1)
standby 1 priority 105
! --- The HSRP preempt feature enables a router with highest priority to immediately
! --- become the active router at any time.
! --- The preempt delay feature allows preemption to be delayed for a configurable
! --- time period, allowing the router to populate its routing table before becoming
! --- the active router
! --- minimum causes the router to postpone taking over the active role for
! --- a minimum of seconds since the router was last restarted
! --- reload specifies the preemption delay after a reload only
! --- sync specifies the maximum number of seconds to allow IP redundancy clients
Active router is 172.16.200.1, priority 105 (expires in 8.193 sec)
Standby router is local
Priority 95 (configured 95)
Track interface FastEthernet0/0 state Up decrement 20
IP redundancy name is "HSRP_OUT" (cfgd)
Use show ip snat distributed verbose to display SNAT status (Examples 12 and 13). The peer address and peer NAT ID on DNS1-ING SNAT router should match the local address and local NAT ID on DNS2-ING SNAT router. The converse is also true.
A test client PC attached to the LAN in subnet 184.108.40.206 will FTP to the shared test FTP server. Use show ip nat translation to show the content of the translation table (Example 14). The table shows the test client (220.127.116.11) establishing TCP sessions with the test FTP server (18.104.22.168). The active SNAT router, DNS1-ING, translates 22.214.171.124 to 172.16.201.1 dynamically, and translates 126.96.36.199 to 192.168.241.12 statically.
Example 14 DNS1-ING:translation table
DNS1-ING#show ip nat translations
Pro Inside global Inside local Outside local Outside global
The same translation table should be found in the Standby SNAT router, DNS2-ING.
If it is necessary to clear peer SNAT translations from the translation table at the Standby/Backup router, use the clear ip snat translation peer ip-address-active-router refresh command in EXEC mode. The keyword refresh provides a fresh dump of the NAT table from the Active/Primary router to ensure NAT tables on SNAT peer routers are in sync.
Time-sensitive applications, such as client-server-based applications, might experience timeouts and dropped traffic. Tuning the HSRP timers is recommended. In the Cisco lab, FTP sessions timed out with the standby delay reload 60 and standby preempt delay minimum 60 reload 60 sync 60 configuration. To improve network convergence, the HSRP timers were changed to lower values as shown in Examples 15 and 16.
Lab tests show the delay caused by failover is less than 30 seconds.
NAT for MPLS VPNs
It may be more appropriate for the service provider to handle the NAT function; however, NAT can be deployed on the enterprise edge. An enterprise customer that purchases application services or outsources some portion of the processing workload would likely want to take advantage of NAT services, if possible.
Cisco NAT for MPLS VPNs extends NAT so that service providers can establish the translation function within an MPLS network. This is the subject of a separate paper.
Cisco continues to enhance core features to provide increased benefit in terms of productivity gained from deployment of a more resilient IP network. SNAT can provide higher availability to applications that use NAT services. Cisco will continue to develop more robust and automated features relative to NAT, which will lower administrative costs and increase the return on investment in network technology.