Cisco CNS Network Registrar User's Guide, 5.5
Configuring DHCP Failover

Table Of Contents

Configuring DHCP Failover

Failover Configuration Procedure

Configuring Failover Based on Your Scenario

Configuring the Basic Scenario

Configuring the Back Office Scenario

Configuring the Symmetrical Scenario

Avoiding Configuration Mistakes

Maintaining Failover Servers

Monitoring Failover Server Status

Changing General DHCP Configurations

Using the DHCP Failover Configuration Tool

Copying Main Server's Configuration

Loading Batch File on the Backup Server

Confirming Configurations

Handling State Transitions

Moving a Server into PARTNER-DOWN State

State Transitions During Integration

Allocating Addresses Among Servers

Importing Backup Server Leases to the Main Server

Exporting Leases

De-activating Leases in Failover

Changing Failover Server Roles

Making a Nonfailover Server a Failover Main

Replacing a Server Having Defective Storage

Removing a Backup Server and Halting Failover Operation

Adding a Main Server to an Existing Backup Server

Handling Special Cases

Configuring Failover on Multiple Interface Hosts

Maximum Client Lead Time and Lease Period Factor

Changing System Defaults

Supporting BOOTP Clients

Static BOOTP

Dynamic BOOTP

Configuring BOOTP Relays

DHCPLEASEQUERY and Failover

Troubleshooting Failover

Monitoring Failover Operations

Detecting and Handling Network Failures

Setting DHCP Request and Response Packet Buffers Correctly


Configuring DHCP Failover


You can use DHCP failover to configure two DHCP servers to operate as a redundant pair. If one server is down, the other server seamlessly takes over so that new DHCP clients can get addresses and existing ones can renew them. Clients requesting new leases need not know or care about which server is responding to their request for a lease. These clients can obtain leases even if the main server is not operational.

Table 11-1 lists the topics in this chapter and their associated sections.

Table 11-1 DHCP Failover Configuration Topics 

If you want to... | Go to the...
Know more about how DHCP failover works | "DHCP Failover" section
Have an overview of the failover configuration steps | "Failover Configuration Procedure" section
Configure failover based on the three main server scenarios or deployments | "Configuring Failover Based on Your Scenario" section
Use the failover configuration tool to ensure synchronized partners | "Using the DHCP Failover Configuration Tool" section
Determine and handle changes in failover states | "Handling State Transitions" section
Allocate addresses and leases among the failover partners | "Allocating Addresses Among Servers" section
Change the roles of servers | "Changing Failover Server Roles" section
Handle special configuration cases, such as multiple interface server hosts, changing system defaults and the maximum lead time, and supporting BOOTP clients and Relays. | "Handling Special Cases" section
Troubleshoot failover | "Troubleshooting Failover" section



Tip When upgrading from Network Registrar 5.0 or 3.5 to 5.5, see the appropriate Network Registrar Installation Guide chapter for the processing time window of failover and lease binding updates in the DHCP lease state database.


Failover Configuration Procedure

Use the following procedure to configure DHCP failover.


Step 1 Choose your backup configuration scenario—See the "Configuring Failover Based on Your Scenario" section:

Basic—A main server and one backup server, with all scopes backed up to the same server

Back office—Multiple main servers and one backup server, with scopes split between them

Symmetrical—Multiple servers that act as mains and backups for each other based on the scopes

Step 2 Configure your DHCP server and duplicate the configuration on its partner, based on the configuration scenario you selected:

a. Duplicate the configurations for scopes, policies, DHCP options, and addresses on the partner server. Do this manually, or use the failover configuration tool. See the "Using the DHCP Failover Configuration Tool" section.

b. If you enable dynamic DNS update on the main server, ensure that you also enable it on the partner. For help on doing this, see "Configuring Dynamic DNS Update."

c. If you use reservations, ensure that they are identical on each server. For help on doing this, see the "Reserving a Lease" section.

d. If you use client-class, configure both servers with identical client-classes. For help on doing this, see the "Defining Client-Classes and Setting Their Properties" section.

e. Give the scopes the same set of scope selection tags. For help on doing this, see the "Setting Client-Class Scope Selection Criteria" section.

f. Enter the clients in both clusters, or enter them in LDAP and direct both servers to the same LDAP server. For help on doing this, see "Configuring LDAP."

Step 3 Reload both servers.

Step 4 If you use BOOTP relay (IP helpers), configure all BOOTP relay servers to point to both servers—See the "Changing System Defaults" section.


Configuring Failover Based on Your Scenario

In all the following examples, the main and backup servers were already configured identically from a scope, policy, client, and client-class standpoint, and the server-wide default capabilities are used. These examples illustrate only the failover-specific configuration commands.

If you plan to configure failover on a server with multiple interfaces, see the "Configuring Failover on Multiple Interface Hosts" section.


Timesaver You can also use the failover configuration tool for all these scenarios. For details, see the "Using the DHCP Failover Configuration Tool" section.


Configuring the Basic Scenario

The basic failover scenario involves a main server and a single backup server (Figure 11-1).

Figure 11-1 Simple Failover Configuration

Using a CLI Command File

To set up a basic failover configuration, create and run the same command file on both servers:

dhcp enable failover 
dhcp set failover-main-server=dhcpA.example.com. failover-backup-server=dhcpB.example.com. 
dhcp reload 


Note The server that you reload last might return an error that failover is not available. You can safely ignore this message on startup.


Configuring the Back Office Scenario

The back office failover scenario involves two (or more) main servers and a single backup server (Figure 11-2). The main servers are Aserver and Bserver and the backup server is Cserver.

The main servers have three scopes each: scope1, scope2, and scope3 on one; and scope4, scope5, and scope6 on the other. This scenario is appropriate for scopes on the same LAN segment, which requires the same main and backup server combination. The two sets of scopes are on different LAN segments.

Figure 11-2 Back Office Failover Configuration

Using a CLI Command File

To set up a back office failover configuration, create and run the following command file on Cserver. Run configuration files on Aserver and Bserver with only their appropriate scope data.

scope scope1 set failover=scope-enabled failover-main-server=Aserver.example.com. 
scope scope1 set failover-backup-server=Cserver.example.com. 
scope scope2 set failover=scope-enabled failover-main-server=Aserver.example.com. 
scope scope2 set failover-backup-server=Cserver.example.com. 
scope scope3 set failover=scope-enabled failover-main-server=Aserver.example.com. 
scope scope3 set failover-backup-server=Cserver.example.com. 
scope scope4 set failover=scope-enabled failover-main-server=Bserver.example.com. 
scope scope4 set failover-backup-server=Cserver.example.com. 
scope scope5 set failover=scope-enabled failover-main-server=Bserver.example.com. 
scope scope5 set failover-backup-server=Cserver.example.com. 
scope scope6 set failover=scope-enabled failover-main-server=Bserver.example.com. 
scope scope6 set failover-backup-server=Cserver.example.com. 
dhcp reload 

Configuring the Symmetrical Scenario

The symmetrical failover scenario involves multiple (in this case, two) servers that share network responsibilities by acting as backups for each other based on certain scopes (Figure 11-3). Aserver is the main for scopes 1 through 3 and the backup for scopes 4 through 6. Bserver plays the reverse role.

Figure 11-3 Symmetrical Failover Configuration

Using a CLI Command File

To set up a symmetrical failover configuration, create and run the same command file on both servers.

scope scope1 set failover=scope-enabled failover-main-server=Aserver.example.com. 
scope scope1 set failover-backup-server=Bserver.example.com. 
scope scope2 set failover=scope-enabled failover-main-server=Aserver.example.com. 
scope scope2 set failover-backup-server=Bserver.example.com. 
scope scope3 set failover=scope-enabled failover-main-server=Aserver.example.com. 
scope scope3 set failover-backup-server=Bserver.example.com. 
scope scope4 set failover=scope-enabled failover-main-server=Bserver.example.com. 
scope scope4 set failover-backup-server=Aserver.example.com. 
scope scope5 set failover=scope-enabled failover-main-server=Bserver.example.com. 
scope scope5 set failover-backup-server=Aserver.example.com. 
scope scope6 set failover=scope-enabled failover-main-server=Bserver.example.com. 
scope scope6 set failover-backup-server=Aserver.example.com. 
dhcp reload 
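
Because every scope needs the same pair of commands, a command file like the ones above can be generated rather than typed by hand. The following Python sketch is one possible way to do this. The SCOPE_ROLES mapping and the failover_commands function are hypothetical names used only for illustration, and the generated lines use only the command syntax shown in the examples above.

```python
# Hypothetical helper: builds the per-scope failover lines shown in the
# symmetrical example. Each scope maps to its (main, backup) server pair.
SCOPE_ROLES = {
    "scope1": ("Aserver.example.com.", "Bserver.example.com."),
    "scope2": ("Aserver.example.com.", "Bserver.example.com."),
    "scope3": ("Aserver.example.com.", "Bserver.example.com."),
    "scope4": ("Bserver.example.com.", "Aserver.example.com."),
    "scope5": ("Bserver.example.com.", "Aserver.example.com."),
    "scope6": ("Bserver.example.com.", "Aserver.example.com."),
}


def failover_commands(scope_roles):
    """Return the command-file lines for a scope-to-server mapping."""
    lines = []
    for scope, (main, backup) in scope_roles.items():
        lines.append(
            f"scope {scope} set failover=scope-enabled "
            f"failover-main-server={main}"
        )
        lines.append(f"scope {scope} set failover-backup-server={backup}")
    # The examples end with a single reload after all scopes are set.
    lines.append("dhcp reload")
    return lines


if __name__ == "__main__":
    print("\n".join(failover_commands(SCOPE_ROLES)))
```

Changing the server pairs in the mapping to use Cserver as the backup for every scope would produce the back office command file instead.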

Avoiding Configuration Mistakes

There are several ways to make DHCP failover mistakes:

Enable failover on one server and not on the other.

Configure main and backup servers differently, such as omitting some scopes on the backup.

Fail to reconfigure all of your BOOTP relays to send DHCP packets to both mains and backups.

For the first two mistakes, Network Registrar detects and logs the configuration mistakes during processing, although it may detect a mistake some time after the actual misconfiguration occurred.

Network Registrar cannot detect the third mistake. You can only detect BOOTP configuration errors by performing live tests, akin to fire drills, in which you periodically take the main server out of service to verify that the backup server is available to DHCP clients.

Maintaining Failover Servers

After you configure failover, verify that failover is running correctly by:

Selecting the DHCP server icon in the Server Manager window of the GUI, then choosing Show related servers from the right-click popup menu or from the Servers menu.

Looking at the log files and running a Related Servers report. See the "Displaying Related DHCP Servers" section.

When you look at the server log files, remember that a server can play only two possible roles: main or backup to another server. There is a failover object in the DHCP server for each of these roles, and the object name directly reflects its role:

In the simple configuration (Figure 11-1), server A has one failover object as main for Bserver and server B has one failover object as backup for Aserver.

In the back office configuration (see Figure 11-2), each main server has one failover object as main for Cserver, while server C has two failover objects as backup for both Aserver and Bserver.

When the DHCP server configures itself, it logs every network that is part of each failover object. It also reports the configuration parameters—failover-maximum-client-lead-time, failover-backup-percentage, failover-use-safe-period state, and so on.

Monitoring Failover Server Status

Run the dhcp getRelatedServers command to create a report about the connection status of the main and backup servers. The command displays the following information (in Table 11-2).

Table 11-2 getRelatedServers Report 

This column... | Shows...
Type | Main, Backup, DNS, or LDAP.
Name | DNS hostname.
Address | IP address in dotted octet format.
Requests | Number of outstanding requests, or the failover recovery or DNS update status. In the failover RECOVER state, the column shows the Percent of Failover Recovery yet-to-complete value, starting with 100 at the beginning of the recovery and decreasing to zero, when the partners are again in synch. If the server is in the failover NORMAL state, you can use the dhcp set log-settings=failover-detail command to enable showing the Percent of Failover Bind-Update yet-to-complete value (the percent of configured leases not yet scanned) in this column.
Communications | OK or INTERRUPTED.
Localhost State | Failover state of this server, or two dashes (--) if not applicable.
Partner State | Failover state of the associated failover server, or two dashes (--) if not applicable.


Changing General DHCP Configurations

If you change any of these main DHCP server configurations, you must duplicate them to the backup:

Scopes

Policies

Clients

IP addresses

Reservations

Client-classes

Dynamic DNS updates

Dynamic BOOTP

Namespaces


Tip Consider maintaining your entire DHCP server configuration in CLI command files and always make any changes to those files.


Using the DHCP Failover Configuration Tool

The DHCP failover configuration tool ensures that you can duplicate configurations without having to manually re-enter the data on the backup, saving time and preventing errors. The types of configuration options currently supported by the failover configuration tool are:

Policy properties and DHCP options, including vendor-specific options

Scope properties and ranges

DHCP server properties

Reservations

Clients

Client-classes

Scope selection tags

Extensions

The failover configuration tool process consists of the following steps:

1. Copy the main server's configuration in a batch file, using the cnrFailoverConfig -clone command.

2. Load the CLI batch file on the backup server.

3. Confirm that both servers are identical, using the cnrFailoverConfig -compare command.

Copying Main Server's Configuration

The following procedure clones the main server's configuration.


Step 1 Configure the main server, including saving the configuration and reloading the server.

Step 2 Create a batch configuration file on the main server, at the system command line.

% cnrFailoverConfig -clone -mc cluster -mu user -mp password -o config_file 

The generated configuration file can contain lines such as the following:

dhcp enable client-class 
dhcp disable collect-performance-statistics 
dhcp enable defer-lease-extensions 
dhcp enable discover-interfaces 
dhcp set dns-timeout=5000 


Note If cloning and comparing clients and client-classes, you must set the client-class attribute to enabled, as in the example.



Loading Batch File on the Backup Server

At the system command line of the backup server, access the batch file that you created in the "Copying Main Server's Configuration" section. You may need to enter your username and password. You immediately get the file output, such as:

% nrcmd -b < config_file 
username: admin
password:
100 Ok
session:
cluster = localhost
default-format = user
user-name = admin
visibility = 5
nrcmd>
Set the configuration to match DHCP server localhost
...

If you are satisfied with the output, save the configuration.

% nrcmd 
nrcmd> save 

Confirming Configurations

Perform these steps to compare the configurations of the main and backup servers.


Step 1 Invoke the cnrFailoverConfig -compare command from the system command line. You can be on either the main or backup server.

% cnrFailoverConfig -compare -mc main_cluster -mu main_user -mp main_password 
      -bc backup_cluster -bu backup_user -bp backup_password -verbose -o compare_file 


Note If you perform an additional reverse comparison between the main and the backup server, the results may not be the same, especially if you have an asymmetrical, back office type failover configuration. See the "Configuring Failover Based on Your Scenario" section.


Step 2 Look at the output file for discrepancies. Because configuration differences between the two servers are marked with the word difference, you can search for that word using a text editor. If you omit the -verbose switch, only the differences appear in the output file.

If there are differences, go back to the "Copying Main Server's Configuration" section and correct the problem.

Step 3 If you are satisfied with the output, reload the backup DHCP server.

nrcmd> dhcp reload 

Step 4 Wait a few minutes, then use the dhcp getRelatedServers command to verify that the two servers are synchronized. See the "Monitoring Failover Server Status" section.

nrcmd> dhcp getRelatedServers 
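
The search described in Step 2 can also be scripted instead of done in a text editor. The following Python sketch lists every line of the comparison output that contains the word difference. The find_differences function is a hypothetical name, the match is made case-insensitive as a precaution, and the exact layout of the output file is not assumed beyond what Step 2 states.

```python
import sys


def find_differences(lines):
    """Return (line_number, text) pairs for lines containing 'difference'."""
    return [
        (number, text.rstrip("\n"))
        for number, text in enumerate(lines, start=1)
        if "difference" in text.lower()
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of the file named with the -o option, such as compare_file.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for number, text in find_differences(handle):
            print(f"{number}: {text}")
```

An empty result suggests that the comparison reported no discrepancies, but it is still worth reading the output file once to confirm that the comparison ran to completion.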


Handling State Transitions

During normal operation, the failover partners transition between states. They stay in their current state until all the actions for the state transition are completed and, if communication fails, until the conditions for the next state are fulfilled. For a review of the failover states, see the "Failover States and Transitions" section.

Moving a Server into PARTNER-DOWN State

One or both failover partners could potentially move into COMMUNICATIONS-INTERRUPTED state. Fortunately, they cannot issue duplicate addresses while in this state. However, having a server in this state over longer periods is not a good idea, because there are restrictions on what a server can do. The main server cannot re-allocate expired leases and the backup server can run out of addresses from its pool. COMMUNICATIONS-INTERRUPTED state was designed for servers to easily survive transient communication failures of a few minutes to a few days. A server might function effectively in this state for only a short time, depending on the client arrival and departure rate. After that, it would be better to move a server into PARTNER-DOWN state so it can completely take over the lease functions until the servers resynchronize.

There are two ways a server can move into PARTNER-DOWN state:

User action—An administrator sets a server into PARTNER-DOWN state based on an accurate assessment of reality. The failover protocol handles this correctly.

The failover safe period expires—When the servers run unattended for longer periods, they need an automatic way to enter PARTNER-DOWN state.

Network operators might not sense in time that a server is down or uncommunicative. The failover safe period addresses this by giving network operators some time to react to a server moving into COMMUNICATIONS-INTERRUPTED state. During the safe period, the only requirement is that the operators determine that both servers are still running and, if so, fix the network communications failure or take one of the servers down before the safe period expires.

During this safe period, either server allows renewals from any existing client, but there is a major risk of possibly issuing duplicate addresses. This is because one server can suddenly enter PARTNER-DOWN state while the other is still operating. Because of this risk, the failover safe period is disabled by default. That is why it is best to enable the safe period only if, during a server failure, it is more important to get an address than risk receiving a duplicate one.

The length of the safe period is installation-specific, and depends on the number of unallocated addresses in the pool and the expected arrival rate of previously unknown clients requiring addresses. The safe period is typically 24 hours, although many environments can support periods of several days.

The number of extra addresses required for the safe period should be the same as the expected total of new clients a server encounters. This depends on the arrival rate of new clients, not the total outstanding leases. Even if you can only afford a short safe period, because of a dearth of addresses or a high arrival rate of new clients, you can benefit substantially by allowing DHCP to ride through minor problems that are fixable in an hour. There is minimal chance of duplicate address allocation, and re-integration after the failure is resolved is automatic and requires no operator intervention.
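
The relationship between arrival rate, safe period, and spare addresses is simple arithmetic. The following Python sketch works through it under the assumption of a constant arrival rate. The function names and the sample figures are hypothetical and are not taken from any Network Registrar configuration.

```python
import math


def addresses_for_safe_period(new_clients_per_hour, safe_period_hours):
    """Extra addresses needed to serve new clients for the whole safe period."""
    return math.ceil(new_clients_per_hour * safe_period_hours)


def longest_safe_period_hours(spare_addresses, new_clients_per_hour):
    """Longest safe period that the spare addresses can cover at this rate."""
    if new_clients_per_hour <= 0:
        raise ValueError("arrival rate must be positive")
    return spare_addresses / new_clients_per_hour


if __name__ == "__main__":
    # Illustrative figures only: 5 new clients per hour over a 24-hour period.
    print(addresses_for_safe_period(5, 24))
    # With only 60 spare addresses, the same rate supports a shorter period.
    print(longest_safe_period_hours(60, 5))
```

With these sample figures, a 24-hour safe period needs 120 spare addresses, while 60 spare addresses cover 12 hours.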

Follow these guidelines to decide whether to use manual intervention or the safe period for the transition to PARTNER-DOWN state:

If your corporate policy is to have minimal manual intervention, set the safe period. Use the dhcp enable failover-use-safe-period command to enable the safe period and use the dhcp set failover-safe-period command to set the duration (86400 seconds, or 24 hours, by default).

nrcmd> dhcp enable failover-use-safe-period 
nrcmd> dhcp set failover-safe-period=24h 
nrcmd> dhcp reload 

If your corporate policy is to avoid conflict under any circumstances, then never let the backup server go into PARTNER-DOWN state unless by explicit command. Allocate sufficient addresses to the backup server so that it can handle new client arrivals during periods when there is no administrative coverage. Use the dhcp setPartnerDown command, specifying the name of the partner server. This moves all the scopes running failover with the partner into PARTNER-DOWN state immediately, unless you specify a date and time with the command. This date and time should be when the partner was last known to be operational, specified as a unit of time in the past (-s, -m, -h, -d, or -w) or in month-day-hour:minute:second-year syntax.

nrcmd> server dhcp setPartnerDown dhcp2.example.com. -3d 
nrcmd> server dhcp setPartnerDown dhcp2.example.com. Oct 31 00:00:00 2001 
nrcmd> dhcp reload 
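
When you know roughly how long ago the partner was last operational, it can help to work out the matching absolute date before typing it. The following Python sketch formats a point in the past in the same layout as the date in the example above. The partner_down_timestamp function is a hypothetical name, and whether single-digit days need padding is not stated here, so the sketch leaves them unpadded as an assumption.

```python
from datetime import datetime, timedelta

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def partner_down_timestamp(now, days_ago):
    """Format the moment `days_ago` days before `now`, e.g. 'Oct 31 00:00:00 2001'."""
    last_seen = now - timedelta(days=days_ago)
    # Month names are taken from a fixed tuple so the result does not
    # depend on the locale of the machine running the script.
    return (f"{MONTHS[last_seen.month - 1]} {last_seen.day} "
            f"{last_seen:%H:%M:%S} {last_seen.year}")


if __name__ == "__main__":
    # Three days before midnight on November 3, 2001.
    print(partner_down_timestamp(datetime(2001, 11, 3), 3))
```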

State Transitions During Integration

Table 11-3 describes what happens when servers enter various states and how they initially integrate and later re-integrate with each other under certain conditions.

Table 11-3 Failover State Transitions and Integration Processes 

Integrating...
Involves the following...

Into NORMAL state

1. The newly configured backup server contacts the main server, which starts in PARTNER-DOWN state.

2. Since the backup server is a new partner, it goes into RECOVER state and sends a Binding Request message to the main server.

3. The main server replies with Binding Update messages that include the leases in its lease state database.

4. After the backup server acknowledges these messages, the main server responds with a Binding Complete message.

5. The backup server goes into RECOVER-DONE state.

6. Both servers go into NORMAL state.

7. In normal state, each server scans for out-of-date leases in the failover relationship.

8. The backup server sends Pool Request messages.

9. The main server responds with the leases to allocate to the backup server based on the failover-backup-percentage configured.

After COMMUNICATIONS-INTERRUPTED state

1. When a server comes back up and connects with a partner in this state, the returning server moves into the same state and then immediately into NORMAL state.

2. The partner also moves into NORMAL state.

After PARTNER-DOWN state

When a server comes back up and connects with a partner in this state, the server compares the time it went down with the time the partner went into this state.

If the server finds that it went down and the partner subsequently went into this state:

a. The returning server moves into RECOVER state and sends an Update Request message to the partner.

b. The partner returns all the binding data it was unable to send earlier and follows up with an Update Done message.

c. The returning server moves into RECOVER-DONE state.

d. Both servers move into NORMAL state.

If the returning server finds that it was still operating when the partner went into PARTNER-DOWN state:

a. The server goes into POTENTIAL-CONFLICT state, which also causes the partner to go into this state.

b. The main server sends an update request to the backup server.

c. The backup server responds with all unacknowledged updates to the main server and finishes off with an Update Done message.

d. The main server moves into NORMAL state.

e. The backup server sends the main server an Update Request message requesting all unacknowledged updates.

f. The main server sends these updates and finishes off with an Update Done message.

g. The backup server goes into NORMAL state.

After the server loses its lease state database

A returning server usually retains its lease state database. However, it can also lose it because of a catastrophic failure or an intentional removal.

1. When a server with a missing lease database returns with a partner that is in PARTNER-DOWN or COMMUNICATIONS-INTERRUPTED state, the server determines whether the partner ever communicated with it. If not, it assumes that it has lost its database, moves into RECOVER state, and sends an Update Request All message to its partner.

2. The partner responds with binding data about every lease in its database and follows up with an Update Done message.

3. The returning server waits the maximum client lead time (MCLT) period, typically one hour, and moves into RECOVER-DONE state. For details on the MCLT, see the "Maximum Client Lead Time and Lease Period Factor" section.

4. Both servers then move into NORMAL state.

After a lease state database backup restoration

When a returning server has its lease state database restored from backup, and if it reconnects with its partner without additional data, it only requests lease binding data that it has not yet seen. This data may be different from what it expects.

1. In this case, you must configure the returning server with the failover-recover attribute set to the time the backup occurred.

2. The server moves into RECOVER state and requests all its partner's data. The server waits the MCLT period, typically one hour, from when the backup occurred and goes into RECOVER-DONE state. For details on the MCLT, see the "Maximum Client Lead Time and Lease Period Factor" section.

3. Once the server returns to NORMAL state, you must unset its failover-recover attribute, or set it to zero.

nrcmd> dhcp set failover-recover=0 

After the operational server had failover disabled

If the operating server had failover enabled, disabled, and subsequently re-enabled, you must take special care when bringing a newly configured backup server into play. The backup server must have no lease state data and must have the failover-recover attribute set to the current time minus the MCLT interval, typically one hour. For details on the MCLT, see the "Maximum Client Lead Time and Lease Period Factor" section.

1. The backup server then knows to request all the lease state data from the main server. Unlike what is described in the "After the server loses its lease state database" section of this table, the backup server cannot request this data automatically because it has no record of having ever communicated with the main server.

2. After reconnecting, the backup server goes into RECOVER state, requests all the main server's lease data, and goes into RECOVER-DONE state.

3. Both servers go into NORMAL state. At this point, you must unset the backup server's failover-recover attribute, or set it to zero:

nrcmd> dhcp set failover-recover=0 
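
As a quick reference, the state sequences that Table 11-3 describes for a returning server can be summarized in a small lookup. The following Python sketch is only a reading aid based on the table rows. The reintegration_path function and its parameter names are hypothetical, and the sketch leaves out the message exchanges and the MCLT wait.

```python
def reintegration_path(partner_state, went_down_first=True, lost_database=False):
    """Return the states a returning server passes through, per Table 11-3.

    partner_state   -- state of the partner when the server reconnects
    went_down_first -- True if the server went down before the partner
                       entered PARTNER-DOWN state
    lost_database   -- True if the server returns without its lease database
    """
    if lost_database:
        if partner_state not in ("PARTNER-DOWN", "COMMUNICATIONS-INTERRUPTED"):
            raise ValueError(f"no table row for partner state {partner_state!r}")
        # The server requests every lease from its partner before resuming.
        return ["RECOVER", "RECOVER-DONE", "NORMAL"]
    if partner_state == "COMMUNICATIONS-INTERRUPTED":
        # The server joins the partner's state, then moves straight to NORMAL.
        return ["COMMUNICATIONS-INTERRUPTED", "NORMAL"]
    if partner_state == "PARTNER-DOWN":
        if went_down_first:
            return ["RECOVER", "RECOVER-DONE", "NORMAL"]
        # Both servers were operating at once, so conflicts are possible.
        return ["POTENTIAL-CONFLICT", "NORMAL"]
    raise ValueError(f"no table row for partner state {partner_state!r}")


if __name__ == "__main__":
    print(reintegration_path("PARTNER-DOWN", went_down_first=False))
```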


Allocating Addresses Among Servers

To keep failover partners operating despite a network partition (when both servers can communicate with clients, but not with each other), allocate more addresses than for a single server. Configure the main server to allocate a percentage of the currently available addresses in each scope to the backup server. This makes these addresses unavailable to the main server. The backup server uses these addresses when it cannot talk to the main server and cannot tell if it is down.

Using the CLI

You can set the percentage of currently available addresses in each scope using the dhcp set failover-backup-percentage and scope set failover-backup-percentage commands. Note that setting the backup percentage on the server level sets the value for all scopes not set with that attribute. However, if set at the scope level, the backup percentage overrides the one at the server level.

nrcmd> dhcp set failover-backup-percentage=10 
nrcmd> scope scope1 set failover-backup-percentage=10 

There is no single default for all environments, although 10 percent is a reasonable one. The percentage depends on the new client arrival rate and the network operator's reaction time. The backup server needs enough addresses from each scope to satisfy all new client requests arriving during the time it does not know if the main server is down.

Even during PARTNER-DOWN state, the backup server waits for the maximum client lead time (MCLT) and lease time to expire before re-allocating leases. See the "Maximum Client Lead Time and Lease Period Factor" section. When these times expire, the backup server:

Offers leases from its private pool.

Offers leases from the main server's pool.

Offers expired leases to new clients.

During the day, an operator likely responds within two hours to COMMUNICATIONS-INTERRUPTED state to determine if the main server is working. The backup server then needs enough addresses to support a reasonable upper bound on the number of new clients that could arrive during those two hours.

During off-hours, the arrival rate of previously unknown clients is likely to be less. The operator can usually respond within 12 hours to the same situation. The backup server then needs enough addresses to support a reasonable upper bound on the number of clients that could arrive during those 12 hours.

The number of addresses over which the backup server requires sole control is the greater of the two numbers. You would express this number as a percentage of the currently available (unassigned) addresses in each scope.
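
The sizing rule above can be worked through numerically. The following Python sketch takes the greater of the daytime and off-hours needs and expresses it as a percentage of the currently available addresses. The backup_percentage function, its parameter names, and the sample figures are hypothetical, while the two-hour and 12-hour response times are the ones used in the text.

```python
import math


def backup_percentage(available, day_rate, night_rate,
                      day_response_hours=2, night_response_hours=12):
    """Percentage of available addresses the backup server should control.

    available  -- currently available (unassigned) addresses in the scope
    day_rate   -- upper bound on new clients per hour during the day
    night_rate -- upper bound on new clients per hour during off-hours
    """
    if available <= 0:
        raise ValueError("scope has no available addresses")
    day_need = day_rate * day_response_hours
    night_need = night_rate * night_response_hours
    # The backup server needs the greater of the two numbers.
    needed = max(day_need, night_need)
    # Round up to a whole percent, and never ask for more than the scope has.
    return min(100, math.ceil(100 * needed / available))


if __name__ == "__main__":
    # Illustrative figures: 500 free addresses, 10 clients/hour by day,
    # 3 clients/hour during off-hours.
    print(backup_percentage(500, day_rate=10, night_rate=3))
```

In this sample the off-hours need (36 addresses) exceeds the daytime need (20 addresses), so the off-hours figure determines the result of 8 percent.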

If you use client-classes, remember that some clients can only use some sets of scopes and not others. See "Configuring Clients and Client-Classes."


Note During failover, clients can sometimes obtain leases whose expiration times are shorter than the amount configured. This is a normal part of keeping the server partners synchronized. Typically this happens only for the first lease period, or during COMMUNICATIONS-INTERRUPTED state.


Importing Backup Server Leases to the Main Server

One way to bring the backup server's lease information to the main server is to import it. Perform the following steps to import the backup server's lease information.

Using the CLI


Step 1 Stop the backup server.

nrcmd> dhcp stop 

Step 2 Enable import mode on the main server.

nrcmd> dhcp enable import-mode 

Step 3 Reload the main server.

nrcmd> dhcp reload 

Step 4 Import the leases into the main server.

nrcmd> import leases leasefile.txt 

Step 5 Disable import mode on the main server.

nrcmd> dhcp disable import-mode 

Step 6 Reload the main server.

nrcmd> dhcp reload 

Step 7 Start the backup server.

nrcmd> dhcp start 


Exporting Leases

Using the CLI

There are two ways that you can export lease information:

Use the export leases command—Exports lease information about the state of all current and expired leases. This command does not identify the lease information as belonging to the main or backup servers. The -server or -client options determine what time format the output should be.

The -client option writes out the lease time as a string in the month, day, time, year format, such as Apr 15 16:35:48 2002.

The -server option writes out the state of all current and expired leases to the DHCP server's log directory using the output file that you specify. It writes lease times as integers representing the number of seconds since midnight GMT Jan 1, 1970, for example, 903968580. You can also specify the namespace as part of the export.

nrcmd> export leases -server -namespace blue -time-numeric leaseout.txt 

Use the export addresses command to export information about every address configured in every server that is specified in the configuration file, or database (to its ip_address table by default). This includes addresses specified in DHCP scope ranges, namespaces, DNS static addresses, and explicitly reserved addresses both for DNS and DHCP servers. However, addresses in scope ranges that are not in use (allocated or reserved) do not appear in the output. The export addresses command also displays the failover role, if any.

nrcmd> export addresses file=out.txt namespace=blue config=config.txt dhcp-only 
       time-ascii 
nrcmd> export addresses database=mcd username=admin password=changeme 
       table=ip_address time-numeric 
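
The two lease time formats described for export leases can be converted into each other. The following Python sketch illustrates the relationship. It assumes that the string format is expressed in GMT and that days are written without zero padding, neither of which this guide states, and the function names are illustrative only.

```python
import calendar
import time

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def numeric_to_ascii(seconds):
    """Convert seconds since midnight GMT Jan 1, 1970 to 'Mon D HH:MM:SS YYYY'."""
    t = time.gmtime(seconds)
    return "%s %d %02d:%02d:%02d %d" % (
        MONTHS[t.tm_mon - 1], t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec, t.tm_year)


def ascii_to_numeric(text):
    """Convert 'Mon D HH:MM:SS YYYY' (assumed GMT) back to seconds since 1970."""
    month, day, clock, year = text.split()
    hour, minute, second = (int(part) for part in clock.split(":"))
    return calendar.timegm(
        (int(year), MONTHS.index(month) + 1, int(day), hour, minute, second, 0, 0, 0))


if __name__ == "__main__":
    print(numeric_to_ascii(903968580))                # Aug 24 14:23:00 1998
    print(ascii_to_numeric("Apr 15 16:35:48 2002"))
```

The month table is spelled out instead of using locale-dependent formatting so that the result does not change with the system language.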

De-activating Leases in Failover

When DHCP safe failover is in use and you de-activate a lease with either the GUI or the CLI, it is de-activated only in the cluster to which you are connected. If you want to de-activate the lease on both the main and backup servers, you must connect to and de-activate the lease in each server's cluster.

Changing Failover Server Roles


Caution Be careful when you change the role of a failover server. Remember that all address states in a scope are lost from a server if it is ever reloaded without that scope in its configuration.

Making a Nonfailover Server a Failover Main

You can update an existing installation and increase the availability of the DHCP service it offers. You can use this procedure only if the original server never participated in failover.

Using the CLI


Step 1 Install Network Registrar on the original server and ensure that it operates correctly after the installation.

Step 2 Install Network Registrar on the machine that is to be the backup server. Note the machine's DNS name.

Step 3 Enable failover on the original server. Use the DNS name of the recently installed backup server. See the "Configuring the Basic Scenario" section.

nrcmd> dhcp enable failover 
nrcmd> dhcp set failover-backup-server=backupserver.example.com. 

Step 4 Reload the main server. It should go into PARTNER-DOWN state and stay there, because it cannot locate the backup server, which is not yet configured. There should be no change in main server operation at this point.

nrcmd> dhcp reload 

Step 5 Duplicate the main server's configuration on the backup server, including scopes (including secondary), policies, and client-classes. If you use client-classes, make sure the clients are entered into each cluster or that each server can access an LDAP database with the client data.

Step 6 Enable failover on the backup server.

nrcmd> dhcp enable failover 
nrcmd> dhcp set failover-main-server=mainserver.example.com. 

Step 7 Reconfigure all the operational BOOTP relays to forward broadcast DHCP packets to both the main and backup server.

Step 8 Reload the backup server.

nrcmd> dhcp reload 


After you complete these steps:

1. The backup server detects the main server and moves into RECOVER state.

2. The backup server refreshes its stable storage with the main server's lease data and, after the refresh completes, moves into RECOVER-DONE state once it reaches the maximum client lead time (MCLT).

3. The main server moves into NORMAL state.

4. The backup server moves into NORMAL state.

5. The backup server uses a pool request to ask the main server for addresses to allocate if communication is interrupted.

6. After allocating these addresses, the main server sends this data to the backup server.

Replacing a Server Having Defective Storage

If a failover server loses its stable storage (hard disk), you can replace the server and have it recover its state information from its partner.

Using the CLI


Step 1 Determine which server lost its stable storage.

Step 2 Use the dhcp setPartnerDown command to tell the other server that its partner is down. If you do not specify a time, Network Registrar uses the current time.

nrcmd> dhcp setPartnerDown backupserver.example.com. oct 31 13:10 2001 

Step 3 When the server is again operational, re-install Network Registrar.

Step 4 Duplicate the configuration on the server from its partner.

Step 5 Set the failover recovery time to the approximate time when the server failed.

nrcmd> dhcp set failover-recover "FEB 02 13:20 2001" 

Step 6 Reload the replacement server.

nrcmd> dhcp reload 


The following actions then occur:

1. The recovered server moves into RECOVER state.

2. Its partner sends it all its data.

3. The server moves into RECOVER-DONE state when it reaches its maximum client lead time.

4. Its partner moves into NORMAL state.

5. The recovered server moves into NORMAL state. It can request addresses, but can allocate few new ones, because its partner already sent it all its previously allocated addresses.

6. Use the dhcp set failover-recover=0 command on the recovered server, then reload the server.

Removing a Backup Server and Halting Failover Operation

There are times when you might need to remove the backup server and halt all failover operations.

Using the CLI


Step 1 On the backup server, remove all the scopes that were designated as a backup to the main server.

nrcmd> scope scope1 delete 
nrcmd> scope scope2 delete 
...

Step 2 On the main server, remove the failover capability from those scopes that were main for the backup server, or disable failover server-wide if that is how it was configured.

nrcmd> scope scope1 set failover=scope-disabled 
nrcmd> dhcp disable failover 

Step 3 Reload both servers.

nrcmd> dhcp reload 


Adding a Main Server to an Existing Backup Server

You can use an existing backup server for a main server.

Using the CLI


Step 1 Duplicate the main server's scopes, policies, and other configurations on the backup server.

Step 2 Configure the main server to enable failover and point to the backup server.

nrcmd> dhcp enable failover 
nrcmd> dhcp set failover-backup-server=backupserver.example.com. 

Step 3 Configure the backup server to enable failover for the new scopes that point to the new main server.

nrcmd> dhcp enable failover 
nrcmd> dhcp set failover-main-server=mainserver.example.com. 

Step 4 Reload both servers. Network Registrar performs the same steps as those described in the "Making a Nonfailover Server a Failover Main" section.

nrcmd> server dhcp reload 


Handling Special Cases

The following subsections describe handling special failover cases:

Configuring Failover on Multiple Interface Hosts

Maximum Client Lead Time and Lease Period Factor

Changing System Defaults

Supporting BOOTP Clients

Configuring BOOTP Relays

Configuring Failover on Multiple Interface Hosts

If you plan to use failover on a server host with multiple interfaces, you must explicitly configure the local server's name or address. This requires an additional command. For example, if you have a host with two interfaces, serverA and serverB, and you want to make serverA the main failover server, you must define serverA as the failover-main-server before you set the backup server name (an external server, serverC in the example that follows). If you do not do this, failover might not initialize correctly and might try to use the wrong interface.

Using the CLI

nrcmd> dhcp set failover-main-server=serverA failover-backup-server=serverC 


Note With multiple interfaces on one host, you must specify a hostname that points to only one address or A record. You cannot set up your servers for round-robin support.


Maximum Client Lead Time and Lease Period Factor

You can set two properties for failover that control certain adjustments to the lease period, the maximum client lead time (MCLT) and the lease period factor. These adjustments are essential for failover.

MCLT—Controls the maximum allowed time beyond the expiration of a lease offered a client that the partner server knows the expiration to be. The default MCLT is one hour, which is optimized for most configurations. As defined by the failover protocol, the lease period given a client can never be more than the MCLT added to the most recently received potential expiration time from the failover partner, or the current time, whichever is later. That is why you sometimes see the initial lease period as only an hour, or an hour longer than expected for renewals. This hour is the MCLT, a form of lease insurance. The actual lease time is recalculated when the main server comes back.

The MCLT is necessary because of failover's use of lazy updates. Using lazy updates, the server can issue or renew leases to clients before updating its partner, which it can then do in batches of updates. If the server goes down and cannot communicate the lease information to its partner, the partner may try to re-offer the lease to another client based on what it last knew the expiration to be. The MCLT guarantees that there is an added window of opportunity for the client to renew. The way a lease offer and renewal works with the MCLT is as follows:

a. The client sends a DHCPDISCOVER to the server, requesting a desired lease period (say, three days). The server responds with a DHCPOFFER with an initial lease period of only the MCLT (one hour by default). The client then requests the one-hour MCLT lease period and the server acknowledges it.

b. The server sends its partner a bind update containing the lease expiration for the client as the current time plus the MCLT (one hour). The update also includes the potential expiration time as the current time plus the client's desired period plus the MCLT (three days and an hour). The partner acknowledges the potential expiration, thereby guaranteeing the transaction.

c. When the client sends a renewal request halfway through its lease (in one-half hour), the server acknowledges with the client's desired lease period (three days). The server then updates its partner with the lease expiration as the current time plus the desired lease period (three days), and the potential expiration as the current time plus the desired period and another half of this period (3 + 1.5 = 4.5 days). The partner acknowledges this potential expiration of 4.5 days. In this way, the main server tries to have its partner always "lead" the client in its understanding of the client's lease period so that it can always offer it to the client.
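
The offer and renewal sequence in steps a through c can be worked through numerically. The following Python sketch is an illustration only, not the server's implementation. The function grant_lease and all variable names are hypothetical, times are in seconds, and the limit is taken from the rule stated above: the lease can be no longer than the MCLT added to the later of the current time and the potential expiration most recently acknowledged by the partner.

```python
HOUR = 3600
DAY = 24 * HOUR


def grant_lease(now, desired, mclt, acked_potential_expiry=None):
    """Return the lease period (seconds) that the MCLT rule allows right now."""
    # Start from the partner-acknowledged potential expiration or the current
    # time, whichever is later (no acknowledgment yet means the current time).
    base = now if acked_potential_expiry is None else max(now, acked_potential_expiry)
    limit = base + mclt - now
    return min(desired, limit)


mclt = 1 * HOUR       # default MCLT
desired = 3 * DAY     # lease period the client asks for
factor = 1.5          # default lease period factor

# Step a: nothing is acknowledged yet, so the first lease is only the MCLT.
t0 = 0
first_lease = grant_lease(t0, desired, mclt)

# Step b: the bind update carries expiration and potential expiration.
expiry_sent = t0 + first_lease
potential_acked = t0 + desired + mclt   # three days and an hour

# Step c: renewal halfway through the one-hour lease gets the full period.
t1 = t0 + first_lease // 2
renewed_lease = grant_lease(t1, desired, mclt, potential_acked)
new_potential = t1 + int(desired * factor)   # 4.5 days after the renewal

if __name__ == "__main__":
    print(first_lease / HOUR, "hour(s) on the first offer")
    print(renewed_lease / DAY, "day(s) on renewal")
    print((new_potential - t1) / DAY, "day(s) of potential expiration")
```

With the default values, the first lease is one hour, the renewal receives the full three days, and the partner is told a potential expiration 4.5 days after the renewal.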

Lease period factor—Controls how far ahead of the client the partner's idea of the lease expiration can be. It is a multiple of the desired lease period used to update the partner when the main server informs it of a lease renewal. Possible values include:

1.5—The default and optimized factor. It is the lease period itself plus half again the lease, best used if the renewal period is 50% of the lease.

1.0—Same as the lease period the main server gives the client. A server can then never offer a client a lease or renewal of more than the MCLT.

2.0—Twice the lease period the main server gives the client.

The lease period factor interacts with the lease renewal period. If the renewal period is more than 50% of the lease, you must also increase the factor. The calculation is as follows:

1 + renew-percentage = factor

Thus, the usual renewal period of 50% might take the default (1 + 0.5 =) 1.5 lease period factor. A renewal period of 80% would more appropriately take a (1 + 0.8 =) 1.8 lease period factor.
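
The calculation above is small enough to check by hand, but a short Python sketch makes the relationship explicit. The function name is illustrative, and the accepted input range (a renewal fraction greater than 0 and at most 1) is an assumption made for the example.

```python
def lease_period_factor(renew_percentage):
    """Return 1 + renew-percentage, with the percentage given as a fraction."""
    if not 0 < renew_percentage <= 1:
        raise ValueError("renew_percentage must be a fraction such as 0.5 or 0.8")
    return 1 + renew_percentage


if __name__ == "__main__":
    for fraction in (0.5, 0.8):
        # Round only for display; the formula itself is a plain sum.
        print(fraction, "->", round(lease_period_factor(fraction), 2))
```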

You must define the lease period factor for the main DHCP server only. If defined for a partner, the main server ignores it, to enable duplicating the configurations through scripts.

Using the CLI

Generally, if you enabled failover on your DHCP server, you should not change the failover-maximum-client-lead-time or lease-period-factor attributes. If you must change the MCLT or the lease period factor, do the following:


Step 1 Reload the backup server to ensure that all data that the backup server has for the main server is up-to-date. Ideally, you should wait until both partners are stabilized, in NORMAL state, and any updates were exchanged. At least wait until the backup server completes its cache update process, as the log file indicates.

nrcmd> dhcp reload 

Step 2 Change the MCLT or lease period factor on the main server. The backup server ignores any MCLT that you configured on it, because it derives its MCLT value directly from the main. The default MCLT is 60 minutes and the default lease period factor is 1.5.

nrcmd> dhcp set failover-maximum-client-lead-time=4800 lease-period-factor=2.0 

Step 3 Stop the backup server.

nrcmd> dhcp stop 

Step 4 Reload the main server.

nrcmd> dhcp reload 

Step 5 Start the backup server.

nrcmd> dhcp start 


Changing System Defaults

You can change some system defaults, such as the number of leases that the main server should send to the backup server, or the MCLT. See the "Maximum Client Lead Time and Lease Period Factor" section. However, you need to change them on both servers.

Using the CLI

On each server:

Change the poll interval—The interval at which partners contact each other to confirm network connectivity. The default is 10 seconds.

nrcmd> dhcp set failover-poll-interval=14 

Change the poll timeout—Failover partners who cannot communicate for failover-poll-timeout seconds will conclude that they lost network connectivity, and change their operational states appropriately. The default is 60 seconds.

nrcmd> dhcp set failover-poll-timeout=120 
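
The relationship between the poll interval and the poll timeout can be sketched in Python. This is an illustration of the two settings only, not the server's detection logic. The function names are hypothetical, and treating the timeout as reached at exactly the configured number of seconds is an assumption.

```python
def partner_unreachable(seconds_since_last_contact, poll_timeout=60):
    """True once the partners have been out of contact for poll_timeout seconds."""
    return seconds_since_last_contact >= poll_timeout


def intervals_per_timeout(poll_interval=10, poll_timeout=60):
    """Number of whole poll intervals that fit into the timeout."""
    return poll_timeout // poll_interval


if __name__ == "__main__":
    print(intervals_per_timeout())          # default values from this section
    print(intervals_per_timeout(14, 120))   # example values from this section
```

A timeout that spans several poll intervals lets a few unanswered polls pass before a partner concludes that connectivity is lost.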

If you enable failover on a UNIX system, you can set the sms-network-discovery attribute to enable computing the client os-type for leased addresses. This can help if you have a Windows partner server and want to use the updateSms command on it.

Supporting BOOTP Clients

You can configure scopes to support two types of BOOTP clients—static and dynamic.

Static BOOTP

You can support static BOOTP clients using DHCP reservations. When you enable failover, remember to configure both the main and the backup server with identical reservations.

Dynamic BOOTP

You can enable dynamic BOOTP clients using the scope name enable dynamic-bootp command. When using failover, however, there are additional restrictions on address usage in such scopes, because BOOTP clients get permanent addresses and leases that never expire.

Using the CLI

When a server whose scope does not have the dynamic-bootp option enabled goes to PARTNER-DOWN state, it can allocate any available (unassigned) address from that scope, whether or not it was initially available to any partner. However, when the dynamic-bootp option is set, each partner can only allocate its own addresses. Consequently, scopes that enable the dynamic-bootp option require more addresses to support failover. When using dynamic BOOTP:

Segregate dynamic BOOTP clients to a single scope. Disable DHCP clients from using that scope with the scope name disable dhcp command.

Use the dhcp set failover-dynamic-bootp-backup-percentage command to allocate a greater percentage of addresses to the backup server for this scope, as much as 50 percent higher than a regular backup percentage.

Configuring BOOTP Relays

The Network Registrar failover protocol works with BOOTP relay (also called IP helper), a router capability that supports DHCP clients that are not locally connected to a server. For details about configuring routers, see the "Configuring a BOOTP Relay Router" section of Chapter 7.

If you use BOOTP relay, ensure that the implementations point to both the main and backup servers. If they do not and the main fails, clients are not serviced, because the backup cannot see the required packets. If you cannot configure BOOTP relay to forward broadcast packets to two different servers, configure the router to forward the packets to a subnet-local broadcast address for a LAN segment, which could contain both the main and backup servers. Then, ensure that both the main and backup servers are on the same LAN segment.

DHCPLEASEQUERY and Failover

To accommodate DHCPLEASEQUERY messages sent to a DHCP failover backup server when the main server is down, the main server must communicate the relay-agent-info (82) option values to its partner server. To accomplish this, the main server uses DHCP failover update messages.

Troubleshooting Failover

This section describes how to avoid failover configuration mistakes, monitor failover operations, and detect and handle network problems.

Monitoring Failover Operations

You can examine the DHCP server log files on both partner servers to verify your failover configuration.

Using the CLI

You can make a few important log and debug settings to troubleshoot failover. Use the dhcp set log-settings=failover-detail command to increase the number and detail of failover messages logged. To ensure that previous messages do not get overwritten, add the failover-detail attribute to the end of the list. Use the no-failover-conflict attribute to inhibit logging server failover conflicts, or the no-failover-activity attribute to inhibit logging normal server failover activity. Then, reload the server.

nrcmd> dhcp set log-settings=default,incoming-packets,missing-options,failover-detail 
nrcmd> dhcp reload 
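
Because failover-detail should go at the end of the existing list, it can be convenient to build the command string from the current settings. The following Python sketch shows one way to do that. The helper name is illustrative, and the sketch only produces the text of the command; it does not read the settings from the server or run nrcmd.

```python
def with_failover_detail(current_settings):
    """Return the dhcp set log-settings command with failover-detail placed last."""
    # Drop any earlier occurrence so the flag appears exactly once, at the end.
    flags = [flag for flag in current_settings if flag != "failover-detail"]
    flags.append("failover-detail")
    return "dhcp set log-settings=" + ",".join(flags)


if __name__ == "__main__":
    existing = ["default", "incoming-packets", "missing-options"]
    print("nrcmd>", with_failover_detail(existing))
```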

You can also isolate failover misconfigurations more easily if you use the dhcp getRelatedServers command. See the "Monitoring Failover Server Status" section.

Detecting and Handling Network Failures

Table 11-4 describes some symptoms, causes, and solutions for failover problems.

Table 11-4 Detecting and Handling Failures 

| This failover symptom... | Happens because... | And the solution is... |
|---|---|---|
| New clients cannot get addresses | A backup server is in COMMUNICATIONS-INTERRUPTED state with too few addresses | Increase the backup percentage on the main server. |
| Error messages about mismatched scopes | There are mismatched scope configurations between partners | Reconfigure your servers. |
| Log messages about failure to communicate with partner | Server cannot communicate with its partner | Check the status of the server. |
| Main server fails. Some clients cannot renew or rebind leases. The leases expire even when the backup server is up and possibly processing some client requests. | Some BOOTP relay (ip-helper) was not configured to point at both servers; see the "Configuring BOOTP Relays" section | Reconfigure BOOTP relays to point at both main and backup server. Run a fire drill test: take the main server down for a day or so and see if your user community can get and renew leases. |
| SNMP trap: other server not responding | Server cannot communicate with its partner | Check the status of the server. |
| SNMP trap: dhcp failover configuration mismatch | Mismatched scope configurations between partners | Reconfigure your servers. |
| Users complain that they cannot use services or system as expected | Mismatched policies and client-classes between partners | Reconfigure partners to have identical policies; possibly use LDAP for client registration if currently registering clients directly in partners. |


Setting DHCP Request and Response Packet Buffers Correctly

Failover requires that there be excess response buffers for the number of request packet buffers set. The general rule is that if you set 2000 or fewer request packet buffers, the number of response packet buffers should be about 50 percent (1.5 times) higher. If you set more than 2000 request packet buffers, the response packet number should be about 20 percent (1.2 times) higher. For details on these option settings, see the "Defining Advanced Server Parameters" section.
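
The sizing rule above can be written as a small calculation. The following Python sketch is an illustration of that rule of thumb only. The function name is hypothetical, rounding up to a whole buffer is an assumption, and the result is a starting point rather than a required value, because the rule says "about".

```python
def suggested_response_buffers(request_buffers):
    """Return a response buffer count per the 1.5x / 1.2x rule of thumb."""
    if request_buffers <= 0:
        raise ValueError("request_buffers must be positive")
    if request_buffers <= 2000:
        # About 50 percent higher: multiply by 3/2, rounding up.
        return (request_buffers * 3 + 1) // 2
    # About 20 percent higher: multiply by 6/5, rounding up.
    return (request_buffers * 6 + 4) // 5


if __name__ == "__main__":
    for requests in (500, 2000, 4000):
        print(requests, "request buffers ->", suggested_response_buffers(requests))
```

Integer arithmetic is used instead of floating-point multiplication so that values such as 4000 produce an exact result.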