Decline support design

This proposed design seeks to fulfill the requirements specified on the DeclineRequirements page.

Overall design

We need a way to store information about addresses that are declined. After a certain probation period, they should be recovered. That time should keep ticking even when the server happens to be down. We need a mechanism that:

  • keeps information about an address being unavailable to any client for a specified amount of time
  • makes the address available again after that time
  • keeps that information persistent, i.e. does not lose it after server shutdown/crash/restart

Fortunately, we already have such a mechanism implemented. It's the normal lease database and the planned lease expiration mechanism (see LeaseExpirationDesign). The idea is to mark a lease as declined upon Decline message reception, set its lifetime to probation time and keep it in the normal lease database. This will meet all the requirements: data will be persistent, the address will not be available to anyone during the probation period and it will be recycled automatically after probation period elapses.
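
The key property is that the probation deadline is stored as an absolute time in the persistent lease record, so it keeps counting while the server is down. A minimal Python sketch of that idea follows. The record layout and the function names (decline, is_available, restart) are assumptions made for illustration only, not Kea code:

```python
import json


def decline(lease_db, address, now, probation_period):
    """Record the address as declined until an absolute deadline."""
    lease_db[address] = {"state": "declined", "expire": now + probation_period}


def is_available(lease_db, address, now):
    """An address is free if it has no lease or its lease has expired."""
    lease = lease_db.get(address)
    return lease is None or now >= lease["expire"]


def restart(lease_db):
    """Simulate a restart: only what was serialized to storage survives."""
    return json.loads(json.dumps(lease_db))
```

Because only the absolute expire value is stored, a restart in the middle of the probation period does not extend or shorten it.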

There are multiple ways to mark a lease as declined. With the concept of the lease state (see discussion: https://lists.isc.org/pipermail/kea-dev/2015-August/000420.html and related discussion in the lease expiration design) there are at least two ways of marking a lease as declined:

  1. set its state to DECLINED
  2. reassign it to a special, well-known "client" that represents the declined state. This can be implemented as a special hardware address (v4) or DUID (v6).

Option 1 is useful for any new code to easily check whether a lease is declined or not. Option 2 protects against existing code that might try to reuse or reassign the lease while it is still declined. Since the extra 'state' column will also be useful for lease expiration, it was decided to go along that route. It would be possible to do both 1. and 2., but objections were raised about redundant information. Hence the design calls for setting the state to DECLINED and setting hwaddr/DUID to NULL. Technically speaking, the v6 lease is always expected to have its duid_ field set to a non-NULL value. Therefore a special representation was implemented for no-duid. See the DUID::generateEmpty() method.

There was a proposal to keep client info in the declined lease. That would be wrong for two reasons. First, once the lease is declined, it is no longer associated with the client who declined it. This would essentially be keeping historical information on purpose. The second reason is more practical. Had we kept the client info, there would be a risk for the lease to be used while still being in the declined state. There are many places in the existing code that check whether there's a lease for a given client. All of those checks would have to be augmented with extra conditions (something like: && lease->state != DECLINED). Some of those checks are done in quite complicated scenarios (like recovering from a conflict where client has a reservation, but is currently using a different address). It is much simpler and more technically correct to unassociate the lease from the client. If there's an operational need to determine who declined a lease, that information is available through the logging system.
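
The practical argument can be illustrated with a short Python sketch. The lookup function below stands in for the many existing checks that match a lease on the client only. Its name, the dictionary layout and the sample data are hypothetical:

```python
def find_lease_for_client(leases, hwaddr):
    """Lookup in the style of existing code: it matches on the client only
    and knows nothing about the lease state."""
    for lease in leases:
        if lease["hwaddr"] is not None and lease["hwaddr"] == hwaddr:
            return lease
    return None


CLIENT = "00:11:22:33:44:55"

# Rejected proposal: the declined lease still carries the client's identity.
kept = [{"address": "192.0.2.1", "hwaddr": CLIENT, "state": "declined"}]

# Chosen design: the declined lease is unassociated from the client.
cleared = [{"address": "192.0.2.1", "hwaddr": None, "state": "declined"}]
```

With the kept list the unmodified lookup returns the declined lease, so every caller would need an extra state check. With the cleared list it returns nothing and no caller has to change.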

V4 and V6 Design

We need a way to designate a lease as declined. As discussed in the previous section, a new column called state will be added. As this column will be used for various purposes, it will be represented as a 32-bit wide bitfield. One of the bits will indicate a lease that is declined. A lease that is declined must have all client-identifying information (hwaddr, client-id, duid, hostname and possibly others) removed. The reason for doing so is that once a lease is declined, it is no longer associated with the client who declined it.
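
As a rough illustration of the bitfield, the helpers below set, clear and test a single declined bit inside a 32-bit value. The bit position and the helper names are assumptions; the design does not fix them here:

```python
STATE_DECLINED_BIT = 0x1   # hypothetical bit position
STATE_MASK = 0xFFFFFFFF    # the state field is 32 bits wide


def set_declined(state):
    """Return the state with the declined bit set, other bits untouched."""
    return (state | STATE_DECLINED_BIT) & STATE_MASK


def clear_declined(state):
    """Return the state with the declined bit cleared, other bits untouched."""
    return state & ~STATE_DECLINED_BIT & STATE_MASK


def is_declined(state):
    """True if the declined bit is set."""
    return (state & STATE_DECLINED_BIT) == STATE_DECLINED_BIT
```

Other bits of the field stay free for the additional purposes mentioned above.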

The following steps are taken when a v4 lease is marked as declined:

  • client-id is set to null;
  • hwaddr is set to null;
  • if there was a DNS update conducted, a name change request for removing entries is started and all DNS related fields (hostname_, fqdn_fwd_, fqdn_rev_) are cleared;
  • lease lifetime is set to probation period;
  • suitable information (including client details and the address) is logged;
  • lease4_decline hook is triggered;
  • appropriate statistics (subnet-specific declined addresses counter and global total declined addresses) are updated;
  • lease information is updated in the lease database.

The following steps are taken when a v6 lease is marked as declined:

  • DUID is set to a special DUID object that represents null (see DUID::generateEmpty());
  • hwaddr is set to null (if present);
  • if there was a DNS update conducted, a name change request for removing entries is started and all DNS related fields (hostname_, fqdn_fwd_, fqdn_rev_) are cleared;
  • lease lifetime is set to probation period;
  • suitable information (including client details and the address) is logged;
  • lease6_decline hook is triggered;
  • appropriate statistics (subnet-specific declined addresses counter and global total declined addresses) are updated;
  • lease information is updated in the lease database.
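
The two step lists above can be sketched as one Python function. Everything here is illustrative: the lease is a plain dictionary, EMPTY_DUID is only a placeholder for whatever DUID::generateEmpty() returns, the bit value is hypothetical, and logging, hooks and name change requests are modelled as tuples appended to an events list:

```python
EMPTY_DUID = b"\x00"   # placeholder for the result of DUID::generateEmpty()
STATE_DECLINED = 0x1   # hypothetical bit value


def decline_lease(lease, family, now, probation_period, stats, events):
    """Apply the decline steps to one lease (a dict) for family 4 or 6."""
    # Remember who declined before the identity is wiped, for the log entry.
    who = lease.get("hwaddr") if family == 4 else lease.get("duid")

    # Remove client-identifying information.
    if family == 4:
        lease["client_id"] = None
    else:
        lease["duid"] = EMPTY_DUID
    lease["hwaddr"] = None

    # If a DNS update was done, request removal, then clear the DNS fields.
    if lease["fqdn_fwd"] or lease["fqdn_rev"]:
        events.append(("ncr-remove", lease["hostname"]))
    lease["hostname"] = ""
    lease["fqdn_fwd"] = False
    lease["fqdn_rev"] = False

    # The lease now lives for the probation period only.
    lease["state"] |= STATE_DECLINED
    lease["expire"] = now + probation_period

    events.append(("log", lease["address"], who))
    events.append(("hook", "lease%d_decline" % family))

    # Global and subnet-specific counters.
    stats["declined-addresses"] = stats.get("declined-addresses", 0) + 1
    key = "subnet[%d].declined-addresses" % lease["subnet_id"]
    stats[key] = stats.get(key, 0) + 1
    return lease  # the caller writes the updated lease back to the database
```

The sketch captures the client details before clearing them, since the log entry is expected to include them.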

After the lease probation time elapses, the lease will be picked up by the lease expiration routine for processing. See LeaseExpirationDesign. Since lease expiration has been extended with preparatory steps for address affinity, the processing of declined leases being recovered is slightly different from regular lease reclamation. The following steps are required after a declined lease passes its lifetime:

  • appropriate details are logged;
  • lease{4,6}_decline_recover hook is triggered;
  • lease is removed from the database (as opposed to being marked as processed when address affinity is enabled).
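
The difference between the two paths can be sketched as follows. This is only an illustration under several assumptions: both bit values are hypothetical, the "processed" marker is modelled as a second state bit, and regular leases are assumed to be deleted when address affinity is disabled:

```python
STATE_DECLINED = 0x1    # hypothetical bit values
STATE_RECLAIMED = 0x2   # stands in for "marked as processed"


def process_expired(lease_db, now, affinity_enabled, events):
    """Walk a dict of address -> lease and handle the expired ones."""
    for address in list(lease_db):
        lease = lease_db[address]
        if lease["expire"] > now or lease["state"] & STATE_RECLAIMED:
            continue  # still valid, or already processed earlier
        if lease["state"] & STATE_DECLINED:
            events.append(("log", "recovered", address))
            events.append(("hook", "lease%d_decline_recover" % lease["family"]))
            del lease_db[address]              # declined leases are always removed
        elif affinity_enabled:
            lease["state"] |= STATE_RECLAIMED  # kept, but marked as processed
        else:
            del lease_db[address]              # assumption: no affinity, delete
```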

Marcin and Tomek discussed this mechanism and came to a rough consensus that the lease expiration code will conduct this "special" processing. This is the first example of special processing that depends on the lease state. Future cases requiring special processing are envisaged (e.g. a failover trigger after lease expiration).

Implementation details (server and allocation engine)

  1. Upon DHCPDECLINE (v4)/Decline (v6) reception, the following steps will be taken:
    • check that the declining client really leased the address it attempts to decline
    • perform DNS removal, if applicable
    • trigger lease4_decline (or lease6_decline) hook
    • log the fact of the address being declined, with all necessary details (the address being declined, client's hw addr, probation period, etc.)
    • update the lease: remove hostname, clear DNS flags, remove client-id, clear the HW address (v4) or set the DUID to the empty DUID (v6), set the state to declined, set lease lifetime to the probation period.
    • increase declined-addresses and subnet[id].declined-addresses statistics
  2. Extend reuseExpiredLease4 and reuseExpiredLease (the v6 variant) to check whether the lease being reused is marked as declined. If it is, conduct the following extra steps:
    • log the fact of recovering an address from the declined state
    • trigger lease4_decline_recycle (or lease6_decline_recycle) hook
    • decrease declined-addresses and subnet[id].declined-addresses statistics
  3. Extend lease expiration to cover declined leases. If the lease being recycled is for a declined address, the following extra steps are necessary:
    • log the fact of recovering an address from the declined state
    • trigger lease4_decline_recycle (or lease6_decline_recycle) hook
    • decrease declined-addresses and subnet[id].declined-addresses statistics
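
A minimal sketch of step 2 for v4 is shown below. It is not the actual reuseExpiredLease4 code: the function signature, the dictionary-based lease and the bit value are assumptions, and the log and hook calls are modelled as tuples appended to a list:

```python
STATE_DECLINED = 0x1   # hypothetical bit value


def reuse_expired_lease(lease, new_hwaddr, now, valid_lifetime, stats, events):
    """Hand an expired lease to a new client, recovering it first if declined."""
    if lease["expire"] > now:
        raise ValueError("lease has not expired yet")

    if lease["state"] & STATE_DECLINED:
        # Extra steps that apply to declined leases only.
        events.append(("log", "recovered", lease["address"]))
        events.append(("hook", "lease4_decline_recycle"))
        stats["declined-addresses"] -= 1
        stats["subnet[%d].declined-addresses" % lease["subnet_id"]] -= 1
        lease["state"] &= ~STATE_DECLINED

    # Regular reuse: assign the lease to the new client.
    lease["hwaddr"] = new_hwaddr
    lease["expire"] = now + valid_lifetime
    return lease
```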

Step 1 can be implemented immediately, while steps 2-3 depend on the lease expiration implementation. They can be implemented right now, but they will only be triggered later, once lease expiration is implemented.

Configuration

A new parameter 'decline-probation-period' will be added to the Dhcp4 and Dhcp6 configurations. It will designate how long an address is kept in the declined state. It will be expressed in seconds and will have a somewhat large default value (1 day seems to be reasonable). A per-subnet parameter is currently not planned, but may be added at some later date if per-subnet tuning is requested.
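
The following sketch shows how such a parameter could be read with a one-day default. The function name and the validation rules are assumptions; only the parameter name, its unit and the suggested default come from the text above:

```python
import json

DEFAULT_PROBATION_PERIOD = 86400  # one day, in seconds


def decline_probation_period(config_text, server="Dhcp4"):
    """Return the configured probation period in seconds, or the default."""
    config = json.loads(config_text)
    value = config.get(server, {}).get("decline-probation-period",
                                       DEFAULT_PROBATION_PERIOD)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise ValueError("decline-probation-period must be a non-negative integer")
    return value
```

For example, a configuration of {"Dhcp4": {"decline-probation-period": 3600}} would keep declined addresses out of use for one hour.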

Manual recovery

Once an address is declined, log entries, hooks or statistics monitoring may prompt the sysadmin to investigate the issue. They may find the device that illegally used the address and solve the issue. They may then want to skip the probation period and recover the address immediately. For that purpose, a new command will be implemented: "declined-address-recover". It will take one parameter, "address", that specifies the address being recovered. An example command looks as follows:

{
    "command": "declined-address-recover",
    "arguments": {
        "address": "192.0.2.1"
    }
}

This command will conduct all steps mentioned for the reuseExpiredLease context in the "Implementation details" section (log, trigger hooks, decrease statistics). The command will succeed if the address was indeed declined and was recovered. It will fail if no such address was declined, if its recovery failed (e.g. because it was recycled already), or if a hook callout set flags to prevent the recycle.
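
A handler for this command might look roughly like the sketch below. The failure code 1, the "text" field, the callout convention (a callable returning "skip") and the choice to delete the recovered lease are assumptions for illustration, not part of the design:

```python
STATE_DECLINED = 0x1   # hypothetical bit value
RESULT_SUCCESS = 0
RESULT_ERROR = 1       # assumption: any non-zero result means failure


def declined_address_recover(command, lease_db, stats, events, callout=None):
    """Handle a parsed "declined-address-recover" command (a dict)."""
    address = command.get("arguments", {}).get("address")
    lease = lease_db.get(address)
    if lease is None or not lease["state"] & STATE_DECLINED:
        return {"result": RESULT_ERROR,
                "text": "address %s is not declined" % address}
    if callout is not None and callout(lease) == "skip":
        return {"result": RESULT_ERROR,
                "text": "recovery of %s prevented by a hook callout" % address}

    # Same extra steps as in the reuseExpiredLease context.
    events.append(("log", "recovered", address))
    stats["declined-addresses"] -= 1
    stats["subnet[%d].declined-addresses" % lease["subnet_id"]] -= 1
    del lease_db[address]
    return {"result": RESULT_SUCCESS, "text": "address %s recovered" % address}
```

Calling the command a second time for the same address fails, which matches the "recycled already" case above.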

Information about declined addresses

Information about declined addresses is kept in the database, so it can be extracted in the "normal" way, which is specific to each database backend. For example, for the MySQL backend, a query similar to the following could be used:

> SELECT address, expire FROM lease4 WHERE (state & DECLINED_BIT) = DECLINED_BIT;
> SELECT address, expire FROM lease6 WHERE (state & DECLINED_BIT) = DECLINED_BIT;
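
The same bitmask test can be tried out from Python. The sketch below uses an in-memory SQLite database as a stand-in for the real backend, with a deliberately simplified table layout; the value of DECLINED_BIT and the sample rows are made up for illustration:

```python
import sqlite3

DECLINED_BIT = 0x1   # hypothetical value of the declined bit


def declined_leases(conn, table):
    """Return (address, expire) rows for leases with the declined bit set."""
    if table not in ("lease4", "lease6"):
        raise ValueError("unexpected table name")
    query = "SELECT address, expire FROM %s WHERE (state & ?) = ?" % table
    return conn.execute(query, (DECLINED_BIT, DECLINED_BIT)).fetchall()


# Simplified stand-in schema with two sample leases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lease4 (address TEXT, expire TEXT, state INTEGER)")
conn.executemany("INSERT INTO lease4 VALUES (?, ?, ?)", [
    ("192.0.2.1", "2015-12-24 12:34:45", DECLINED_BIT),
    ("192.0.2.2", "2015-12-24 13:00:00", 0),
])
```

Testing with a mask rather than with equality keeps the query correct once other bits of the state field come into use.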

Command for getting declined addresses (optional)

If time permits, we may implement an additional command that would list all currently declined addresses. The benefit of having such a command is that it would provide a unified interface, regardless of the database backend used. Such a command could be called "declined-address-list" and would take no parameters:

{
    "command": "declined-address-list"
}

The response would contain a list of declined addresses, along with their recovery times:

{
    "result": 0,
    "arguments": {
        "declined-addresses": [ [ "192.0.2.1", "2015-12-24 12:34:45.123" ] ]
    }
}
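
Building such a response could be sketched as below. The lease representation (dictionaries with a datetime in "expire") and the bit value are assumptions; the response layout and the timestamp format follow the example above:

```python
from datetime import datetime

STATE_DECLINED = 0x1   # hypothetical bit value


def declined_address_list(leases):
    """Build the response listing declined addresses and recovery times."""
    rows = []
    for lease in leases:
        if lease["state"] & STATE_DECLINED:
            expire = lease["expire"]
            # Millisecond precision, as in the example response.
            stamp = (expire.strftime("%Y-%m-%d %H:%M:%S")
                     + ".%03d" % (expire.microsecond // 1000))
            rows.append([lease["address"], stamp])
    return {"result": 0, "arguments": {"declined-addresses": rows}}


sample_leases = [
    {"address": "192.0.2.1", "state": STATE_DECLINED,
     "expire": datetime(2015, 12, 24, 12, 34, 45, 123000)},
    {"address": "192.0.2.2", "state": 0,
     "expire": datetime(2015, 12, 24, 13, 0, 0)},
]
```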

Note: this command is considered a stretch goal for 1.0. It may or may not make it. If deferred, its actual implementation time will be determined later.

Declined addresses and host reservation

It is possible that a reserved address will be assigned to a host and later reported as a duplicate. The design described above will cause the address to become unavailable. For the probation period, the client will be handed a different address from a dynamic pool. After the declined address is recovered, the situation should revert to the intended state: the client's lease for the dynamic address will be revoked, the client will go back to Discover/Solicit and will eventually get the reserved address. In my opinion, this behavior will work out of the box. One possible tweak could be to alter the log message that says the reserved address is being used by someone else to say that the address is declined, if that is the case.
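
The expected behavior can be illustrated with a deliberately simplified selection function. It is not the allocation engine's logic; the function name and the set-based bookkeeping are assumptions:

```python
def pick_address(reserved, declined, pool, in_use):
    """Prefer the reserved address; fall back to the dynamic pool otherwise."""
    if reserved not in declined and reserved not in in_use:
        return reserved
    for candidate in pool:
        if candidate not in declined and candidate not in in_use:
            return candidate
    return None  # nothing available
```

While the reserved address is declined the client receives a pool address. Once the declined lease is recovered, the next selection returns the reserved address again.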

Implementation tasks

The following is the list of tasks required to implement Decline support in Kea 1.0:

  • #3965 - Extend Memfile Backend to support queries for expired leases (shared with lease expiration)
  • #3966 - Extend MySQL backend to support queries for expired leases (shared with lease expiration)
  • #3967 - Provide upgrade scripts to the 4.0 MySQL schema version (shared with lease expiration)
  • #3968 - Extend PostgreSQL backend to support queries for expired leases (shared with lease expiration)
  • #3969 - Provide upgrade scripts to the 4.0 PostgreSQL schema version (shared with lease expiration)
  • #3981 - Move v4 lease to declined state (DHCPDECLINE support)
  • #3982 - Move v6 lease to declined state (DECLINE support)
  • #3983 - Decline parameters in configuration (v4 and v6)
  • #3984 - v4 declined lease recovery (required for #3976 - Extend lease reclamation routine to act upon declined leases)
  • #3985 - v6 declined lease recovery (required for #3976 - Extend lease reclamation routine to act upon declined leases)
  • #3986 - Implement lease4_decline hook
  • #3987 - Implement lease6_decline hook
  • #3988 - Implement lease4_decline_recover hook
  • #3989 - Implement lease6_decline_recover hook
  • #3990 - Decline support in User's Guide
  • #3991 - Implement declined-address-recover command for v4
  • #3992 - Implement declined-address-recover command for v6
  • #3993 - Implement declined-address-list command for v4
  • #3994 - Implement declined-address-list command for v6
  • #3995 - Update Design and Requirements with modifications introduced during the implementation and "lessons learned".

Last modified on Nov 18, 2015, 3:56:54 PM