Opened 4 months ago

Closed 7 weeks ago

#5457 closed enhancement (complete)

Ability to put DHCPv4 packet processing on hold ("park")

Reported by: tomek Owned by: marcin
Priority: medium Milestone: Kea1.4
Component: dhcp4 Version: git
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DHCP Feature Depending on Ticket:
Estimated Difficulty: 0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description (last modified by tomek)

Sometimes it would be very useful during the packet processing to tell Kea that a certain packet processing should be put on hold. There are many use cases that could benefit from this, e.g. waiting for external database response, waiting for ping packet to possibly return, waiting for Radius to process the request and send back response etc. The point is that the mechanism should be as generic as possible.

It is recommended that the "park" mechanism be implemented as additional next step value returned by a callout. We already have skip, drop and continue. We should add on-hold, park or similar.

HADesign page discusses the HA application of this feature, but please do keep in mind that this is only one of many possible uses of this.

Its v6 counterpart is #5458.

Subtickets

Change History (22)

comment:1 Changed 4 months ago by tomek

  • Description modified (diff)
  • Summary changed from Ability to park DHCPv4 packet to Ability to put DHCPv4 packet processing on hold ("park")

comment:2 Changed 4 months ago by fdupont

This feature raises the question of compatibility / conflict with the essentially sequential nature of DHCP service.

Some points:

  • conflicts can happen only when assigning resources in the same scope (I took the term scope from Microsoft failover documents: it can be a topology item, i.e. subnet or shared network, or a resource set, i.e. a pool) so stateless services with consistent setups can't conflict, independent subnets / shared networks have independent resource sets. For independent pools it makes more sense for IPv6 addresses where pools are larger than any expected number of clients).
  • conflict can happen only during the resource assignment phase: it is safe (and supported by the DHCP protocol as network delay / reorder) to park a received request before this phase or (even it is less interesting) to park a response after this phase.
  • sequential semantic is not incompatible with parallel processing despite the common idea (I wrote my PhD thesis exactly about this question...).

comment:3 Changed 3 months ago by marcin

Although, I agree that we should make this feature as generic as possible, I suggest that we focus on certain, most obvious use cases first. The most important ones are now:

  1. Sending a lease update to the failover partner before responding to a client.
  2. Ping before offering/acking selected lease.

Out of those two, the most important one seems to be the "lease update". If we're not sure how to avoid issues with potentially offering the same address to two different clients when performing a ping, I'd suggest to split this work into two separate tickets and deal with the first use case first, then add support for pinging in the second ticket.

The way I see addressing the first use case is to allow for parking a packet when pkt4_send returns the PARK status. At this stage, the lease is already allocated/released and the pkt4_send callout can send lease update to the peer using the information from the packet. The lease update is going to be asynchronous, i.e. it will simply schedule asynchronous send and return. When the asynchronous operation completes, the callback will be invoked which will have to trigger a function that pops the corresponding packet from the parking and resume processing. In this particular case, simply send it over the wire. This means that the server should provide the callout with a pointer to the function that pops the parked packet (so as the callback triggered after completion of asynchronous operation can execute it). Alternatively, we could have a common library accessible from the server and the callout. This library could be used to park/unpark packets. The callout could directly unpark the packet and the server would pick it up for further processing.

comment:4 follow-up: Changed 3 months ago by fdupont

Park in pkt4_send fits with the no conflict constraint. IMHO the only technical point to fix is how to identify packets to unpack, for instance adding to packets a counter large enough to never become ambiguous.. Or perhaps the per-request context already does the job? Anyway the lease must be kept before because pkt4_send does not provide it.
BTW I don't believe ping requires a different mechanism: obviously it can't be done before the lease is assigned so pkt4_send again is a reasonable candidate (it has the response to park and the query if the whole processing has to be retried).

comment:5 in reply to: ↑ 4 ; follow-up: Changed 3 months ago by marcin

Replying to fdupont:

Park in pkt4_send fits with the no conflict constraint. IMHO the only technical point to fix is how to identify packets to unpack, for instance adding to packets a counter large enough to never become ambiguous.. Or perhaps the per-request context already does the job? Anyway the lease must be kept before because pkt4_send does not provide it.
BTW I don't believe ping requires a different mechanism: obviously it can't be done before the lease is assigned so pkt4_send again is a reasonable candidate (it has the response to park and the query if the whole processing has to be retried).

Can't you just refer to the packet by its shared pointer ? You have a shared pointer provided to the callout. So something like:

void unpark(const Pkt4Ptr& pkt_to_unpark);

comment:6 in reply to: ↑ 5 ; follow-ups: Changed 3 months ago by fdupont

Replying to marcin:

Can't you just refer to the packet by its shared pointer ? You have a shared pointer provided to the callout. So something like:

void unpark(const Pkt4Ptr& pkt_to_unpark);

=> IMHO it is equivalent to manage the park queue in the hook itself: this is another kind of mechanisms where the hook can divert and insert packets in the processing flow.
The real question I have about this is how the control flow is organized, in particular in presence of I/Os. Note the current code is very flexible (it was modified to accommodate DHCPv4-over-DHCPv6).
I had the same issue with the PCP library and there was no solution which worked in all cases...

comment:7 in reply to: ↑ 6 ; follow-up: Changed 3 months ago by marcin

Replying to fdupont:

Replying to marcin:

Can't you just refer to the packet by its shared pointer ? You have a shared pointer provided to the callout. So something like:

void unpark(const Pkt4Ptr& pkt_to_unpark);

=> IMHO it is equivalent to manage the park queue in the hook itself: this is another kind of mechanisms where the hook can divert and insert packets in the processing flow.
The real question I have about this is how the control flow is organized, in particular in presence of I/Os. Note the current code is very flexible (it was modified to accommodate DHCPv4-over-DHCPv6).
I had the same issue with the PCP library and there was no solution which worked in all cases...

The following is the pseudo code which may deal with the packet parking for the new lease4_update hook point which is somewhere around the pkt4_send hook point:

// SERVER PART

void
Dhcp4Srv::createResponse() {
    selectSubnet();
    acceptPacket();
    assignLease();
    assignOptions();

    callCallouts(Hooks.lease4_update_);
    if (HooksManager::calloutsPresent("lease4_update")) {
        boost::function<void()> next_processing_step = boost::bind(&Dhcpv4Srv::sendResponse, this, dhcp4_exchange);
        callout_handle->setArgument("io_service", io_service);
        callout_handle->setArgument("next_processing_step", next_processing_step);
        HooksManager::callCallouts("lease4_update");
        if (callout_handle->getStatus() != CalloutHandle::NEXT_STEP_PARK) {
            next_processing_step();

        }
    }
}

void
Dhcp4Srv::sendResponse(Dhcp4Exchange& exchange) {
    sendPacket(exchange->getResponse());
}

// HOOK LIBRARY PART STARTS HERE


LeaseCmds lease_cmds;

void local_callback(boost::function<void>() next_processing_step) {
    // do something with client response here

    next_processing_step();
}

int lease4_update(CalloutHandle& handle) {
    boost::function<void()> next_processing_step;
    IOServicePtr io_service;
    handle.getArgument("next_processing_step", next_processing_step);
    handle.getArgument("io_service", io_service);

    HttpClient client(io_service);
    client.asyncSendRequest(..., boost::bind(&local_callback, next_processing_step));

    handle.setStatus(CalloutHandle::NEXT_STEP_PARK);

    return (0);
}

It assumes that the current Dhcpv4Srv::run_one function is split into two functions (can be later split further on). The first function is executed and depending on the next step returned by the callout it can either schedule execution of the next part, i.e. sendResponse, or not. The lease4_update callout is provided with the pointer to the sendResponse function and it is supposed to execute this function when the asynchronous operation completes. Before executing this function, it can first handle the completion of the asynchronous request.

comment:8 in reply to: ↑ 6 Changed 3 months ago by fdupont

Replying to fdupont:

Replying to marcin:

Can't you just refer to the packet by its shared pointer ? You have a shared pointer provided to the callout. So something like:

void unpark(const Pkt4Ptr& pkt_to_unpark);

=> IMHO it is equivalent to manage the park queue in the hook itself: this is another kind of mechanisms where the hook can divert and insert packets in the processing flow.

=> I have an argument against putting the queue in the hook: the queue is lost each time the config is reloaded.

comment:9 in reply to: ↑ 7 ; follow-up: Changed 3 months ago by fdupont

Replying to marcin:
Instead of choosing between Kea processing and the hook about where to implement the control you propose a third option which is Boost Asio... I don't know if Boost Asio provides the whole socket API, for instance sendmsg/recvmsg.
I suggest to do a simple experiment: implement in this way the ping on DHCPOFFERs: we already have the ICMP code in ISC DHCP and it uses RAW sockets, a receive handler and a timeout...

comment:10 in reply to: ↑ 9 Changed 3 months ago by marcin

Replying to fdupont:

Replying to marcin:
Instead of choosing between Kea processing and the hook about where to implement the control you propose a third option which is Boost Asio... I don't know if Boost Asio provides the whole socket API, for instance sendmsg/recvmsg.
I suggest to do a simple experiment: implement in this way the ping on DHCPOFFERs: we already have the ICMP code in ISC DHCP and it uses RAW sockets, a receive handler and a timeout...

Boost does has support for ICMP (http://www.boost.org/doc/libs/1_51_0/doc/html/boost_asio/example/icmp/ping.cpp). The HTTP client which we will be using for HA is implemented based on ASIO.

comment:11 follow-up: Changed 3 months ago by fdupont

I have a proposal: as hook libraries have access to the code export a map of string * list/vector/queue of packets. Each hook will use its queue of parked packets, putting and taking packets from the queue. The string is to have a persistent and easy way to get its queue(s).

About the core processing we need to cut it a bit more so it will be easy to insert a packet before sending for instance. A priori there should be an entry point after each callout point.

For the NEXT_STEP itself I suggest to wait for #5443 completion to see if we need a new one and what will be its behavior.

(PS: I suggest ParkingLots? for the map name)

comment:12 in reply to: ↑ 11 Changed 3 months ago by marcin

Replying to fdupont:

I have a proposal: as hook libraries have access to the code export a map of string * list/vector/queue of packets. Each hook will use its queue of parked packets, putting and taking packets from the queue. The string is to have a persistent and easy way to get its queue(s).

About the core processing we need to cut it a bit more so it will be easy to insert a packet before sending for instance. A priori there should be an entry point after each callout point.

For the NEXT_STEP itself I suggest to wait for #5443 completion to see if we need a new one and what will be its behavior.

(PS: I suggest ParkingLots? for the map name)

This sounds like a good idea. Could you create a simple PoC for this? I'd be happy to review it and see how it fits out current use cases.

comment:13 Changed 3 months ago by marcin

  • Owner set to marcin
  • Status changed from new to accepted

As part of #5459 I created a new hook point, leases4_committed, which will be used to trigger lease updates. As it is appears to be hard to decouple #5459 from this ticket I am willing to do them together.

comment:14 Changed 3 months ago by marcin

  • Owner changed from marcin to UnAssigned
  • Status changed from accepted to reviewing

Sometimes it would be very useful during the packet processing to tell Kea that a certain packet processing should be put on hold. There are many use cases that could benefit from this, e.g. waiting for external database response, waiting for ping packet to possibly return, waiting for Radius to process the request and send back response etc. The point is that the mechanism should be as generic as possible.

It is recommended that the "park" mechanism be implemented as additional next step value returned by a callout. We already have skip, drop and continue. We should add on-hold, park or similar.

HADesign page discusses the HA application of this feature, but please do keep in mind that this is only one of many possible uses of this.

Its v6 counterpart is #5458.

comment:15 Changed 3 months ago by fdupont

  • Owner changed from UnAssigned to fdupont

comment:16 follow-up: Changed 3 months ago by fdupont

  • Owner changed from fdupont to marcin

The Dhcp4Hooks of dhcp4_srv.cc is not in an anonymous namespace. There should be a comment explaining why it is an exception.

I am not convinced the io_service (*) should be given as a callout argument as it is available from the controlled server. In general I am against passing global values via a callout: it is simpler to provide a documented access to them.
(*) please renamed it into io_context because it will be its official C++ name as soon as the standard is adopted and published.

Fixed trivial typos in the .dox file: please pull.

Should be the check for rsp before the lease4 committed callout? IMHO there should be one before and a second after.

Please choose between:
Lease4CollectionPtr new_leases(new Lease4Collection());
and
Lease4CollectionPtr deleted_leases(new Lease4Collection);
(I have no opinion about the best but you can't use both at a few line difference...)
(in fact I have a preference: as the second seems valid it is simpler so a priori better now it is more a question of style...)

The first time I was not happy to see the park feature not allowed for DHCPv4-over-DHCPv6 but I remembered I disabled the delayed ask for ISC DHCP for 4o6 too so I defer (followup ticket?) the question.

Please use an uniform style (I leave the choice to you) for includes in callout_library_X.cc.

Finished src/bin/dhcp4 review.

comment:17 follow-up: Changed 3 months ago by fdupont

The documentation should say somewhere the type parameter T of packing lots must be a shared pointer, not a reference to a shared pointer, so its object is not destroyed too soon. BTW I expect C++ compilers implement closures the standard way so embedded shared pointers count for an access. If not destroying the parking info before calling the callback will give unexpected results.
BTW the HooksManagerTest::Parking can safely use a string because the whole test is in the scope where the string is created.

There are missing doxygen stuff. I am adding them. BTW IMHO the tparam should be first as it is the first parameter.

I noted you always capture this in closures (lambda in C++). I don't know why: in most cases it does not bring something. I propose to remove these "this"s and to verify it still works well.

I fixed what I believe to be a typo in parking_lots_unittest.cc but please check.

To summary: there are a few points I'd like to discuss but IMHO we can merge and postpone the discussion as none of these points are critical. There should be a ChangeLog entry (at least to show process) and I'd like a follow up ticket for DHCPv4-over-DHCPv6 (investigate, explain if it can't be done, implement if it can be done).
I'll reset the #5758 branch.

Doxygen happy, build worked, unit tests passed.

comment:18 in reply to: ↑ 16 ; follow-up: Changed 3 months ago by marcin

Replying to fdupont:

The Dhcp4Hooks of dhcp4_srv.cc is not in an anonymous namespace. There should be a comment explaining why it is an exception.

I put the Dhcp4Hooks into anonymous namespace now.

I am not convinced the io_service (*) should be given as a callout argument as it is available from the controlled server. In general I am against passing global values via a callout: it is simpler to provide a documented access to them.
(*) please renamed it into ii_context because it will be its official C++ name as soon as the standard is adopted and published.

Currently, the IOService is defined in the Dhcpv4Srv class. I prefer that the hook libraries have minimal number of dependencies. The new hook point is going to be useful anyway, as it provides additional parameters beyond io_context. So I decided to leave it as it is.

Fixed trivial typos in the .dox file: please pull.

Thanks.

Should be the check for rsp before the lease4 committed callout? IMHO there should be one before and a second after.

Not really, because there is no response for the DHCPRELEASE case and I you still want to run the callouts.

Please choose between:
Lease4CollectionPtr new_leases(new Lease4Collection());
and
Lease4CollectionPtr deleted_leases(new Lease4Collection);
(I have no opinion about the best but you can't use both at a few line difference...)
(in fact I have a preference: as the second seems valid it is simpler so a priori better now it is more a question of style...)

I now use the first form, as it is what we use everywhere else.

The first time I was not happy to see the park feature not allowed for DHCPv4-over-DHCPv6 but I remembered I disabled the delayed ask for ISC DHCP for 4o6 too so I defer (followup ticket?) the question.

I am aware of that, but it is out of scope for this ticket.

Please use an uniform style (I leave the choice to you) for includes in callout_library_X.cc.

Ok, the style of lib 3 is now used in 1 and 2.

Finished src/bin/dhcp4 review.

comment:19 in reply to: ↑ 17 ; follow-up: Changed 3 months ago by marcin

  • Owner changed from marcin to fdupont

Replying to fdupont:

The documentation should say somewhere the type parameter T of packing lots must be a shared pointer, not a reference to a shared pointer, so its object is not destroyed too soon. BTW I expect C++ compilers implement closures the standard way so embedded shared pointers count for an access. If not destroying the parking info before calling the callback will give unexpected results.

I added a note.

BTW the HooksManagerTest::Parking can safely use a string because the whole test is in the scope where the string is created.

There are missing doxygen stuff. I am adding them. BTW IMHO the tparam should be first as it is the first parameter.

There is no requirement in doxygen for the tparam to be before param. While I understand your point, in most of the places in the code we first put params and then tparam.

I noted you always capture this in closures (lambda in C++). I don't know why: in most cases it does not bring something. I propose to remove these "this"s and to verify it still works well.

Good point. Removed this.

I fixed what I believe to be a typo in parking_lots_unittest.cc but please check.

Thanks.

To summary: there are a few points I'd like to discuss but IMHO we can merge and postpone the discussion as none of these points are critical. There should be a ChangeLog entry (at least to show process) and I'd like a follow up ticket for DHCPv4-over-DHCPv6 (investigate, explain if it can't be done, implement if it can be done).
I'll reset the #5758 branch.

Doxygen happy, build worked, unit tests passed.

Proposed ChangeLog entry:

1349.	[func]		marcin
	Implemented new hook points dhcp4_srv_configured and
	leases4_committed in DHCPv4 server. The latter supports
	new next step status NEXT_STEP_PARK which causes the server
	to "park" the clients' DHCP packet.
	(Trac #5457, git cafe)

comment:20 in reply to: ↑ 18 Changed 3 months ago by fdupont

Replying to marcin:

Currently, the IOService is defined in the Dhcpv4Srv class.

=> and its derived class ControlledDhcpv4Srv() has a getInstance() class method to provide access. (I checked when I worked on 5758). Now it is a matter of taste so we can discuss it later.

The first time I was not happy to see the park feature not allowed for DHCPv4-over-DHCPv6 but I remembered I disabled the delayed ask for ISC DHCP for 4o6 too so I defer (followup ticket?) the question.

I am aware of that, but it is out of scope for this ticket.

=> I am creating a ticket to look at it.

comment:21 in reply to: ↑ 19 Changed 3 months ago by fdupont

  • Owner changed from fdupont to marcin

Replying to marcin:

There is no requirement in doxygen for the tparam to be before param. While I understand your point, in most of the places in the code we first put params and then tparam.

=> there is a convention to put param entries in the same order than parameters and tparam is too close to param to not follow the same rule. So I propose tparam, param, throw/return order for new code.

Proposed ChangeLog entry:

1349.	[func]		marcin
	Implemented new hook points dhcp4_srv_configured and
	leases4_committed in DHCPv4 server. The latter supports
	new next step status NEXT_STEP_PARK which causes the server
	to "park" the clients' DHCP packet.
	(Trac #5457, git cafe)

=> seems fine. I launched a check using pulled branch but IMHO you can merge now (if something goes bad I'll jabber).

comment:22 Changed 7 weeks ago by marcin

  • Resolution set to complete
  • Status changed from reviewing to closed

Finally merged with commit af43f07b0e227ccabcdf07a046a64cebb11bdccf.

Note: See TracTickets for help on using tickets.