Opened 3 years ago

Closed 3 years ago

#3692 closed defect (fixed)

Kea4 hangs during processing DHCPDISCOVER in host reservation process

Reported by: wlodekwencel
Owned by: marcin
Priority: high
Milestone: Kea0.9.1beta
Component: dhcp4
Version: git
Keywords:
Cc:
CVSS Scoring:
Parent Tickets:
Sensitive: no
Defect Severity: N/A
Sub-Project: DHCP
Feature Depending on Ticket:
Estimated Difficulty: 0
Add Hours to Ticket: 0
Total Hours: 0
Internal?: no

Description

Server is configured with:
"pool":"192.168.50.10-192.168.50.10"

and that one address is reserved:
"hw-address":"ff:01:02:03:ff:04",
"ip-address":"192.168.50.10"

Kea works properly as long as it only receives messages from ff:01:02:03:ff:04.
Sending a message with a different hardware address brings the server to a point of no return.
Kea hangs (it does not even create a log file!) with this status:

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 3118 root      20   0   68660   6108   5196 R 98.7  1.2   7:58.51 kea-dhcp4

when I try to turn off Kea with:

/usr/local/sbin/keactrl stop

I get:

ERROR/keactrl: Failed to send signal 15 to process kea-dhcp4.

INFO/keactrl: Skip sending signal 15 to process kea-dhcp6: process is not running

INFO/keactrl: Skip sending signal 15 to process kea-dhcp-ddns: process is not running

or reload:

/usr/local/sbin/keactrl reload
ERROR/keactrl: Failed to send signal 1 to process kea-dhcp4.

INFO/keactrl: Skip sending signal 1 to process kea-dhcp6: process is not running

INFO/keactrl: Skip sending signal 1 to process kea-dhcp-ddns: process is not running

and the only way to restart Kea is to reboot the VM.

Every time I reproduce it, gdb shows the same backtrace:

(gdb) bt
#0  0x00007fb2395681d7 in vfprintf () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fb23958db29 in vsprintf () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fb2395715a8 in sprintf () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fb23960acac in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007fb23960ad49 in inet_ntop () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007fb23ad65813 in isc::asiolink::IOAddress::fromBytes (family=<optimized out>, data=<optimized out>) at io_address.cc:75
#6  0x00007fb23b1e1d58 in isc::dhcp::AllocEngine::IterativeAllocator::increaseAddress (addr=...) at alloc_engine.cc:86
#7  0x00007fb23b1e236f in isc::dhcp::AllocEngine::IterativeAllocator::pickAddress (this=0xcdfa80, subnet=...) at alloc_engine.cc:188
#8  0x00007fb23b1e7804 in isc::dhcp::AllocEngine::allocateLease4 (this=0xcdfaf0, subnet=..., clientid=..., hwaddr=..., hint=..., 
    fwd_dns_update=false, rev_dns_update=false, hostname=..., fake_allocation=true, callout_handle=..., old_lease=...)
    at alloc_engine.cc:678
#9  0x000000000042fd44 in isc::dhcp::Dhcpv4Srv::assignLease (this=this@entry=0x7fff08a6f230, question=..., answer=...)
    at dhcp4_srv.cc:1086
#10 0x00000000004328b3 in isc::dhcp::Dhcpv4Srv::processDiscover (this=this@entry=0x7fff08a6f230, discover=...) at dhcp4_srv.cc:1351
#11 0x00000000004344b3 in isc::dhcp::Dhcpv4Srv::run (this=0x7fff08a6f230) at dhcp4_srv.cc:309
#12 0x000000000041e972 in main (argc=<optimized out>, argv=<optimized out>) at main.cc:165

Issue reproduced on:
Linux ubuntu14.04 3.13.0-24-generic x86_64
Linux debian7 3.2.0-4-amd64 x86_64

Small changes to the test scenario with which this issue IS still reproduced:

  • adding more host reservation entries without changing the address pool
  • adding more host reservation entries and changing the address pool so that all addresses in the pool are reserved.

Small changes to the test scenario with which this issue IS NOT reproduced:

  • a bigger address pool that is not fully reserved (even when it has only one extra address)

My conclusion:
We hit this issue every time all available addresses in the configured pool are also reserved.
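
The conclusion above can be illustrated with a small model. This is only a sketch under stated assumptions, not Kea's actual code: `Pool`, `nextAddress` and `pickUnreserved` are hypothetical names, addresses are plain integers, and the reservation check is reduced to a set lookup. It shows that an allocator which iterates through a pool and wraps around has nothing to return when every address is reserved for another client, so it can only terminate if the number of attempts is bounded.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical, simplified model of an iterative allocator (not Kea's code).
struct Pool {
    std::uint32_t first;  // first address in the pool
    std::uint32_t last;   // last address in the pool (inclusive)
};

// Returns the address after 'current', wrapping back to the start of the pool.
std::uint32_t nextAddress(const Pool& pool, std::uint32_t current) {
    return (current >= pool.last) ? pool.first : current + 1;
}

// Tries at most 'max_attempts' candidates. Returns true and stores the
// address in 'out' if one is not reserved for another client. Returns
// false when the attempts run out, which is the case that previously
// had no exit: every candidate is reserved, so no iteration can succeed.
bool pickUnreserved(const Pool& pool,
                    const std::set<std::uint32_t>& reserved_for_others,
                    std::uint64_t max_attempts,
                    std::uint32_t& out) {
    std::uint32_t candidate = pool.first;
    for (std::uint64_t i = 0; i < max_attempts; ++i) {
        if (reserved_for_others.count(candidate) == 0) {
            out = candidate;
            return true;
        }
        candidate = nextAddress(pool, candidate);
    }
    return false;
}
```

In this model, a one-address pool whose only address is reserved makes `pickUnreserved` return false after a single attempt. Without the `max_attempts` bound the loop would keep cycling over the same address.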

I can provide a remotely accessible environment to reproduce and debug the issue.

Attachments (4)

capture.pcap (4.1 KB) - added by wlodekwencel 3 years ago.
configuration_file (1.3 KB) - added by wlodekwencel 3 years ago.
kea_leases.csv (252 bytes) - added by wlodekwencel 3 years ago.
log_file (16.8 KB) - added by wlodekwencel 3 years ago.

Change History (12)

comment:1 Changed 3 years ago by wlodekwencel

Update:
With a pool of 192.168.50.10-192.168.50.12 and reservations for all 3 addresses, I can get leases for all of them using the proper MAC addresses. Sending a message with a different MAC address then also causes Kea to hang, but in this case I can provide a log file.

Attached below:

  • configuration
  • message exchange capture
  • leases file
  • log file

Changed 3 years ago by wlodekwencel

comment:2 Changed 3 years ago by tomek

  • Milestone changed from Kea-proposed to Kea0.9.1
  • Priority changed from very high to high

comment:3 Changed 3 years ago by marcin

  • Owner set to UnAssigned
  • Status changed from new to reviewing

The bug is fixed now. I checked with Wlodek that his tests are now passing. I also added a unit test to cover this case.

Proposed ChangeLog entry:

XXX.	[bug]		marcin
	libdhcpsrv: Prevent infinite loops in the allocation engine,
	when address pool becomes exhausted.
	(Trac #3692, git abcd)

comment:4 Changed 3 years ago by stephen

  • Owner changed from UnAssigned to stephen

comment:5 Changed 3 years ago by stephen

  • Owner changed from stephen to marcin

Reviewed commit b62a71dba7c8ef053fbd148be5234258facf681f

src/lib/dhcpsrv/alloc_engine.cc
Your fix is OK. However, in my review of #3563, I suggested:

allocateUnreservedLeases6: The loop is a "do while". If "attempts_ == 0" means "UINT_MAX" then:

unsigned int max_attempts = (attempts_ == 0) ? UINT_MAX : attempts_;
for (unsigned int i = 0; i < max_attempts; ++i) {

... is clearer, both in terms of indicating what the upper attempt limit is and in loop logic.

I think the same applies here.
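
For reference, the suggested loop shape can be written out as a complete function. This is a minimal sketch, assuming that an attempt count of 0 means "no explicit limit" as the comment above supposes. `tryWithLimit` and `try_once` are hypothetical names introduced for illustration and do not come from alloc_engine.cc.

```cpp
#include <cassert>
#include <climits>
#include <functional>

// Hypothetical helper showing the suggested "for" loop with an explicit
// upper bound. 'attempts' plays the role of attempts_ in the review
// comment: 0 is treated as UINT_MAX. 'try_once' stands in for a single
// allocation attempt and returns true on success.
bool tryWithLimit(unsigned int attempts,
                  const std::function<bool(unsigned int)>& try_once) {
    const unsigned int max_attempts = (attempts == 0) ? UINT_MAX : attempts;
    for (unsigned int i = 0; i < max_attempts; ++i) {
        if (try_once(i)) {
            return true;  // an attempt succeeded, stop iterating
        }
    }
    return false;  // the limit was reached without success
}
```

Compared with a "do while" loop, the bound is visible in the loop header, and the exhausted case falls out naturally as the code after the loop.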

src/lib/dhcpsrv/tests/alloc_engine_unittest.cc
Copyright needs updating to 2015.

ChangeLog
s/, when address/when the address/

comment:6 Changed 3 years ago by marcin

  • Owner changed from marcin to stephen

Ok, I have done what you suggested. Please review.

comment:7 Changed 3 years ago by stephen

  • Owner changed from stephen to marcin

Review commit 6734961cd616c18403dce24591f929f18cd31b96

All is OK, please merge.

comment:8 Changed 3 years ago by marcin

  • Resolution set to fixed
  • Status changed from reviewing to closed

Merged with commit f1e464558c89a6dc88ab28a25dd14a65fee62578
