Opened 8 months ago

Last modified 4 months ago

#5517 new defect

Posgresql synchronous API calls hang until TCP connections timeout when connectivity is lost

Reported by: tmark Owned by:
Priority: medium Milestone: Outstanding Tasks
Component: database-pgsql Version: git
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DHCP Feature Depending on Ticket:
Estimated Difficulty: 0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

Currently we use synchronous calls for executing prepared statements with all our back ends. Testing with Postgresql shows that when network connectivity is lost calls such as PQExecPrepared() will hang until the TCP time out occurs. Between two VMs hosted on my Ubuntu box this is taking 15 minutes.

There is an asynchronous version of the call, PQsendQueryPrepared, which could be used instead. We should give some consideration to doing DB operations asynchronously as the server is unresponsive during this hang time.

Subtickets

Change History (5)

comment:1 Changed 8 months ago by fdupont

Hum, I am not sure the DHCP server can do many useful things when the connection to the database is broken... And for legal logs it is even worse: broken service can be preferred to broken logs.
BTW we have some timers in not fully implemented DB parameters (2 problems: there are not in the syntax, the unit is not clear e.g. s vs ms between Cassandra and XXXsql). There is a Cassandra parameter ticket where this can (shall!) be addressed.

comment:2 Changed 8 months ago by tmark

This ticket calls for analysis and/or design suggestions prior to any solution attempts.

comment:3 Changed 7 months ago by tomek

  • Component changed from database-all to database-pgsql
  • Milestone changed from Kea-proposed to Kea1.4

As discussed on 2018-02-01 call, moving to 1.4 as medium.

comment:4 Changed 7 months ago by tmark

FYI, Similar testing with MySQL revealed a hang-time of almost 16 minutes. That's troubling.

I also found a way in MySQL to limit the exposure time:

+ int read_timeout = 30;
+ result = mysql_options(mysql_, MYSQL_OPT_READ_TIMEOUT, &read_timeout);
+ if (result != 0) {
+ isc_throw(DbOpenError?, "unable to set read_timeout " <<
+ mysql_error(mysql_));
+ }
+

Using this, the query times out in 30 seconds, if the link is dropped.

Last edited 7 months ago by tmark (previous) (diff)

comment:5 Changed 4 months ago by tomek

  • Milestone changed from Kea1.4 to Outstanding Tasks

Implementing this would require to migrate to async SQL operations. We can do it one day, but that would be dramatic change.

Until we have a very good reason to implement significant time in this, moving to Outstanding for now.

Note: See TracTickets for help on using tickets.