Opened 5 years ago

Closed 4 years ago

#2874 closed task (complete)

Test the Coroutines/RCU approach for resolver multi-threading

Reported by: vorner
Owned by: shane
Priority: medium
Milestone: bind10-1.2-release-freeze
Component: resolver
Version:
Keywords:
Cc:
CVSS Scoring:
Parent Tickets:
Sensitive: no
Defect Severity: N/A
Sub-Project: DNS
Feature Depending on Ticket:
Estimated Difficulty: 7
Add Hours to Ticket: 0
Total Hours: 0
Internal?: no

Description

On top of #2871 implement the coroutines/RCU approach and measure its performance.

Each query is a separate coroutine with its own stack. Multiple coroutines live in a single thread, and they never move from one thread to another. If a query waits for something external (e.g. network I/O), it yields and another coroutine can run, if any is ready. As many things as possible are thread-private so that they don't have to be locked; for example, each thread can have its own copy of the network sockets, logging, and L1 cache.
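The scheme above can be sketched roughly as follows. This is only an illustration, not the code from the experiment: POSIX `ucontext` (Linux/glibc assumed) stands in for Boost.Coroutine so that the sketch has no external dependencies, the "network wait" is simulated by a plain yield, and every name (`Query`, `query_body`, `run_queries`, ...) is hypothetical.

```cpp
#include <ucontext.h>

#include <string>
#include <vector>

// One query = one coroutine with its own stack (all names hypothetical).
struct Query {
    ucontext_t ctx;
    std::vector<char> stack;
    int id;
    bool done;
};

// Thread-private scheduler state: coroutines never migrate between
// threads, so none of this needs a lock.
thread_local ucontext_t scheduler_ctx;
thread_local Query* current = nullptr;
thread_local std::vector<std::string>* trace = nullptr;

// Give control back to the scheduler until it resumes us.
void yield_to_scheduler() {
    swapcontext(&current->ctx, &scheduler_ctx);
}

// Body of every query coroutine. From the inside it reads as
// straight-line code; the yield marks the simulated wait for network I/O.
void query_body() {
    Query* self = current;
    trace->push_back("q" + std::to_string(self->id) + ":send");
    yield_to_scheduler();  // "waiting for the answer"
    trace->push_back("q" + std::to_string(self->id) + ":answer");
    self->done = true;
    // Returning switches to uc_link, i.e. back to the scheduler.
}

// Round-robin scheduler: resume each unfinished query until all are done.
std::vector<std::string> run_queries(int n) {
    std::vector<std::string> log;
    trace = &log;
    std::vector<Query> queries(n);  // never resized, so contexts stay put
    for (int i = 0; i < n; ++i) {
        Query& q = queries[i];
        q.id = i;
        q.done = false;
        q.stack.resize(64 * 1024);
        getcontext(&q.ctx);
        q.ctx.uc_stack.ss_sp = q.stack.data();
        q.ctx.uc_stack.ss_size = q.stack.size();
        q.ctx.uc_link = &scheduler_ctx;
        makecontext(&q.ctx, query_body, 0);
    }
    bool pending = true;
    while (pending) {
        pending = false;
        for (Query& q : queries) {
            if (q.done) continue;
            current = &q;
            swapcontext(&scheduler_ctx, &q.ctx);  // run until yield or return
            pending = true;
        }
    }
    current = nullptr;
    trace = nullptr;
    return log;
}
```

Calling `run_queries(2)` interleaves the two queries: both send before either handles its answer. A real scheduler would resume a coroutine only when its socket becomes readable rather than in strict round-robin order.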

Things that need to be locked are locked by mutexes in the usual way. The only exception is the cache, which would be handled by RCU: read accesses need no locking, and only updates take a lock. Test whether we need multiple locks for parts of the cache or whether one is enough.
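To make the read/update split concrete, here is a minimal RCU-style sketch. It is an approximation under stated assumptions, not real RCU and not the code from the experiment: it uses copy-on-write with the standard `std::atomic_load`/`std::atomic_store` overloads for `std::shared_ptr` instead of a dedicated RCU library, and those overloads may themselves lock internally in some standard library implementations. The class name `RcuStyleCache` and its methods are hypothetical.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// RCU-style cache sketch: readers work on an immutable snapshot,
// writers publish a modified copy under a single update lock.
class RcuStyleCache {
public:
    using Map = std::map<std::string, std::string>;

    RcuStyleCache() : current_(std::make_shared<const Map>()) {}

    // Read side: grab the current snapshot; the cache mutex is never taken.
    bool lookup(const std::string& name, std::string& out) const {
        std::shared_ptr<const Map> snap = std::atomic_load(&current_);
        Map::const_iterator it = snap->find(name);
        if (it == snap->end()) return false;
        out = it->second;
        return true;
    }

    // Update side: serialize writers, copy, modify, publish.
    // Old snapshots stay valid until their last reader drops them.
    void update(const std::string& name, const std::string& value) {
        std::lock_guard<std::mutex> guard(write_lock_);
        std::shared_ptr<Map> next =
            std::make_shared<Map>(*std::atomic_load(&current_));
        (*next)[name] = value;
        std::atomic_store(&current_,
                          std::shared_ptr<const Map>(std::move(next)));
    }

    // Expose a snapshot, e.g. for a reader that does several lookups.
    std::shared_ptr<const Map> snapshot() const {
        return std::atomic_load(&current_);
    }

private:
    std::shared_ptr<const Map> current_;
    std::mutex write_lock_;  // the "one lock" variant
};
```

Copying the whole map on every update is expensive for a large cache. The "multiple locks" variant would split the cache into several such objects, for example by hashing the name, each with its own pointer and update lock.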

Also, check if we need to „recycle“ used coroutines.
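One possible reading of "recycling" is reusing the stacks of finished coroutines instead of allocating a fresh one per query. The sketch below shows that idea only; whether it pays off is exactly what the measurement should tell. The `StackPool` class is hypothetical and assumes one pool per thread, so it needs no locking.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Thread-private pool of coroutine stacks (hypothetical helper).
class StackPool {
public:
    explicit StackPool(std::size_t stack_size)
        : stack_size_(stack_size), allocated_(0) {}

    // Hand out a recycled stack if one is free, otherwise allocate.
    std::unique_ptr<char[]> acquire() {
        if (!free_.empty()) {
            std::unique_ptr<char[]> stack = std::move(free_.back());
            free_.pop_back();
            return stack;
        }
        ++allocated_;
        return std::unique_ptr<char[]>(new char[stack_size_]);
    }

    // Return the stack of a finished coroutine for later reuse.
    void release(std::unique_ptr<char[]> stack) {
        free_.push_back(std::move(stack));
    }

    // Number of stacks actually allocated so far.
    std::size_t allocated() const { return allocated_; }

private:
    std::size_t stack_size_;
    std::size_t allocated_;
    std::vector<std::unique_ptr<char[]> > free_;
};
```

Comparing `allocated()` with the number of queries served gives a rough idea of how much allocation the pool avoids.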

It was suggested to try out the Boost.Coroutine library for the experiment, as we already use Boost.

More detail can be found in https://lists.isc.org/pipermail/bind10-dev/2013-March/004493.html.

Change History (6)

comment:1 Changed 5 years ago by muks

  • Estimated Difficulty changed from 0 to 7

comment:2 Changed 4 years ago by muks

  • Milestone changed from New Tasks to Sprint-20130723

comment:3 Changed 4 years ago by vorner

  • Owner set to vorner
  • Status changed from new to accepted

comment:4 Changed 4 years ago by vorner

  • Owner changed from vorner to UnAssigned
  • Status changed from accepted to reviewing

Hello

It is ready for review. There are two things to mention:

The Boost coroutine library is not available on all systems, and it needs the compiled Boost libraries (headers-only is not enough). I added a simple check for Boost, but I'm not sure whether it is done properly or when it will break. I would not like the compilation to break just because of the benchmark, so I think we need to either do the checks properly (I did not succeed in learning how to use autoconf properly; I must have some mental block there) or provide an explicit switch to enable the resolver benchmarks, since most people don't want them anyway. Yet another option would be to merge these not to master but to a separate branch, but I don't like that much.

The other thing: I hope it is not too hard to read. It seems better with these coroutines than with the state-less ones, but it is still rather mind-bending to follow them through the scheduler. They look OK when viewed only from inside the coroutine, so all the complexity is in the scheduler.

comment:5 Changed 4 years ago by shane

  • Owner changed from UnAssigned to shane
  • Status changed from reviewing to accepted

comment:6 Changed 4 years ago by shane

  • Resolution set to complete
  • Status changed from accepted to closed

I've looked through this and other tickets and used them as input for an architecture proposal, which was sent to Comcast. This is used as input to our design for the resolver.
