Opened 5 years ago

Closed 4 years ago

#2873 closed task (complete)

Test the landlord/RCU approach for resolver multi-threading

Reported by: vorner Owned by: shane
Priority: medium Milestone: bind10-1.2-release-freeze
Component: resolver Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 6 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

On top of #2871 implement the landlord/RCU approach.

If there's a shared resource, a dedicated thread (the landlord) sits on top of it and performs all tasks on it. The shared resources are the cache and the network socket towards the client. There is also a bunch of worker threads.

Each landlord has an input queue and there's one global queue for the workers. Each time a worker or landlord takes a batch of tasks from a queue, it works through them and then appends the results (if any) to other queues, in batches too. Only the queues are protected by mutexes.

RCU means that read access to the cache does not have to be protected, so reads can be done directly from the workers. Only updates need to go through the cache landlord.
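To make the queue discipline concrete, here is a minimal single-threaded sketch (the class and names are illustrative, not actual BIND 10 code) of a queue that exchanges whole batches under one mutex acquisition:

```cpp
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Illustrative sketch only: tasks travel between landlords and workers as
// whole std::vector batches, so the mutex is taken once per batch rather
// than once per task.
template <typename T>
class BatchQueue {
public:
    // Append a whole batch under a single lock acquisition.
    void pushBatch(std::vector<T> batch) {
        std::lock_guard<std::mutex> guard(mutex_);
        batches_.push_back(std::move(batch));
    }

    // Take the oldest batch; returns an empty vector if nothing is queued.
    std::vector<T> popBatch() {
        std::lock_guard<std::mutex> guard(mutex_);
        if (batches_.empty()) {
            return std::vector<T>();
        }
        std::vector<T> batch = std::move(batches_.front());
        batches_.pop_front();
        return batch;
    }

private:
    std::mutex mutex_;
    std::deque<std::vector<T> > batches_;
};
```

The lock cost is then amortized over the whole batch, which is what the experiment with batch lengths is meant to measure.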

More info in https://lists.isc.org/pipermail/bind10-dev/2013-March/004493.html.

Experiment with lengths of the batches and implementations of the queues.

Subtickets

Change History (11)

comment:1 Changed 5 years ago by muks

  • Estimated Difficulty changed from 0 to 6

comment:2 Changed 5 years ago by muks

  • Milestone changed from New Tasks to Sprint-20130423

comment:3 Changed 5 years ago by muks

  • Milestone Sprint-20130423 deleted

Clearing milestone, moving to backlog.

comment:4 Changed 4 years ago by muks

  • Milestone set to Sprint-20130723

comment:5 Changed 4 years ago by vorner

  • Owner set to vorner
  • Status changed from new to assigned

comment:6 Changed 4 years ago by vorner

  • Owner changed from vorner to UnAssigned
  • Status changed from assigned to reviewing

Hello

It is ready for review. The code should be clear enough.

comment:7 Changed 4 years ago by muks

  • Owner changed from UnAssigned to muks

Picking

comment:8 follow-up: Changed 4 years ago by muks

  • Owner changed from muks to vorner

Hi Michal

Here are my comments:

  • This seems to be an unedited copy in landlord.h:
    /// \brief Naive implementation of resolver for the benchmark
    ///
    /// This is here mostly to show how to implement the other benchmark
    /// implementations. Look at the code inside how to use the fake
    /// resolution.
    
  • In NaiveResolver::run(), use the prefix ++ operator in for loops, per the coding guidelines (it makes no practical difference here, but keeps things consistent). The same applies to the for loops in main().
  • It seems this is not written to be run during check, as it is a benchmark. When running resolver-bench manually, I was confused when the last iterations took many seconds and I wondered what it was up to. If we want these experiments to live on in the tree, please section the output with headings (Landlord, Naive, etc.) so we know which implementation is running as the output is printed.
  • While it uses a landlord, this does not use RCU; it uses a structure lock for interlocking, which runs everything in lockstep. In an RCU implementation, pop() would not wait for the body of push() to complete, and vice versa. A true RCU implementation's benchmark results would differ if there is a benefit to using it.
  • This is benchmark code, so I don't know how much nitpicking is valid, but I don't like how the return value of pop() is used to indicate shutdown. It would be better to use an isRunning() method in the while loops and in other places.
  • What happens in read_iface() if less than read_batch_size_ queries are received? Is this written so because it is a benchmark, and a real implementation will timeout and push()/flush?
  • Is a second template parameter necessary in LandlordResolver::Queue? Why not simply have the following typedef inside class LandlordResolver::Queue:
    typedef std::vector<T> Container;
    
  • Can't downstream_queue_ be made a stack variable between checkUpstream() and completedUpstream()? Its use and lifetime seem to be confined to the checkUpstream() codepath.
  • Especially with how the condition variables are used in this branch, I propose creating another ticket that does the following. This will make such code cleaner without having to worry about the relationship between the mutex and the condition variable. The downside is we don't support wait() on unrelated mutexes, but I don't think we have such a usecase.
    • Make Locker constructor take a Lockable (interface) object that implements lock(), tryLock() and unlock() methods. Move Locker out of the Mutex:: class.
    • Make Mutex and CondVar implement the Lockable interface.
    • Like the Python API, make the CondVar constructor take a Mutex argument and remove it from the wait() method's arguments. If no mutex is passed, CondVar creates an internal Mutex. The Lockable implementation in CondVar wraps around the Mutex methods.
    • Remove mutex_ from LandlordResolver::Queue. The code in Queue methods create Locker around the CondVar.
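To make the proposal concrete, a minimal sketch of what this could look like (hypothetical names and shapes, written against std::mutex rather than our thread library):

```cpp
#include <mutex>

// Hypothetical sketch of the proposed refactoring; not the existing
// BIND 10 thread API.
class Lockable {
public:
    virtual ~Lockable() {}
    virtual void lock() = 0;
    virtual bool tryLock() = 0;
    virtual void unlock() = 0;
};

// Mutex implements Lockable by delegating to a real mutex.
class Mutex : public Lockable {
public:
    virtual void lock() { mutex_.lock(); }
    virtual bool tryLock() { return mutex_.try_lock(); }
    virtual void unlock() { mutex_.unlock(); }
private:
    std::mutex mutex_;
};

// CondVar owns its Mutex and implements Lockable by forwarding to it,
// so a Locker can be created directly around the CondVar.
// (wait()/signal() are omitted from this sketch.)
class CondVar : public Lockable {
public:
    virtual void lock() { mutex_.lock(); }
    virtual bool tryLock() { return mutex_.tryLock(); }
    virtual void unlock() { mutex_.unlock(); }
private:
    Mutex mutex_;
};

// RAII Locker that works on any Lockable, moved out of the Mutex class.
class Locker {
public:
    explicit Locker(Lockable& lockable) : lockable_(lockable) {
        lockable_.lock();
    }
    ~Locker() { lockable_.unlock(); }
private:
    Lockable& lockable_;
};
```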

comment:9 in reply to: ↑ 8 Changed 4 years ago by vorner

  • Owner changed from vorner to muks

Hello

Replying to muks:

  • While it uses a landlord, this does not use RCU; it uses a structure lock for interlocking, which runs everything in lockstep. In an RCU implementation, pop() would not wait for the body of push() to complete, and vice versa. A true RCU implementation's benchmark results would differ if there is a benefit to using it.

Actually, the RCU was never meant for the queues, but for the cache. Notice that the cache read is done in the worker thread (without a lock); only writes are pushed to a landlord. This simulates RCU on the cache.

I don't think it makes sense to put RCU onto the queues: you need to protect RCU against concurrent writes (with a lock or something) anyway, and there's non-zero overhead. And since both push and pop are effectively write operations, RCU doesn't make much sense there.

Also, the term "in lockstep" seems misleading. Push and pop are relatively fast operations compared with the work done on the whole batch, so most of the time the queue is unlocked and the threads meeting there run in parallel. Only occasionally does one of them drop a batch in (while the other is probably running at that moment), and later the other picks it up.

  • This is benchmark code, so I don't know how much nitpicking is valid, but I don't like how the return value of pop() is used to indicate shutdown. It would be better to use an isRunning() method in the while loops and in other places.

That would be wrong: the running flag is out of band. It means the sender does not want to push anything more in, but it does not mean the queue is empty.
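To illustrate the difference (a single-threaded sketch with made-up names): close() only announces that no more pushes are coming, while pop() keeps succeeding until the queue is drained:

```cpp
#include <deque>

// Illustrative only: the "running" flag is out of band, so pop() must keep
// returning queued items even after the sender has closed the queue.
template <typename T>
class ClosableQueue {
public:
    ClosableQueue() : closed_(false) {}

    void push(const T& item) { items_.push_back(item); }

    // The sender announces it will push nothing more.
    void close() { closed_ = true; }

    // Returns false only when the queue is both closed and drained; a real
    // implementation would block on an empty-but-open queue instead.
    bool pop(T& out) {
        if (items_.empty()) {
            return false;
        }
        out = items_.front();
        items_.pop_front();
        return true;
    }

    bool closed() const { return closed_; }

private:
    bool closed_;
    std::deque<T> items_;
};
```

An isRunning() check in the consumer loop would stop too early and drop whatever is still sitting in the queue.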

  • What happens in read_iface() if less than read_batch_size_ queries are received? Is this written so because it is a benchmark, and a real implementation will timeout and push()/flush?

Yes, this was left out for simplicity of the benchmark. We would allow smaller batches if the network wasn't busy enough in a real resolver. Note that if there are not enough queries to create full batches, the resolver is not fully busy, so performance matters less.

  • Is a second template parameter necessary in LandlordResolver::Queue? Why not simply have the following typedef inside class LandlordResolver::Queue:
    typedef std::vector<T> Container;
    

It's not necessary, but I thought someone might get an idea for a different, cool data structure for the queue (like a list of small vectors, or something), so we could reuse the code for testing.

If you think it is confusing, I can remove it.

  • Can't downstream_queue_ be made a stack variable between checkUpstream() and completedUpstream()? Its use and lifetime seem to be confined to the checkUpstream() codepath.

It can't. completedUpstream() may be called from read_iface() too, and it can run from a future iteration of checkUpstream(). The query remembers the address for the time it is "out there", and a stack variable would not survive that long.

  • Especially with how the condition variables are used in this branch, I propose creating another ticket that does the following. This will make such code cleaner without having to worry about the relationship between the mutex and the condition variable. The downside is we don't support wait() on unrelated mutexes, but I don't think we have such a usecase.

I think there just can't be any such usecase that isn't broken.

  • Make Locker constructor take a Lockable (interface) object that implements lock(), tryLock() and unlock() methods. Move Locker out of the Mutex:: class.
  • Make Mutex and CondVar implement the Lockable interface.

I don't really like the idea of Lockable. Locking a CondVar is counter-intuitive (especially when multiple condition variables are constructed with the same Mutex).

I think it would be better to have a getMutex() method on the CondVar, so it is clear a real mutex is involved.
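For illustration, the getMutex() variant could look roughly like this (sketched with standard primitives, not our actual thread library):

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical sketch: the CondVar owns its Mutex and exposes it via
// getMutex(), so callers clearly lock a real mutex, and the condition
// variable can never be paired with an unrelated one.
class CondVar {
public:
    std::mutex& getMutex() { return mutex_; }

    // wait() always uses the owned mutex; the caller must hold the lock.
    void wait(std::unique_lock<std::mutex>& lock) { cond_.wait(lock); }

    void signal() { cond_.notify_one(); }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
};
```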

I'll create a ticket for that (at merge time), although I believe this is a very minor detail.

comment:10 Changed 4 years ago by shane

  • Owner changed from muks to shane
  • Status changed from reviewing to accepted

comment:11 Changed 4 years ago by shane

  • Resolution set to complete
  • Status changed from accepted to closed

I've looked through this and other tickets and used them as input for an architecture proposal that was sent to Comcast. This serves as input to our design for the resolver.
