Proposed plan for year 3

RIPE meeting BIND 10 stuff

  • NSCP discussion
  • AFNIC BIND 10 time possible
  • BIND 10 live demo
  • Interest in PostgreSQL data source
  • Interest in using BIND 10 with a data source that calls scripts to build answers
  • NSD 4 coming out this year


Michal: Ticket for next sprint? Why does it fail?

Stephen: Related to the next topic "handling build errors". Who chases up an error? Sometimes errors are fixed directly in master... is that the right thing to do?

Michal: If it doesn't go into master we don't know if it is fixed.

Shane: We've always had the desire to do builds on other branches.

Shane: Are we masking problems with virtual machines?

Jeremy: These also happen on real machines. I sent an e-mail 20 minutes ago.

Jinmei: My concern is why we haven't done anything about this, even though we have seen it for quite a long period. My original impression was that this was a clock issue on virtual machines, and I suggested applying a filter. That is still better than letting it keep happening. Or someone can delve into the details of the failure, but no one has done that.

Michal: I guess because we had some timing issues on virtual machines before and these were filtered out. And everybody looked and said "another timing issue".

Stephen: It happens, and it's nobody's responsibility. That's why I raised the issue of handling build errors.

Jeremy: In this specific case, there were multiple timer tests that were added about the same time. The first failed frequently, so I added a filter. Then I did not add a filter to the 2nd one. Then I noticed we have the exact same failure on real hardware, so it does not make sense to keep adding filters.

Jelte: I think it is slow machines, not virtual machines.

Michal: It should not need a fast machine.

Jelte: In my case the tests came in the wrong order, so it had to be seconds.

Michal: Maybe we have some race conditions?

Jelte: The race conditions are in the tests, not the code.

Shane: Okay, we need a ticket for this issue. Then talk about build issues in general...

Jeremy: We have 6 filters in place, so maybe 6 tickets.

Shane: I meant figuring out an alternate test strategy and hopefully removing filters.
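
One possible alternate strategy, sketched here purely as an illustration: timer tests drive an injected clock instead of sleeping on the wall clock, so the order in which timers fire does not depend on how fast the machine is. The names `FakeClock` and `TimerQueue` are hypothetical and are not taken from the BIND 10 code.

```python
import heapq


class FakeClock:
    """Manually advanced clock; tests never wait on real time."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


class TimerQueue:
    """Fires callbacks in deadline order, using whatever clock it is given."""

    def __init__(self, clock):
        self._clock = clock
        self._heap = []
        self._seq = 0  # tie-breaker so equal deadlines keep insertion order

    def add(self, delay, callback):
        heapq.heappush(self._heap, (self._clock.now + delay, self._seq, callback))
        self._seq += 1

    def run_due(self):
        """Run every timer whose deadline has passed; return how many fired."""
        fired = 0
        while self._heap and self._heap[0][0] <= self._clock.now:
            _, _, callback = heapq.heappop(self._heap)
            callback()
            fired += 1
        return fired


def demo():
    clock = FakeClock()
    timers = TimerQueue(clock)
    order = []
    timers.add(2.0, lambda: order.append("slow"))
    timers.add(1.0, lambda: order.append("fast"))
    clock.advance(3.0)  # both deadlines pass at once, as on a slow machine
    timers.run_due()
    return order
```

With this shape, a test asserts on the order returned by `demo()` rather than on elapsed seconds, which is the kind of check that would not need a filter.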

Handling Build Errors

Stephen: If you can recognize it, maybe you can pick it up. But should fixes be formally reviewed? Posted into Jabber... should we do it as a branch?

Michal: If it's put into Jabber then it is reviewed. Creating a branch can be bigger than the task itself.

Stephen: Sometimes it's useful to see the context of a change.

Stephen: The other thing is trying to avoid this in the first place. If you have a problem it tends to occur on a machine you do not have access to. Should we be able to build branches on the test machines?

Jeremy: Yes. And this sprint has a task for testing the new logger - log4cplus. As part of that I was going to instrument the build farm within the next week.

Jelte: It will not fix all the possible occurrences, since it can be a mix of 2 branches, but it should fix most of them.

Jeremy: About logins... you can ask for them. At least one of the machines is my personal machine so you can't log in easily. The goal was to get outside people to run machines on the build farm, so you would not be able to log into them either.

Stephen: Did we have a list of operating systems that we guaranteed to support?

Jeremy: There should be something in the wiki. What is supported is in the guide. It's not complete, since we have added support for more systems since it was last maintained.

Stephen: As we release code, we cannot support some machines. But if it fails on a machine that we do support, we can log in and do some debugging on it to find out why it fails.

Jinmei: There are several issues regarding how we handle build failures.

  • The size matters. In many cases fixing the regression is trivial, for example a 2-line patch.
  • We already have an informal rule that if the change is quite trivial we can use the developer Jabber for review and commit directly.
  • Unlike normal tasks or bugs, a build failure is more urgent. Even if it is sub-optimal to skip creating a ticket and reviewing it, it is a reasonable compromise to fix it as quickly as possible. Creating a ticket may be a good idea but I would rather fix it sooner than later.

Jelte: +1

Shane: If a ticket causes build breakage, maybe we should re-open the ticket and put the fixes on that?

Jeremy: We have 3 existing failures:

  • Timing
  • Botan in Sunstudio
  • A new one introduced today with log compiler

b10-stats-httpd design

Jeremy: A query returns all stats. In BIND 9 this may return 0.25 MByte. For both BIND 9 and BIND 10 you should be able to query on specific stats. This can be a RESTful interface. Should be simple.

Michal: Why not just write the location, like /zone/$name_of_zone? I think this can be added there quite easily, but it is not done yet.

Shane: Sounds like it should be a separate ticket.
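
A minimal sketch of the path-based selection discussed above, assuming the statistics are held as a nested dictionary. The `select_stats` function, the `zones` key and the counter values are all hypothetical and made up for illustration; this is not how b10-stats-httpd is implemented.

```python
def select_stats(path, all_stats):
    """Return the part of all_stats addressed by a URL path.

    "/"             -> everything (one query returns all stats)
    "/zone/<name>"  -> only the counters for that zone
    Returns None when the path matches nothing (a caller could map this to 404).
    """
    parts = [p for p in path.split("/") if p]
    if not parts:
        return all_stats
    if len(parts) == 2 and parts[0] == "zone":
        return all_stats.get("zones", {}).get(parts[1])
    return None


# Invented sample data, only to show the shape of a request and its reply.
SAMPLE = {
    "queries_total": 127,
    "zones": {
        "example.com": {"queries": 120},
        "example.org": {"queries": 7},
    },
}
```

Asking for `/zone/example.org` then returns only that zone's counters, so a client interested in one zone does not have to download the whole data set.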

stats configuration and agnostic stats daemon

(see Jelte's comments in #719).

Jelte: Currently stats are passed like configuration values. #719 addresses that but there are some design issues. Kambe agreed, but it is a small improvement, so should we merge this and go on with a new ticket, or start over?

Jinmei: It will depend on the details. But my impression is that we should merge it anyway and see the effects of it while thinking about further improvement.

Jinmei: Are there other modules that will depend on this change? Should we change the implementation of the cfgmgr or something?

Jelte: Not in this change, but we will in the proposed one.

Botan incompatibilities

Jeremy reports: "Version we support is not in default install for Red Hat nor Debian (and Pkgsrc when needing clang++). Only development version installs on our Sun system (I gave feedback to Botan) but it is not backward compatible (yet?). And also still have problems with Sunstudio builds of Botan and for BIND 10 using Botan. (I have been giving feedback to Botan developers.)"

Jeremy: We're stuck at a specific version that is not easily available everywhere.

Michal: So we should be more flexible?

Jelte: We can add pre-compilation checks for the API used. But as soon as you go past 2 or 3 versions this becomes a big burden. For 1 or 2 that's okay though.

Jinmei: We have a last resort plan to disable it completely.

Michal: Some people will need Botan though...

Stephen: Isn't this a general problem with dependencies? We will always find a platform that is not supported. We had this for log4cxx. We backed out the change, and we're still coping with that.

Jelte: Yeah I'm more inclined to support 2 or 3 versions of the API in this case.

Stephen: But what is the minimum version we will & can support?

Jeremy: To be clear, log4cxx brought in extra dependencies.

Stephen: If the dependencies worked on all the systems, it would not have been a problem.

Stephen: This is validating a decision that we made to do a series of tests to make sure it works on all of our platforms. That's something we have for log4cplus.

Larissa: I have the task to figure out what platforms we support (for another project). I can share that.

Jeremy: We could use the definition "everywhere that BIND 9 and ISC DHCP is supported".

Shane: I don't want to do that because BIND 9 is old!!!

Larissa: We'll drop some old systems.

Shane: Is the plan to use autoconf for supporting multiple versions?

Jelte: I can easily support 1.8 and 1.9. However if we want to support 1.7 or 1.6 then I'm getting reluctant about it.

Jinmei: Is it worth considering using a different crypto library?

Michal: I don't know if it would be less work. Any library will have some issues.

Stephen: Any word when 1.8 will be supported on all the systems we're interested in?

Jeremy: Not specifically.

Larissa: We still have a huge user base on Solaris.

Stephen: If the time scales are a few months, we can disable TSIG on the systems that don't support Botan 1.8.
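
The policy being discussed (support the 1.8 and 1.9 API series, and switch TSIG off elsewhere) can be written down as a small decision function. This is only an illustration of the rule: in the real build it would be a configure-time check, and the names `SUPPORTED_API_SERIES`, `api_series` and `tsig_available` are hypothetical.

```python
# Assumption taken from the discussion: only the 1.8 and 1.9 series are supported.
SUPPORTED_API_SERIES = {(1, 8), (1, 9)}


def api_series(version):
    """Turn a version string such as "1.8.11" into its (major, minor) pair."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def tsig_available(botan_version):
    """True when the installed Botan version is in a supported API series.

    botan_version is None when Botan is not installed at all.
    """
    if botan_version is None:
        return False
    return api_series(botan_version) in SUPPORTED_API_SERIES
```

Each entry added to the supported set stands for one more API variant to maintain, which is the burden Jelte describes for going back to 1.7 or 1.6.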

Jeremy: I know 2 failures for 1.9 so I can create tickets for that.

Jelte: Suggest making 1 ticket.

Review of current task list

Stephen: Have a number of tasks assigned, but only 7 tasks not assigned to anyone. Wanted to make sure we don't have people missing work to do.

Jinmei: Many things depend on configuration of TSIG, #811, so it would be nice if we could make progress on that.

Jelte: I was waiting on related tickets to be merged.

introduction to some new test utilities

  • util/unittests/testdata
  • util/unittests/textdata
  • util/unittests/newhook
  • testutils/dnsmessage_test
  • lib/dns/test/testdata/

(TODO on mailing list)

Last modified on May 10, 2011, 8:14:51 PM