DNS meetings on Wednesday, April 17 2013

BIND 10 logging

@1303 hrs

* Review of existing situation

Stephen explained the various log levels:
(The README in the logging directory describes them: FATAL errors result in termination, etc.)

Stephen: What do we do with suspect queries? We should maybe be warning about them, but at the moment all of that goes out as debug messages.
If someone can send BIND a query that causes a message to be logged, they can use that to mount a DDoS attack.
Shane: Good example is unexpected NOTIFY messages. Right now these are INFO?
Jinmei: We changed to DEBUG.
Jinmei: This topic itself has a lot of issues. Better to separate it from other low-level issues (such as which level of debug we're using).
Stephen: OK. Because of the DDoS risk, everything can go down to DEBUG.
Shane: Talked about example of a message that should be DEBUG.
Tomek: What level should I log protocol violations?
Jinmei: What policy do we use for incoming messages in general? That itself is a big part of the whole log-level question.

Stephen: So at what level do we log protocol violations?
Michal: We use a separate logger?
Tomek: I was thinking of a mechanism that would rate-limit log messages on a per-message basis (using the message id)
Stephen: We can have different logger for protocol violations
Tomek: If we go with one logger for protocol violations, 
Michal: One feature I'm missing is a logger that can log into multiple destinations with different log levels
Stephen: We can log to multiple destinations with log4cplus, but it's not possible with our wrapper
Michal: We cannot specify different log levels for them
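
For illustration only, the feature Michal describes (one logger, several destinations, each with its own level) can be sketched with Python's standard logging module. This is not log4cplus and not the BIND 10 wrapper; the logger name, the message IDs and the two in-memory destinations are hypothetical.

```python
import io
import logging


def build_logger(name, destinations):
    """Create a logger that writes to several (stream, level) destinations."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # accept everything; handlers do the filtering
    logger.propagate = False
    logger.handlers.clear()
    for stream, level in destinations:
        handler = logging.StreamHandler(stream)
        handler.setLevel(level)  # per-destination threshold
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


# Hypothetical destinations: a verbose debug log and a quieter operator log.
debug_log = io.StringIO()
console = io.StringIO()
log = build_logger(
    "example.auth",
    [(debug_log, logging.DEBUG), (console, logging.WARNING)],
)
log.debug("AUTH_NOTIFY_UNEXPECTED from 192.0.2.1")
log.warning("AUTH_ZONE_LOAD_FAILED example.org")
```

Here the debug destination receives both messages and the operator destination receives only the warning.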

Shane: I want to address Tomek's point. We have to strike a balance. We have to decide what we keep and what we throw away.
Tomek: We can have a knob provided to the administrator to rate limit
Michal: We should be able to filter specific log messages (log this one and don't log this one)
Shane: I like the idea of enabling rate limiting to solve DoS attacks
Jeremy: syslog has some rate limiting built into it
Stephen: So it logs the last message, remembers it and keeps a count of how many times it's been repeated
Mukund: Do we keep track of last timestamps on a per-message-id basis?
All: Yeah
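
A minimal sketch of the idea discussed above, assuming one timestamp per message ID and a count of suppressed repeats in the spirit of the syslog behaviour Stephen describes. The class name, the `interval` knob and the injectable clock are hypothetical; nothing here is the design that was later implemented.

```python
import time


class MessageRateLimiter:
    """Allow at most one log line per message ID every `interval` seconds."""

    def __init__(self, interval=1.0, clock=time.monotonic):
        self.interval = interval  # the administrator's knob
        self.clock = clock
        self.last_logged = {}  # message ID -> time the last line was emitted
        self.suppressed = {}   # message ID -> lines dropped since then

    def check(self, message_id):
        """Return (emit, repeats).

        emit is True when the message may be logged now; repeats is the
        number of lines dropped since the previous emitted line.
        """
        now = self.clock()
        last = self.last_logged.get(message_id)
        if last is not None and now - last < self.interval:
            self.suppressed[message_id] = self.suppressed.get(message_id, 0) + 1
            return False, 0
        repeats = self.suppressed.pop(message_id, 0)
        self.last_logged[message_id] = now
        return True, repeats
```

A caller would log the message when `emit` is true and, if `repeats` is non-zero, add a "repeated N times" note. Because the state is kept per message ID, a flood of one message does not silence the others.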

Jeremy: We should be able to ask bindctl about NOTICEs
Michal: If we reload a zone for example, it's async and we never know it failed until we look into the logs

ACTION: We need to design and implement rate limiting

Tomek: Which level should I log "this client got this address" ?
Michal: DEBUG

More logging examples are shown:
DHCPREQUEST
DHCPACK

: This gets used for legal reasons. It can't be DEBUG 0.
Stephen: It can go to INFO, and if there's too much of it, it can go to a different logger. If there's still too much of it, we can disable that specific message.
Jinmei: Very frequent ones aren't considered INFO.
Tomek: When there's a protocol violation, do we use WARNING?
Tomek: Either there's an attack in progress, or some broken device on your network.
Mukund: We can't log every type of protocol violation as WARNINGs.

Tomek: For DHCPREQUEST and DHCPACK, what do we do?
Stephen: Log to a different logger, and enable it by default

Jinmei: I think we should log protocol violations at INFO level.
Stephen: Where we have reasonable control of clients, it is at WARNING. Otherwise it is DEBUG.
: We can have different categories of protocol violations.
Stephen: If it is recoverable, it can be at a DEBUG level.

Mukund: Will separate loggers log to different files?
Stephen: It may or may not. There is a lot of flexibility.

* Exceptions and logging

Stephen: Wherever we use exceptions, the text of the message is hardcoded into the program.
Stephen: Should we log where the exception is generated or someplace else?
Michal: I don't think we should log exceptions by default. It may be worth coupling exceptions with errors.

Stephen: Look at "Parameter not found" for example. It doesn't give you any clue as to what was wrong.

Shane: Stephen set out some principles for logging, including that all messages are logged at one place in the code.
Jinmei: I don't understand what exactly is wrong.
Stephen: We can log line numbers and filenames where the exception happens, in the isc_throw() macro itself
Jinmei: We can provide some more context in the log messages describing the log message, etc. We can help admins that way.
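
As a rough Python analogue of the points above (it is not the C++ isc_throw() macro), a helper can attach the caller's file name and line number to an exception and carry enough context to say what was wrong. `LocatedError`, `throw` and `get_parameter` are hypothetical names.

```python
import inspect


class LocatedError(Exception):
    """Exception that remembers where it was raised."""

    def __init__(self, message, filename, lineno):
        super().__init__(f"{message} [{filename}:{lineno}]")
        self.filename = filename
        self.lineno = lineno


def throw(message):
    """Raise LocatedError tagged with the caller's file name and line number."""
    caller = inspect.currentframe().f_back
    raise LocatedError(message, caller.f_code.co_filename, caller.f_lineno)


def get_parameter(config, name):
    """Look up a parameter, naming it in the error if it is missing."""
    if name not in config:
        # Say which parameter is missing, not just "Parameter not found".
        throw(f"Parameter not found: {name!r}")
    return config[name]
```

Whoever catches the exception can then decide whether and at what level to log it, with the location and the parameter name already in the text.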

* Local message files

Stephen: You can change the message text of the log messages (translations, etc.) by providing a new message file
Michal: You load auth server, override the log messages, then load the datasrc data source: that overwrites your localized messages
Stephen: Yes you would have to load them again.
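
The ordering problem Michal raises can be sketched with a plain dictionary in which later loads replace earlier entries. The class, the message IDs and the texts are hypothetical and only model the behaviour described above.

```python
class MessageDictionary:
    """Maps message IDs to text; a later load replaces earlier entries."""

    def __init__(self):
        self.text = {}

    def load(self, messages):
        self.text.update(messages)

    def get(self, message_id):
        # Fall back to the bare ID when no text is known.
        return self.text.get(message_id, message_id)


DEFAULT_TEXT = "opening data source"
LOCAL_TEXT = "data source opened (site-local wording)"

messages = MessageDictionary()
messages.load({"AUTH_STARTED": "server started"})  # auth server defaults
messages.load({"DATASRC_OPEN": LOCAL_TEXT})        # local message file

# A module loaded later registers its built-in text and replaces the override.
messages.load({"DATASRC_OPEN": DEFAULT_TEXT})
after_module_load = messages.get("DATASRC_OPEN")

# The local message file has to be loaded again to restore the override.
messages.load({"DATASRC_OPEN": LOCAL_TEXT})
after_reload = messages.get("DATASRC_OPEN")
```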

* Should message files contain severity and debug level?

Shane, Stephen: No, because when you read the code, you can follow the severity.
Jinmei: Don't mention the log level in the description.
Jeremy: We log the debug level every time we log. That way an admin will know what log level to set when updating config.
ACTION: Create a ticket for this ^^


Phone home

@1419 hrs

Shane: We don't know who's using our software. We are talking about making the software better, but we don't know who our users are and what their problems are.
Shane: Proposal is to add some way for our software to tell us that it's being used.
Shane: One previous proposal was to do a version check (by Michael Graff).
Shane: We want to get info about how people use our software without being sneaky about it. People are more comfortable allowing phone home now.
Michal: This is something for marketing to figure out, and we just implement it.
Evan: Why is Michael Graff's way not good enough?
Shane: We can get more info such as OS information, etc.
Michal: We should have a way to turn it off.
Shane: We send the information upstream periodically.

There was a discussion about what the default setting should be. One concern is that distros will turn it off by default.

Jinmei: What did Stephen mean by anonymization?
Shane: There are questions about how long to save the data, where we store the data collected, etc.
Stephen: We use a UUID, but we cannot find the IP address.

How do we send the data?
Shane: Michael suggested DNS, but we can use HTTP
Michal: we should send our data encrypted
Stephen: What info do we want to gather?
Mark: OS, architecture, OS version number, the output of uname -a with the hostname string
Jeremy: That gives my personal home directory info
Stephen: What are we going to use the info for?
Mark: Which versions of BIND are being used, which platforms and OS versions they run on, which features are being used, and the configure options.
Tomek: What about prefixes where things are installed?
Michal: What use is that?

Jeremy: We could have some things collected and sent to us automatically, and have another program that sends us opt-in data by asking the user directly.
Jinmei: Do we know what kind of info that Firefox sends?
about:telemetry lists a lot of performance related data that is sent by a browser to Mozilla Corporation.

Shane: So we have a list of possible info. Initially we'll have 3 levels: completely off, version-only, version + some other info (features, arch, distro, compiler info, etc.)
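
A minimal sketch of the three levels Shane lists, assuming a random UUID as the installation identifier and deliberately leaving out the host name that Jeremy objected to. The field names, the `build_report` helper and the version string are hypothetical; no transport (DNS or HTTP) is shown.

```python
import enum
import platform
import uuid


class ReportLevel(enum.Enum):
    OFF = "off"
    VERSION_ONLY = "version-only"
    EXTENDED = "extended"


def build_report(level, version, install_id, features=()):
    """Return the dict that would be sent upstream, or None when off."""
    if level is ReportLevel.OFF:
        return None
    report = {"id": install_id, "version": version}
    if level is ReportLevel.EXTENDED:
        report["arch"] = platform.machine()
        report["os"] = platform.system()
        report["features"] = sorted(features)
    return report


# Random identifier, not derived from the host name or address.
install_id = str(uuid.uuid4())
example = build_report(ReportLevel.VERSION_ONLY, "1.0.0", install_id)
```

Keeping the payload construction in one function makes it easy to show an administrator exactly what each level would send.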
Evan: Knowing someone is not using some feature may be useful, so we can deprecate it
Jinmei: I think we can simply list the compiler version too

Discussion on data retention. What if an attacker feeds us a lot of junk data if there's no way to uniquely identify a person?

Shane: We need a policy on storing this information.

More discussion on collection of data.

Shane: One idea that may reduce the negative feedback may be if we make the data open.

Discussion

Mark: We can give summary information.
Stephen: People will not have problems with aggregate information being published.

Mark: If we want to track trends over time, we'll need data captured weekly to monthly.

What process sends this information?
Stephen: b10-init can send it

Shane: We should design this thing properly. We should ask our customers first, and if we get really negative feedback...
Shane: We can write a blog article, post it to bind-users, try to get some interest and feel for what people want

Databases

@1557 hrs

Stephen: DHCP uses MySQL. We need a common database update utility that will work on multiple database types.
Stephen: We want to increase the number of backends available. It seems logical to have common database update code.
Stephen: Do we want binary data in the database?
Shane: No, as a lot of our users want to see text data.
Evan: We have both, and keep a hook to keep the blob/text synchronized.

Evan: Storing text and converting to binary has a big overhead. No hard numbers.
Stephen: Should we be thinking about using triggers/stored procedures?
Evan: Ultimately, yes it should be a part of our architecture

Some discussion on PowerDNS and its use of a database.

Jinmei: It may make sense to use binary data for DHCP, and not for DNS. What exactly are we discussing?
Stephen: It's on a per-application basis. Do what you want.

Stephen: Do we want to generalize dbutil?
All: Yes

Some discussion on differences between databases (SQL syntax and performance). Using the same SQL
statement may not be the best approach.

Stephen: Another point is versioning guidelines. At what point does a change in schema become significant?
Michal explained about minor and major versions in our current schema.
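
The minutes do not record the rule Michal described, so the following is only a sketch under a common assumption: a major version change is incompatible, while a database with a newer minor version can still be used. The names are hypothetical.

```python
from typing import NamedTuple


class SchemaVersion(NamedTuple):
    major: int
    minor: int


def is_compatible(program: SchemaVersion, database: SchemaVersion) -> bool:
    """Assumed rule: same major, and database minor at least the program's."""
    return program.major == database.major and database.minor >= program.minor


def needs_upgrade(program: SchemaVersion, database: SchemaVersion) -> bool:
    """True when the database schema is older than the program expects."""
    return database < program  # tuples compare major first, then minor
```

Under this assumption a generalized dbutil would run upgrade steps only while `needs_upgrade` is true, and a program would refuse to start when `is_compatible` is false.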

Configuration improvements

@1622 hrs

Shane: Jelte did a braindump of configuration improvements right before he left.
Michal: I don't think there's more messy code than bindctl.

Shane: Had a discussion with Scott about user interface problems. Discussed hiring a contractor for the UI.
Jinmei: The problem with bindctl is not at the UI layer, but the internal representation

Michal: The RESTful API may be the same, even though we change the internals.
Jinmei: At the interface level, we should think about organization of configuration, like the proposal from a few years ago.
Shane: Like the one from Jerry Scharf.

Shane: offline config: I like the plugin idea.
Shane: For custom types, IP addresses would be nice, but I don't think we need extensible types.
Stephen: What's the minimum work that we should do to bring bindctl to an acceptable level?
Shane: We probably want to fix most open bindctl bugs
Michal: If we are going for the minimum, then we get this quality of code because we are rushing.

Discussion that bindctl has a very verbose UI. We need some kind of a front-end.
Last modified on Apr 26, 2013, 6:47:37 AM