wiki:DataSourceDesign

Summary

This is a proposal for a revised data source design. It (hopefully)

  • supports both in-memory and database-driven data sources reasonably, with less redundancy
  • supports all major operations we'll need (for year 3 goals): basic lookups, intra-zone iteration, updates
  • is simpler and easier to understand and test
  • makes it easier to support other types of database backends than SQLite3 (there's even a sample MySQL data source in an experimental branch)
  • runs at least as fast as the current design/implementation

Issues with the design/implementation of the current data source

  • The query logic (DataSrc::doQuery() and the subroutines called from it) mixes DNS-specific logic with tricks specific to database-driven data sources (like running a label-by-label loop to identify a best match). This makes it harder to share the logic with more efficient backends, specifically the in-memory data source. It also makes the code more difficult to test: as we can see in datasrc_unittest, we need to prepare and use a DNS message just to test the data source behavior. That's not impossible, but in general testing is harder when it requires more prerequisite setup and dependencies. And, IMO, the resulting implementation is quite complicated and difficult to understand (at least to me, it was difficult to be sure whether a fix to a problem in the DataSrc class was sufficient for bugs like #851).
  • The specific lowest-level data source (right now it's sqlite3) also handles much of the DNS-specific logic (e.g., it knows what's expected if it finds a CNAME). This makes it harder to support other types of database backends: if we want to do the same thing, we need to implement many things in addition to the database-specific code, and on the other hand, there's no sample that is dumb (only cares about database-specific things) and can be used as a template. My understanding of why the sqlite3 backend is so "intelligent" is that it was intended to be an example of a specialized, (relatively) high-performance backend. But, as we all know now, it's actually quite slow, and we'll need other mechanisms such as hot spot caching to meet the requirements of performance-sensitive operators anyway. So, right now, we only have a backend that is quite complicated but not sufficiently fast.
  • (Although this may be fixable within the current design) it's not clear what level of validity of data/function parameters we can assume in specific parts of the class hierarchies. For example, it's not entirely clear whether we can assume a data source is used only for a single class, and since RRClass is passed to some Sqlite3DataSrc methods, we need to worry about class mismatches or class ANY query handling at this level. It's also not clear who is responsible for validating data from the actual database backend. The current sqlite3 data source does it itself (perhaps implicitly, by converting strings to various libdns++ objects), but if we also allow dumber data sources, there may have to be a middle layer that validates data passed from the lower-level data source (which ultimately comes from the underlying database). Such things are not clear enough.

Proposed new design

Design Goals

  • Keep the actual DB backend as simple (dumb) as possible, so that we can support other types of databases more easily.
  • Keep individual (C++, and consequently also Python) classes simpler, focusing on a smaller, specific set of responsibilities. In particular, avoid mixing DNS-specific logic and database-specific logic in the same class (as much as possible). This will make these classes easier to test and (hopefully) to understand.
  • Make them work for both in-memory and database-driven data sources. The internal implementation details of these would be very different, so reasonable abstraction at a higher level would be necessary.
  • (As a subgoal of the design) clarify the unclear points stated in the third bullet of the "Issues" section above.

(Some level of) Details

Introduce a new DataSourceClient class. This is a revised version of the current "DataSrc" class, but it has a more limited focus and delegates some specific responsibilities (especially operations on a specific zone) to other classes (described below). In general, methods of DataSourceClient act as factories for these other classes. (It may eventually support adding/deleting zones, but this initial design doesn't.)

DataSourceClient is an abstract base class that only defines the interfaces. Specific derived classes correspond to particular types of data sources such as in-memory or database-driven ones.
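
The following is a minimal C++ sketch of the intended interface. Only findZone() and reopen() are named in this proposal; getIterator(), the namespace, and the use of boost::shared_ptr are assumptions for illustration.

{{{
// Minimal sketch of the proposed abstract client interface.  Method names
// other than findZone() and reopen() are hypothetical.
#include <boost/shared_ptr.hpp>
#include <dns/name.h>

namespace isc {
namespace datasrc {

class ZoneFinder;               // per-zone lookups (described below)
class ZoneIterator;             // whole-zone iteration (described below)

class DataSourceClient {
public:
    virtual ~DataSourceClient() {}

    // Factory method: return a ZoneFinder for the zone of this data
    // source that best matches 'name'.
    virtual boost::shared_ptr<ZoneFinder> findZone(
        const isc::dns::Name& name) const = 0;

    // Assumed factory method: return an iterator over all RRs of the
    // given zone (e.g. for b10-xfrout).
    virtual boost::shared_ptr<ZoneIterator> getIterator(
        const isc::dns::Name& zone_name) const = 0;

    // Intended for (possibly incremental) reloading of zone content,
    // mainly for the in-memory backend (see the notes below).
    virtual void reopen() = 0;
};

} // namespace datasrc
} // namespace isc
}}}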

We define a derived class DataBaseDataSourceClient to represent a data source client with a database backend (SQLite3, MySQL, PostgreSQL, LDAP, etc.). A DataBaseDataSourceClient object contains an object of class DataBaseConnection, an abstract base class that represents a connection to a particular instance of a database server (or a library backend such as SQLite3).

The basic intent here is that DataBaseConnection is quite dumb and only encapsulates a straightforward mapping from text-based database requests to actual database queries for a specific type of database; in particular, its implementations are not supposed to have any DNS-specific knowledge (even though the proposed method names indicate DNS-related operations).
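
For illustration, here is a minimal C++ sketch of such a dumb interface. The method names and the string-based record representation are assumptions; the point is that nothing at this level requires DNS knowledge.

{{{
#include <string>
#include <vector>

namespace isc {
namespace datasrc {

// "Dumb" database abstraction: purely textual requests, no DNS logic.
class DataBaseConnection {
public:
    virtual ~DataBaseConnection() {}

    // Look up a zone by its textual origin name; on success return true
    // and set 'zone_id' to the backend's identifier for that zone.
    virtual bool getZone(const std::string& zone_name, int& zone_id) = 0;

    // Fill 'records' with the (type, TTL, RDATA) string tuples stored
    // for the exact name 'name' in zone 'zone_id'.  Longest-match,
    // wildcard or CNAME logic is the caller's business, not the
    // backend's.
    virtual void getRecords(int zone_id, const std::string& name,
                            std::vector<std::vector<std::string> >&
                                records) = 0;
};

} // namespace datasrc
} // namespace isc
}}}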

We also define a factory class (tentatively named DataSourceClientCreator) for DataSourceClient. We define a specific derived class of DataSourceClientCreator for each specific derived class of DataSourceClient (DataBaseDataSourceClientCreator for DataBaseDataSourceClient, etc.) to create an instance of that specific derived class of DataSourceClient from a set of configuration parameters. This way, top-level applications such as b10-auth don't have to care about the details of the actual data source and can rely only on the abstract interfaces. (Note: a simpler factory function may suffice here.)
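
A hypothetical usage sketch follows; the create() signature and its parameters are assumptions, while the class names follow this proposal.

{{{
#include <boost/shared_ptr.hpp>
#include <string>

void
setupDataSource() {
    // In reality the backend type and its parameters would come from the
    // bind10 configuration; they are hard-coded here for brevity.
    DataBaseDataSourceClientCreator creator;
    boost::shared_ptr<DataSourceClient> client(
        creator.create("sqlite3", "/var/lib/bind10/zone.sqlite3"));

    // From here on, the application (e.g. b10-auth) uses only the
    // abstract DataSourceClient interface, whatever the underlying
    // database is.
}
}}}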

If the application uses multiple threads, each thread will need to create and use a separate DataSourceClient. This is because some database backends don't allow multiple threads to share the same connection to the database.

We also assume an instance of DataSourceClient handles only a single RR class (even if the underlying database contains records for multiple RR classes). Likewise, (when we support views) a DataSourceClient is expected to handle only a single view.

The UML(-like) diagram below summarizes the relationship between these classes.

An application will first create a set of DataSourceClient objects based on local configuration that specifies the available types of data sources, and will then use their factory methods to create other class objects for specific types of operations within the data source.

The ZoneFinder class is one of these classes. A ZoneFinder object is created by DataSourceClient::findZone() and is used to perform normal lookups (for a pair of name and RR type). ZoneFinder is an abstract base class, and its derived classes correspond to specific data sources. For data sources using database backends, we define the DataBaseZoneFinder class. Like DataBaseDataSourceClient, an instance of this class contains a DataBaseConnection object and uses the connection to perform specific lookups.

The DataBaseZoneFinder class acts as a middle layer between DNS-related operations and operations on the database backend: its find() method converts a given domain name (of type isc::dns::Name) and RR type to textual strings, and performs longest-match lookups against the dumb underlying database via the DataBaseConnection object.
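
The following sketch shows how such a longest-match loop could be driven from the middle layer, assuming the hypothetical DataBaseConnection::getRecords() interface sketched above. The FindResult type, the buildResult() helper (which converts the textual records into validated libdns++ RRsets), and the conn_, zone_id_ and origin_ members are also assumptions; delegation, CNAME and wildcard handling are omitted.

{{{
ZoneFinder::FindResult
DataBaseZoneFinder::find(const isc::dns::Name& name,
                         const isc::dns::RRType& type)
{
    // Walk from the query name up toward the zone origin, asking the
    // dumb backend for an exact textual match at each step; the first
    // hit is the longest match.
    isc::dns::Name candidate(name);
    while (true) {
        std::vector<std::vector<std::string> > records;
        conn_->getRecords(zone_id_, candidate.toText(), records);
        if (!records.empty()) {
            // Convert the textual records to libdns++ objects
            // (validating them in the process) and build the result.
            return (buildResult(candidate, type, records));
        }
        if (candidate.getLabelCount() <= origin_.getLabelCount()) {
            break;              // reached the origin without a match
        }
        candidate = candidate.split(1, candidate.getLabelCount() - 1);
    }
    return (FindResult(NXDOMAIN, isc::dns::ConstRRsetPtr()));
}
}}}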

As a middle layer, the DataBaseZoneFinder class is also expected to validate the data given by the database backend. For example, it must anticipate that a stored domain name may not be valid, and handle such error cases appropriately, so that user classes can assume that if an operation on a DataBaseZoneFinder object succeeds, the returned data (if any) is valid.
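
A sketch of this validation step, assuming the backend returns each record as textual type/TTL/RDATA strings: parsing with libdns++ doubles as validation, and DataSourceError is an exception class assumed for this sketch.

{{{
#include <string>
#include <dns/rdata.h>
#include <dns/rrset.h>
#include <exceptions/exceptions.h>

// Hypothetical error class, following the usual BIND 10 exception pattern.
class DataSourceError : public isc::Exception {
public:
    DataSourceError(const char* file, size_t line, const char* what) :
        isc::Exception(file, line, what) {}
};

void
addValidatedRdata(isc::dns::RRset& rrset, const std::string& rdata_text) {
    try {
        // createRdata() throws if the stored text is not valid RDATA
        // for the RRset's type and class.
        rrset.addRdata(isc::dns::rdata::createRdata(
            rrset.getType(), rrset.getClass(), rdata_text));
    } catch (const isc::Exception& ex) {
        // Don't let broken database content propagate to the caller as
        // if it were valid DNS data.
        isc_throw(DataSourceError,
                  "broken RDATA in database: " << ex.what());
    }
}
}}}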

The ZoneIterator class is similar to ZoneFinder, but provides an interface (its getNextRRset() method) to iterate over all RRs of the corresponding zone. It is specifically intended to be used by b10-xfrout (along with other, less major applications such as dumping zone content).
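
A hypothetical usage sketch of the iterator, roughly what b10-xfrout would do for AXFR-out; getIterator() is an assumed factory method, and getNextRRset() is assumed to return a null pointer at the end of the zone.

{{{
#include <iostream>
#include <boost/shared_ptr.hpp>
#include <dns/name.h>
#include <dns/rrset.h>

void
dumpZone(DataSourceClient& client, const isc::dns::Name& zone_name) {
    boost::shared_ptr<ZoneIterator> it(client.getIterator(zone_name));
    for (isc::dns::ConstRRsetPtr rrset = it->getNextRRset(); rrset;
         rrset = it->getNextRRset()) {
        std::cout << rrset->toText();
    }
}
}}}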

The basic class hierarchy for the ZoneUpdater class is the same, and it's intended to be used to update the content of the zone, either as a whole (in the case of AXFR-in) or partially (for IXFR-in or dynamic update).

In the case of DataBaseZoneUpdater, it will start a database transaction on construction, modify the zone via its (derived) update methods, and (if everything is okay) confirm the change with its commit() method.
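
A hypothetical sketch of this transaction pattern for AXFR-in; only commit() is named in this proposal, while ZoneUpdater, getUpdater() and addRRset() are assumed names.

{{{
#include <vector>
#include <boost/shared_ptr.hpp>
#include <dns/name.h>
#include <dns/rrset.h>

void
installTransferredZone(DataSourceClient& client,
                       const isc::dns::Name& zone_name,
                       const std::vector<isc::dns::ConstRRsetPtr>& rrsets)
{
    // Constructing the updater starts a database transaction; 'true' is
    // an assumed flag meaning "replace the entire zone", as AXFR-in
    // needs.  Destroying the updater without commit() would abort the
    // transaction.
    boost::shared_ptr<ZoneUpdater> updater(
        client.getUpdater(zone_name, true));
    for (std::vector<isc::dns::ConstRRsetPtr>::const_iterator it =
             rrsets.begin(); it != rrsets.end(); ++it) {
        updater->addRRset(**it);
    }
    updater->commit();      // make the new content visible atomically
}
}}}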

Other (relatively minor or possibly controversial) notes

  • DataSourceClient is intentionally named "client" to clarify that it's not the storage itself, but rather an access point to it (and there can possibly be multiple instances of the client class pointing to the same database). But probably that's largely a matter of taste.
  • The above description and diagram may suggest that the DataBaseZoneXXX classes hold a separate, new instance of a DB connection, but that's not necessarily the intent. In fact, for ZoneFinder it would make more sense to share the same connection with its creator, the DataSourceClient. On the other hand, ZoneIterator may have to have a new connection, because otherwise the application couldn't use any (even read) operations via the DataSourceClient while the iterator is working.
  • The scope of ZoneUpdater is intentionally limited, i.e., to modifying the content of a zone, rather than providing a generic "transaction on the data source". This is because a generic transaction would probably be difficult to implement for the in-memory data source, and we want to provide a unified interface for both in-memory and database-driven data sources. In practice, the transactions necessary for DNS data sources should be limited, so this shouldn't be too restrictive (we can add other specific classes as we see other needs).
  • DataSourceClient::reopen() hasn't been given much thought yet, but it's intended to be used to reload a particular zone (probably incrementally), particularly for the in-memory backend.
  • Likewise, there's not much consideration yet of how to implement the hot spot cache (at least none in this document). But the cache will probably be held in (a specific derived instance of) the DataSourceClient object.

Experiments

Branch trac817a contains a quickly hacked proof-of-concept implementation of the ideas proposed above. The key points are:

  • Most of the classes described were implemented in a straightforward way (in some places it may be inefficient, and it largely lacks error handling and contains only a few tests).
  • bin/auth/query.cc is used as the highest level of the query processing logic (until now it has been used solely for the in-memory data source). Essentially it didn't have to be modified at all; there were only trivial class name changes.
  • To prove that it will be easier to add support for a new database backend, I added a quick-hack version of a MySQL data source (client). The source file is attached: http://bind10.isc.org/attachment/wiki/DataSourceDesign/mysql_conn.cc While it's a quick hack (e.g., it lacks some error handling), it supports all of the basic queries, iteration over a zone, and updates. I hope this is convincing enough to show it will be easy.
  • I've also conducted a simple performance check (even though the quick-hack implementation wasn't intended to be efficient) to see the possible overhead of the additional class layers. I used bin/auth/benchmarks/query_bench with an (unsigned) root server configuration and a (somewhat old but) real query sample taken at an F-root instance. The sample data contains 9541 queries.
    • 207.93 qps: current sqlite3 data source with Hot Spot Cache enabled, unlimited slots
    • 208.42 qps: current sqlite3 data source with Hot Spot Cache enabled, 10 * #queries slots
    • 205.66 qps: current sqlite3 data source with Hot Spot Cache enabled, #queries/2 slots
    • 170.26 qps: current sqlite3 data source with Hot Spot Cache disabled
    • 187.82 qps: new sqlite3 data source (no cache yet)
    • 7.80 qps: newly added MySQL data source (no cache)
    While the comparison is not entirely fair (because the experimental version lacks some error handling, etc.), I believe it's quite safe to say the new design should be able to run at least as fast as the current version. (The MySQL backend was remarkably slow, but in any event we haven't had any MySQL data source until now.)