Client Classification Requirements

This page defines requirements for client classification. For a discussion about client classification, see ClientClassificationDiscussion. For a design, see ClientClassificationDesign.

THIS IS WORK IN PROGRESS, please send your comments to kea-dev.

Intro & Scope

Our implementation assumes that incoming packets can belong to zero or more classes (see Pkt::inClass(string) and Pkt::addClass(string)). In that sense classes work more like tags than categories. There are operational reasons to have a packet assigned to multiple classes. For example an incoming packet can belong to CABLE (this is a cable modem), DOCSIS3.0 (it supports docsis3.0), ARRIS (it is produced by a given vendor), PREMIUM (this particular subscriber signed up for a premium plan) and PAYMENTS_OVERDUE (but unfortunately forgot to pay his bills) classes.
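The tag-like behaviour can be pictured with a short Python sketch. The Packet class and its add_class/in_class methods are hypothetical stand-ins that mirror Pkt::addClass(string) and Pkt::inClass(string); this is not the actual Kea code.

```python
class Packet:
    """Hypothetical stand-in for Kea's Pkt: it only tracks class membership."""

    def __init__(self):
        self.classes = set()

    def add_class(self, name):
        # Adding the same class twice has no further effect.
        self.classes.add(name)

    def in_class(self, name):
        return name in self.classes


# One packet, many tags, as in the cable modem example above.
pkt = Packet()
for name in ("CABLE", "DOCSIS3.0", "ARRIS", "PREMIUM", "PAYMENTS_OVERDUE"):
    pkt.add_class(name)
```

A packet that was never classified simply has an empty set, which covers the "zero classes" case.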

Client classification can be a very complex feature. Because the feature is large, we decided to split it into phases. Phase 1 is defined as the absolute minimum required to handle PXE boot clients in their simplest form. That effectively means that we need to support expressions of the option.value == "string" type for a very limited number of options, with a substring operator. This capability was implemented in the 1.0 release. Phase 2, planned for Kea 1.1, will significantly expand the number of allowed expressions and their complexity. At the same time, it will expand the number of options that are properly supported. There are a couple of features that are not in scope for phase 2 and that we would like to implement one day. They are designated phase 3. There are currently no specific plans to implement them. However, when ISC makes the decision, the design for those features will be available.

Client classification is a complicated feature, so it was split into several phases. Phase 1 was implemented in Kea 1.0. Phase 2 is planned for 1.1. Phase 3 is currently not assigned to any release.

This design currently focuses on phases 1 and 2, but sometimes covers features related to phase 3. Features that are outside of scope for phase 1 or 2 are clearly marked as (phase 3).

Example Use cases

These are the use cases we came up with. The list is not exhaustive, but should cover the majority of the cases.

Use case 1: Ability to match a certain substring. In this particular case, we want to pick up a power supply produced by the vendor APC: (phase 1)

if substring(option vendor-class-identifier, 0, 3) = "APC";
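As a rough illustration of use case 1, here is a Python sketch of the substring test. It assumes the two numeric arguments are a start offset and a length, which is what the example above suggests; the function names and the sample option values in the usage note are made up for illustration.

```python
def substring(data: bytes, start: int, length: int) -> bytes:
    """Return `length` bytes of `data`, beginning at offset `start`."""
    return data[start:start + length]


def is_apc(vendor_class_identifier: bytes) -> bool:
    # Use case 1: the first three bytes of the vendor class identifier spell "APC".
    return substring(vendor_class_identifier, 0, 3) == b"APC"
```

With this sketch, is_apc(b"APC example device") is true, while a value shorter than three bytes or one starting with a different vendor string is not matched.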

Use case 2: Ability to pick specific architecture in PXEboot. (phase 1)

if the vendor class says the machine is a BIOS based one (arch_type 0x0000 in vendor class), 
    then send pxelinux.0 (or iPXE undionly.kpxe, if we want ipxe in that group)

if the vendor class says the machine is UEFI (arch_type 0x0007), 
    then send bootx64.efi (or iPXE snponly.efi, if we want ipxe in that group)

Use case 3: Ability to segregate clients based on their user class, especially for iPXE (phase 1)

if the user_class (DHO_USER_CLASS/D60_USER_CLASS) option is set to iPXE,
    then send the url for the ipxe configuration (details in http://ipxe.org/cfg/user-class)

Use case 4: Ability to make decisions based on RADIUS info. Note that this does not require the server to interact with the RADIUS server. It is the relay agent that communicates with it and puts RADIUS options in RAI (v4) or in a separate encapsulation level (v6) (phase 2)

rai.radius_option => (use content of that option as class name) (rai = relay agent information option)

Use case 5: Determine if the device is a cable modem or not (phase 2)

if chaddr == rai.docsis-option.suboption1234

Use case 6: Evaluate expression into a name of the class, rather than a boolean yes/no expression (phase 2).

extract USER-CLASS option and use its value as the class name.

Use case 7: Evaluate content of the option and use it as a class name. There's a specific use case for it. You could classify clients based on their vendor-id: group cable modems from Arris into one class, those produced by NetGear into another, etc. The nice thing about this approach is that you define a single class expression and it can generate many classes. An additional benefit is that if there's a new device in your network (let's say a first Linksys), it will be classified into its own class automatically, without any configuration update needed.

"VENDOR_" + option[vendor-options].enterprise-id
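The following Python sketch shows how a single expression of this kind could yield many classes. The class_name function is hypothetical, and the enterprise IDs are placeholders rather than real vendor assignments.

```python
def class_name(enterprise_id: int) -> str:
    # Mirrors the expression "VENDOR_" + option[vendor-options].enterprise-id
    return "VENDOR_" + str(enterprise_id)


# Placeholder enterprise IDs taken from incoming packets (not real vendor numbers).
seen_ids = [1234, 5678, 1234, 9999]

# One expression, three distinct classes; no per-vendor configuration is needed.
classes = sorted({class_name(eid) for eid in seen_ids})
```

A device with a previously unseen enterprise ID would land in a new class of its own the first time it shows up.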

Open Questions

This section is temporary and will be removed once the discussion on kea-dev list is concluded.

The following features are under consideration. Please post your comments to kea-dev mailing list:

Q1: We do have boolean (and, or, not) operators designed and planned for 1.1. Do we also want arithmetic comparison (>, <, >=, <=) operators? A couple of possible use cases that come to mind are expressions like:

pkt4.secs > 10 (if client is trying to make an exchange for at least 10 seconds, treat him in some other way, maybe log a warning that the client either didn't get our responses or rejected them for some reason)

pkt4.hlen < 6 (it may be fishy if the client's reported hardware address is shorter than 6)

pkt4.hops > 1 (if the packet pretends to have traversed more relays than there are in your network, you know something is wrong here)

I admit that those are not super popular use-cases, but they may have some merit. On the other hand, implementing arithmetic would require doing conversion to

Q2: Do we want a find(string, what) operator? I think it can be a pretty powerful tool. With find(), you could search for specific strings in the vendor class. For example I saw docsis, docsis3 and docsis3.0, but I presume there may also be docsis2 out there and possibly even docsis1. Also, there's such a thing as eurodocsis, which doesn't have 'docsis' at the beginning of its content. Another possible use case would be to search for dots in the content of the client FQDN option sent by the client.
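To make the question more concrete, here is one way find() could behave, sketched in Python. The return convention (offset of the first match, or -1 when there is none) is only an assumption; nothing has been decided yet.

```python
def find(string: str, what: str) -> int:
    """Return the offset of the first occurrence of `what`, or -1 if absent."""
    return string.find(what)


# The vendor class strings mentioned above all contain "docsis",
# but not always at offset 0.
offsets = {name: find(name, "docsis")
           for name in ("docsis", "docsis3.0", "eurodocsis")}
```

A prefix test based on substring() would miss eurodocsis, whereas a check like find(value, "docsis") >= 0 matches all three values.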

Q3: Do we want a len(string) operator? This is a string operator that returns the length of the string, which will likely be different from the length of the actual option, e.g. for the 2001:db8::1 address it would return 11, while the actual option would be 16 bytes long. It could be useful for sanity-checking various options, e.g. if the text representation of an IPv6 address is less than 7 bytes (1234::1), it's likely bogus. You may also want to treat clients with too short or too long FQDNs differently.
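The difference between the textual length and the on-wire length can be checked with Python's standard ipaddress module. This only illustrates the numbers quoted above; the variable names are ours.

```python
import ipaddress

text = "2001:db8::1"
wire = ipaddress.IPv6Address(text).packed  # binary form, as carried in an option

text_len = len(text)   # what len(string) would report: 11
wire_len = len(wire)   # on-wire size of one IPv6 address: 16
```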

Q4: Do we want an option[123].len specifier? It would return the binary, on-wire length of the option. I don't have any specific use cases in mind right now, but I presume it may be useful for determining whether an option the client sent is empty, whether it carries a single address or multiple addresses, etc.

Q5: Do we want the ability to specify that if a packet is matched to a class, the packet is immediately dropped? This is actually pretty simple to implement, but it's a radical expansion of the functionality. It would allow us to turn client classification into a DHCP-specific firewall.

Requirements

General requirements:

  • G.1. Comparison of an option with a known value MUST be possible (Phase 1).
  • G.2. Comparison of a substring of an option with a known value MUST be possible (phase 1).
  • G.3. Comparison of option value with other options MUST be possible (Phase 2).
  • G.4. DHCP Options returned to the client are the combination of global options, subnet specific options, those specific to the classes that the client belongs to and host options. See section Options assignment order below for details.

  • G.5. The syntax MUST be a general expression rather than a fixed "option", operator, "value" triple, as the latter is not flexible enough. The expression may contain option names, option codes, string and logical operators. The expression must evaluate to a single string, where "true" indicates that the packet belongs to a class.
  • G.6. Classes MUST be defined globally. This is required, so the use case we already have (segregating cable modems and devices behind them into separate subnets) continues to be supported.
  • G.7. The expression evaluation code MUST be implemented as a stand-alone library with minimal dependencies. The new library will depend on libdhcp++ (because it must be able to extract data from packets), libexceptions, libutil and possibly liblog (to be determined). This will be particularly useful when we decide to implement a client or a somewhat smart relay.
  • G.8. The classification code MUST use its own dedicated logger. It will be very useful for debugging purposes.
  • G.9. The system MUST support an arbitrary number of class definitions. Obviously, each new class defined adds an extra burden. We may consider printing a warning if the configuration contains more than a certain arbitrarily chosen number (maybe over 5?).
  • G.10. For a given subnet, it MUST be possible for clients belonging to a given class to be assigned extra options (i.e. clients in subnet X that also belong to class Y will get extra options A,B and C).
  • G.11. For a given subnet, it MUST be possible to allow only clients that belong to a given class (white-list) (i.e. clients that do not belong to class X will not be allowed in subnet Y).
  • G.12. If there is an option defined in a subnet and another one in the class (both with the same code), the server MUST send the one from client class. Example: all clients from subnet X should get router 192.0.2.1 option. However, clients from subnet X that also belong to class PREMIUM should get router 192.0.2.100 option.
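The white-list behaviour from G.11 could look roughly like the Python sketch below. The select_subnet function, the subnet names and the first-match selection rule are assumptions made for illustration only; the actual subnet selection logic is not defined on this page.

```python
def select_subnet(subnets, client_classes):
    """Return the name of the first subnet that admits the client, or None.

    `subnets` is a list of (name, allowed_class) pairs; allowed_class of None
    means that the subnet is open to every client.
    """
    for name, allowed_class in subnets:
        if allowed_class is None or allowed_class in client_classes:
            return name
    return None


# Hypothetical configuration: cable modems are kept apart from other devices.
subnets = [("modems", "CABLE"), ("devices", None)]
```

Here a client in class CABLE ends up in "modems", a client without that class falls through to "devices", and if only restricted subnets exist a non-member gets no subnet at all.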

Expressions:

  • E.1. Extracting option values from DHCPv4 options: Vendor Class Identifier (60), V-I Vendor Class (124), V-I Vendor-Specific Information (125) MUST be supported. (phase 1) Option value means "full option payload as an input for expression". Individual option fields will not be accessible in phase 1.
  • E.2. Extracting option values from DHCPv6 options: user-class (16), vendor-class (17) MUST be supported. (phase 1) Option value means "full option payload as an input for expression". Individual option fields will not be accessible in phase 1.
  • E.3. Constant expression specified as string (e.g. "MSFT") MUST be supported. (phase 1)
  • E.4. Constant expression specified as hexstring (e.g. 0x54484f4d) MAY be supported. (phase 1)
  • E.5. Operator == (equal) MUST be supported. (phase 1)
  • E.6. Operator substring specifying min-max range (e.g. substring(0,2)) MUST be supported. (phase 1)

New requirements for phase 2 (Kea 1.1):

  • E.7. Operators ! (logical negation), && (logical and), || (logical or) MUST be supported. (phase 2)
  • E.8. Operators grouping ( ) (parentheses) MUST be supported. (phase 2)
  • E.9. Extracting the most useful constant field values from DHCPv4 packet MUST be supported. In particular, chaddr, giaddr, ciaddr, yiaddr, siaddr, hlen, htype MUST be supported. (phase 2)
  • E.10. Extracting constant field values from DHCPv6 packet MUST be supported. In particular, packet type MUST be supported. (phase 2)
  • E.11. Accessing options inserted by a v4 relay agent (suboptions in option 82 (RAI)) MUST be supported. (phase 2).
  • E.12. Accessing options inserted by a v6 relay agent (options stored by a v6 relay agent in each encapsulation level) MUST be supported. (phase 2).
  • E.13. Operator + (string concatenation) MUST be supported (phase 2).
  • E.14. Textual representation MUST be supported, e.g. for an option containing FQDN, it must be possible to write foo.example.org, rather than \3foo\7example\3org\0. Other example: must be possible to write 2001:db8::1 rather than 20010db8000000000000000000000001. This should be accessed as option[123].txt (phase 2).
  • E.15. For selected option formats access to their fields SHOULD be supported, e.g. option[vendor-id].enterprise-id (phase 2)
  • E.16. Access to suboptions in vendor-independent vendor-specific information (125), relay agent information (82), vendor-specific information (43) DHCPv4 options MUST be supported.
  • E.17. Access to nested options in vendor-independent vendor-specific information DHCPv6 option (17) MUST be supported. This is required for accessing docsis options in DHCPv6.
  • E.18. Ability to extract meta information SHOULD be supported. In particular, network interface name, source and destination IP address and packet length should be supported.
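To illustrate the textual representation described in E.14, here is a Python sketch that converts the two on-wire examples into their text forms. The function names are hypothetical, and the FQDN decoder assumes an uncompressed name made of length-prefixed labels.

```python
import ipaddress


def fqdn_to_text(wire: bytes) -> str:
    """Decode an uncompressed, length-prefixed domain name into dotted text."""
    labels = []
    pos = 0
    while pos < len(wire):
        length = wire[pos]
        if length == 0:  # the zero-length root label terminates the name
            break
        pos += 1
        labels.append(wire[pos:pos + length].decode("ascii"))
        pos += length
    return ".".join(labels)


def ipv6_to_text(wire: bytes) -> str:
    """Render a 16-byte IPv6 address in its usual compressed text form."""
    return str(ipaddress.IPv6Address(wire))
```

For example, fqdn_to_text(b"\x03foo\x07example\x03org\x00") gives foo.example.org, and ipv6_to_text(bytes.fromhex("20010db8000000000000000000000001")) gives 2001:db8::1.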

Phase 3:

  • X.1. Ability to use evaluated expression as a class name (e.g. "VENDOR_" + option[17].enterprise-id resulting in class VENDOR_1234) SHOULD be supported.
  • X.2. Ability to specify class options on a pool level SHOULD be supported. (TBD)

Options assignment order

With client classification implemented, there will be multiple places where options could be defined. Here is the order in which options will be assigned. Each "level" takes precedence over previous ones.

  1. global (implemented in prehistory, Kea 0.8 or earlier)
  2. global class (implemented in phase 1/Kea 1.0)
  3. subnet (implemented in prehistory, Kea 0.8 or earlier)
  4. subnet class (to be implemented in Kea 1.1)
  5. host (see #3571, #3572, #3573) (planned for Kea 1.1)

For example, let's consider a case where the DNS server option is specified on the global level as 1.1.1.1, on the subnet level as 2.2.2.2 and a third time for a class FOO as 3.3.3.3. A client connected to the subnet and belonging to class FOO will get 3.3.3.3. Another client connected to the same subnet that does not belong to the FOO class will get 2.2.2.2.

Clients that belong to multiple classes will get the options for each class. For duplicates, only one instance will be used. Which one is used depends on the implementation, but it is likely to be influenced by the class definition order in the configuration file.
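The assignment order can be sketched in Python as a simple layered merge in which each level overrides the previous ones. The resolve_options function and the option name are hypothetical, and the sketch assumes that the FOO value from the DNS example is configured as a subnet class option (level 4), because that is the placement that produces the described result.

```python
def resolve_options(layers):
    """Merge option layers in order; later layers override earlier ones."""
    result = {}
    for layer in layers:
        result.update(layer)
    return result


global_opts = {"dns-servers": "1.1.1.1"}
subnet_opts = {"dns-servers": "2.2.2.2"}
subnet_class_foo = {"dns-servers": "3.3.3.3"}  # assumed to sit at level 4

# Layers: global, global class, subnet, subnet class, host.
in_foo = resolve_options([global_opts, {}, subnet_opts, subnet_class_foo, {}])
not_in_foo = resolve_options([global_opts, {}, subnet_opts, {}, {}])
```

The client in class FOO ends up with 3.3.3.3 and the other client with 2.2.2.2, matching the example above. How duplicates between several classes at the same level are resolved is left out of this sketch, since that is still implementation dependent.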

Last modified on Jan 19, 2016, 6:50:41 PM