#4272 closed enhancement (complete)

meta-information support in classification

Reported by: tomek
Owned by: fdupont
Priority: low
Milestone: Kea1.1
Component: classification
Version: git
Keywords:
Cc:
CVSS Scoring:
Parent Tickets:
Sensitive: no
Defect Severity: N/A
Sub-Project: DHCP
Feature Depending on Ticket:
Estimated Difficulty: 11
Add Hours to Ticket: 3
Total Hours: 14
Internal?: no

Description

The ClientClassificationDesign calls for making a packet's meta information available in client classification expressions.

At least the following pieces of information should be accessible:

  • interface name
  • src/dst IP
  • packet length

Optionally, we could also expose the source and destination UDP ports.
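As a sketch of how such meta-data could be used, here is a hypothetical Kea configuration fragment. The token names pkt.iface, pkt.src, pkt.dst and pkt.len are the ones settled on later in this ticket; the class name and the interface value are invented for illustration:

```
{
  "client-classes": [
    {
      "name": "lan-clients",
      "test": "pkt.iface == 'eth0'"
    }
  ]
}
```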

During our 1.1 scoping discussions it was questioned whether this is medium or low priority. Set to low for now.

Subtickets

Change History (22)

comment:1 Changed 21 months ago by fdupont

I propose pkt.interface, pkt.source, pkt.destination, pkt.length. Are they too verbose? IMHO they are not.

comment:2 Changed 21 months ago by fdupont

  • Owner set to fdupont
  • Status changed from new to accepted

comment:3 Changed 21 months ago by fdupont

  • Owner changed from fdupont to UnAssigned
  • Status changed from accepted to assigned

Done. Ready for review. Note I missed the interface (I'll add it if the style is right) and have not yet proposed a ChangeLog entry.

comment:4 Changed 21 months ago by fdupont

Added received interface name. For the ChangeLog entry I propose:
"Added the pkt classification token to extract incoming packet meta-data (receiving interface name, source and destination address, length). (Trac #4272, git ...)"
Please review!

comment:5 Changed 21 months ago by fdupont

  • Status changed from assigned to reviewing

comment:6 Changed 20 months ago by fdupont

  • Owner changed from UnAssigned to fdupont
  • Status changed from reviewing to accepted

Taking it back because it should be rebased after the #4268 and #4269 merges.

comment:7 Changed 18 months ago by fdupont

Addressing the higher-priority (medium > low) pending DHCP4o6 tickets first, then this one.

comment:8 follow-up: Changed 18 months ago by fdupont

  • Owner changed from fdupont to UnAssigned
  • Status changed from accepted to reviewing

Opened a new branch, trac4272a, and pushed two tags, trac4272a_base and trac4272a_reorg. The second has the doc fixed and the code reorganized to follow the order of token.h (no code line was changed: block moves only).

Some questions.
There are 3 options:

  • pkt.<field>
  • overload pkt4.<field> and pkt6.<field>
  • propose all 3: pkt{,4,6}.<field>

There are 2 related questions:

  • merge the PKT4 and PKT6 messages into PKT (there is no real ambiguity), or, on the contrary, introduce a third EVAL_DEBUG_PKT?
  • cross misuses are checked in the parser for relays (e.g., no relay4 in DHCPv6). IMHO this should be the same for pkt[46], but should we do the same for common meta-data (aka fields)? IMHO yes (the parser is sooner, so better, and if we have pkt we provide a "neutral" way, so it is not user-unfriendly to catch cut & paste errors).

I put the ticket under review to get opinions (the code is easy as soon as there is an agreement about what it should do).

comment:9 Changed 18 months ago by fdupont

  • Total Hours changed from 0 to 2

comment:10 in reply to: ↑ 8 ; follow-up: Changed 17 months ago by tomek

  • Add Hours to Ticket changed from 0 to 1
  • Total Hours changed from 2 to 3

Replying to fdupont:

Some questions:
there are 3 options:

  • pkt.<field>
  • overload pkt4.<field> and pkt6.<field>
  • propose all 3: pkt{,4,6}.<field>

If I understand the 3rd option correctly, this is what you propose:

  • keep existing expressions (pkt4.mac, pkt.hlen, pkt6.msgtype, etc.)
  • add pkt.src, pkt.dst, pkt.len etc.

If that is so, I agree.

On a related note, you proposed in comment 1 the following:

pkt.interface, pkt.source, pkt.destination, pkt.length

I think we should stick to using abbreviated versions. We already use pkt (not packet), msgtype (not message-type) and transid (not transaction-id). My rationale for using compact notation is that the class expressions may be quite long and we should try to avoid wrapping them as that would make them difficult to read. So I think we should go with pkt.iface, pkt.src, pkt.dst and pkt.len.

There are 2 related questions:

  • merge the PKT4 and PKT6 messages into PKT (there is no real ambiguity), or, on the contrary, introduce a third EVAL_DEBUG_PKT?

I think adding an EVAL_DEBUG_PKT is ok. We don't want to reuse the same log message in multiple places, so that each message points to an unambiguous spot in the code where it was produced.

  • cross misuses are checked in the parser for relays (e.g., no relay4 in DHCPv6). IMHO this should be the same for pkt[46], but should we do the same for common meta-data (aka fields)? IMHO yes (the parser is sooner, so better, and if we have pkt we provide a "neutral" way, so it is not user-unfriendly to catch cut & paste errors).

Since the decision seems to be to use pkt.* which is valid for both packet types, I don't think there's any check needed at all. But if you have a reason to check it, the parser is the best place for such checks.

I put the ticket under review to get opinions (the code is easy as soon as there is an agreement about what it should do).

Here are my opinions. When you expect me to review the actual code, please tell me which branch I should review. I looked briefly at what's currently on branch trac4272 and some of the changes look odd: lots of tokens and fields disappeared from parser.yy, and lexer.ll does not have addr4 and addr6 anymore. Surely that is a mistake or a work in progress, right?

comment:11 Changed 17 months ago by tomek

  • Owner changed from UnAssigned to fdupont

comment:12 in reply to: ↑ 10 Changed 17 months ago by fdupont

Replying to tomek:

Replying to fdupont:

Some questions:
there are 3 options:

  • pkt.<field>
  • overload pkt4.<field> and pkt6.<field>
  • propose the 3: pkt{,4,6}.<field>

If I understand the 3rd option, there's what you propose:

=> no, pkt{,4,6} expands into pkt, pkt4 and pkt6 (vs pkt[46])

  • keep existing expressions (pkt4.mac, pkt.hlen, pkt6.msgtype, etc.)

=> in the 3 options it will be the case.

  • add pkt.src, pkt.dst, pkt.len etc.

=> this is the option 1 (simplest, this is why it is the first :-)

On a related note, you proposed in comment 1 the following:

pkt.interface, pkt.source, pkt.destination, pkt.length

I think we should stick to using abbreviated versions. We already use pkt (not packet), msgtype (not message-type) and transid (not transaction-id). My rationale for using compact notation is that the class expressions may be quite long and we should try to avoid wrapping them as that would make them difficult to read. So I think we should go with pkt.iface, pkt.src, pkt.dst and pkt.len.

=> no problem, but please note these are keywords, so shorter forms are more likely to collide (a collision is not dramatic: a specific rule just has to be added to resolve it).

There are 2 related questions:

  • merge the PKT4 and PKT6 messages into PKT (there is no real ambiguity), or, on the contrary, introduce a third EVAL_DEBUG_PKT?

I think adding an EVAL_DEBUG_PKT is ok. We don't want to reuse the same log message in multiple places, so that each message points to an unambiguous spot in the code where it was produced.

=> fine

  • cross misuses are checked in the parser for relays (e.g., no relay4 in DHCPv6). IMHO this should be the same for pkt[46], but should we do the same for common meta-data (aka fields)? IMHO yes (the parser is sooner, so better, and if we have pkt we provide a "neutral" way, so it is not user-unfriendly to catch cut & paste errors).

Since the decision seems to be to use pkt.* which is valid for both packet types, I don't think there's any check needed at all. But if you have a reason to check it, the parser is the best place for such checks.

=> fine (note this item has to be revisited if finally we decide for the third option).

I put the ticket under review to get opinions (the code is easy as soon as there is an agreement about what it should do).

Here are my opinions. When you expect me to review the actual code, please tell me which branch I should review. I looked briefly at what's currently on branch trac4272 and some of the changes look odd: lots of tokens and fields disappeared from parser.yy, and lexer.ll does not have addr4 and addr6 anymore. Surely that is a mistake or a work in progress, right?

=> the branch trac4272 is pretty old (it was done far before some recent classification stuff). Please forget it.

comment:13 Changed 17 months ago by fdupont

Working on it with option 1 & co. Branch is trac4272a.

comment:14 Changed 17 months ago by fdupont

  • Estimated Difficulty changed from 0 to 8
  • Total Hours changed from 3 to 4

comment:15 Changed 17 months ago by fdupont

  • Owner changed from fdupont to UnAssigned

Ready for review. BTW, is there an easy way to check the new logging in the classification code? (It is a more general issue; for instance, I am not sure a log call with an incorrect number of .arg(XXX) arguments is detected.)

comment:16 Changed 17 months ago by fdupont

  • Estimated Difficulty changed from 8 to 10
  • Total Hours changed from 4 to 7

Added some tests for better coverage. Added a logging test too (found the right tool in testutils, by Shawn!).
Still under review (don't forget to regenerate the parsers to run the tests).

comment:17 Changed 17 months ago by tomek

  • Owner changed from UnAssigned to tomek

comment:18 follow-up: Changed 16 months ago by tomek

  • Add Hours to Ticket changed from 1 to 4
  • Owner changed from tomek to fdupont
  • Total Hours changed from 7 to 11

Replying to fdupont:

Ready for review. BTW, is there an easy way to check the new logging in the classification code? (It is a more general issue; for instance, I am not sure a log call with an incorrect number of .arg(XXX) arguments is detected.)

Yes, there is: configure --enable-logger-checks. It will throw if the number of
arguments passed doesn't match expectations. We normally compile unit-tests with
--enable-logger-checks, so the tests should fail if there's a discrepancy.

I did review the code from trac4272a.

To review the changes, I used:

git diff master...trac4272a

Why did you move the TokenPkt6, TokenRelay6Option and TokenRelay6Field classes? It
takes extra time to compare the new text with the one being removed.

parser.yy
I saw you moved several tokens on the list. The comment says "tokens in an order
which makes sense and related to the intended use". I'm ok with moving the
tokens around, but I think you moved PEERADDR and LINKADDR unnecessarily. These
are properties of the RELAY6 token, not pkt.

token.cc
The toHex method is very useful. It would be better to move it to util, so other
code could use it. Can you move it there?

TokenPkt::evaluate - I don't like the LEN case. In particular, the fact that the
conversion is done manually. We have functions like writeUint32, which should be
used. Please update the code to use them. If there is anything missing, please
add some extra code in lib/util and then use it.

The logging done in TokenPkt::evaluate could be done with a single LOG_DEBUG
call. The second parameter would be is_binary ? toHex(value) : value. It seems
much shorter.

token.h

The comment for TokenPkt should mention that it is usable for both DHCPv4 and
DHCPv6 packets. The class comment says dst is 4 octets. Shouldn't that be 4 or
16?

context_unittest.cc
The testPktMetadata uses EvalContext with the V6 universe. Is there a
similar test that checks metadata for v4 packets?

token_unittest.cc
In pkt4MetaData you check the 10.0.0.2 address by comparing each byte
separately. Can you convert it to an IOAddress first? It would be much more
readable.


Can you add example config files in doc/examples/kea{4,6} that would show how
to use these new classifications? Over time I would like to extend those samples
to cover other classification rules.


The following unit-tests failed on Ubuntu 16.04:

[ RUN      ] EvalContextTest.pktMetadataIface
context_unittest.cc:222: Failure
Failed
Exception thrown: <string>:1.1: Invalid character: p
[  FAILED  ] EvalContextTest.pktMetadataIface (0 ms)
[ RUN      ] EvalContextTest.pktMetadataSrc
context_unittest.cc:222: Failure
Failed
Exception thrown: <string>:1.1: Invalid character: p
[  FAILED  ] EvalContextTest.pktMetadataSrc (0 ms)
[ RUN      ] EvalContextTest.pktMetadataDst
context_unittest.cc:222: Failure
Failed
Exception thrown: <string>:1.1: Invalid character: p
[  FAILED  ] EvalContextTest.pktMetadataDst (0 ms)
[ RUN      ] EvalContextTest.pktMetadataLen
context_unittest.cc:222: Failure
Failed
Exception thrown: <string>:1.1: Invalid character: p
[  FAILED  ] EvalContextTest.pktMetadataLen (0 ms)

comment:19 in reply to: ↑ 18 ; follow-up: Changed 16 months ago by fdupont

  • Owner changed from fdupont to tomek

Replying to tomek:

Replying to fdupont:

Ready for review. BTW, is there an easy way to check the new logging in the classification code? (It is a more general issue; for instance, I am not sure a log call with an incorrect number of .arg(XXX) arguments is detected.)

Yes, there is: configure --enable-logger-checks. It will throw if the number of
arguments passed doesn't match expectations. We normally compile unit-tests with
--enable-logger-checks, so the tests should fail if there's a discrepancy.

I did review the code from trac4272a.

To review the changes, I used:

git diff master...trac4272a

Why did you move TokenPkt6, TokenRelay6Option and TokenRelay6Field classes? It
takes extra time to compare that new text with the one being removed.

=> I created two tags trac4272a_base and trac4272a_reorg to make your review easier...

parser.yy
I saw you moved several tokens on the list. The comment says "tokens in an order
which makes sense and related to the intended use". I'm ok with moving the
tokens around, but I think you moved PEERADDR and LINKADDR unnecessarily. These
are properties of the RELAY6 token, not pkt.

=> your argument makes sense: I moved them back.

token.cc
toHex method is very useful. It would be better to move it to util, so other
code could use it. Can you move it there?

=> moved to src/lib/eval/encode/hex.h

TokenPkt::evaluate - I don't like the LEN case. In particular, the fact that the
conversion is done manually. We have functions like writeUint32, which should be
used. Please update the code to use them. If there is anything missing, please
add some extra code in lib/util and then use it.

=> done.

The logging done in TokenPkt::evaluate could be done with a single LOG_DEBUG
call. The second parameter would be is_binary ? toHex(value) : value. It seems
much shorter.

=> done.

token.h

The comment for TokenPkt should mention that it is usable for both DHCPv4 and
DHCPv6 packets. The class comment says dst is 4 octets. Shouldn't that be 4 or
16?

=> yes of course. Fixed.

context_unittest.cc
The testPktMetadata uses EvalContext with the V6 universe. Is there a
similar test that checks metadata for v4 packets?

=> context is about parsing, so I can't see a real need for a v4 variant. BTW, token has both v6 and v4 (and checks evaluation results too, not only parsing).

token_unittest.cc
In pkt4MetaData you check the 10.0.0.2 address by comparing each byte
separately. Can you convert it to an IOAddress first? It would be much more
readable.

=> done.

Can you add example config files in doc/examples/kea{4,6} that would show how
to use these new classifications? Over time I would like to extend those samples
to cover other classification rules.

=> there is no classification at all in the examples. So is your idea to begin with these rules, or should it be done in its own ticket?

The following unit-tests failed on Ubuntu 16.04:

=> IMHO the lexer and parser were not regenerated. I'll push updated versions on the branch.

comment:20 in reply to: ↑ 19 ; follow-up: Changed 15 months ago by tomek

  • Add Hours to Ticket changed from 4 to 3
  • Owner changed from tomek to fdupont
  • Total Hours changed from 11 to 14

Replying to fdupont:

Replying to tomek:

Replying to fdupont:
token.cc
toHex method is very useful. It would be better to move it to util, so other
code could use it. Can you move it there?

=> moved to src/lib/eval/encode/hex.h

You mean src/lib/*util*/encode/hex.h, right? That's where I found the code. Thanks for doing that.

TokenPkt::evaluate - I don't like the LEN case. In particular, the fact that the
conversion is done manually. We have functions like writeUint32, which should be
used. Please update the code to use them. If there is anything missing, please
add some extra code in lib/util and then use it.

=> done.

Thanks. There are similar cases for Pkt6 and Pkt4, but I'll convert them while doing #4483.

Can you add example config files in doc/examples/kea{4,6} that would show how
to use these new classifications? Over time I would like to extend those samples
to cover other classification rules.

=> there is no classification at all in the examples. So either your idea is to begin with these rules or it should be done in its own ticket?

Yes, my idea was to start an example that documents only meta-data for now, but then require all future tickets to keep updating it. That way we would have at least some example. But that's ok to not have it now. I'll add it with #4483.

The following unit-tests failed on Ubuntu 16.04:

=> IMHO the lexer and parser were not regenerated. I'll push updated versions on the branch.

I have pulled your changes (trac4272a), recompiled (without --enable-generate-parser) and got the following failures:

[ RUN      ] EvalContextTest.pktMetadataIface
context_unittest.cc:222: Failure
Failed
Exception thrown: <string>:1.1: Invalid character: p
[  FAILED  ] EvalContextTest.pktMetadataIface (0 ms)
[ RUN      ] EvalContextTest.pktMetadataSrc
context_unittest.cc:222: Failure
Failed
Exception thrown: <string>:1.1: Invalid character: p
[  FAILED  ] EvalContextTest.pktMetadataSrc (1 ms)
[ RUN      ] EvalContextTest.pktMetadataDst
context_unittest.cc:222: Failure
Failed
Exception thrown: <string>:1.1: Invalid character: p
[  FAILED  ] EvalContextTest.pktMetadataDst (0 ms)
[ RUN      ] EvalContextTest.pktMetadataLen
context_unittest.cc:222: Failure
Failed
Exception thrown: <string>:1.1: Invalid character: p
[  FAILED  ] EvalContextTest.pktMetadataLen (0 ms)

With --enable-generate-parser, it passed.

Once you merge, make sure that you regenerate the parser files. Maybe you wanted me to review the trac4272a_base and trac4272a_reorg branches instead? Anyway, all the changes on trac4272a are good and they address all my comments.

If there are other changes on trac4272a_reorg that were not present on trac4272a, let me know. If not, go ahead and merge (just remember to rebuild the parser files first).

comment:21 in reply to: ↑ 20 Changed 15 months ago by fdupont

Replying to tomek:

The following unit-tests failed on Ubuntu 16.04:

=> IMHO the lexer and parser were not regenerated. I'll push updated versions on the branch.

=> I've just checked: the files were not pushed...

Once you merge, make sure that you regenerate the parser files. Maybe you wanted me to review the trac4272a_base and trac4272a_reorg branches instead?

=> no, base and reorg are intermediate tags.

I am merging trac4272a.

comment:22 Changed 15 months ago by fdupont

  • Estimated Difficulty changed from 10 to 11
  • Resolution set to complete
  • Status changed from reviewing to closed

Merged. lexer and parser files are up to date.
