Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Producer: Avoid drop bytes on errors & TCP/UDP Sockets: Move flush to load #2244

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

AmmarAbouZor
Copy link
Member

@AmmarAbouZor AmmarAbouZor commented Feb 26, 2025

This PR extends #2233 and replaces #2240.
It marked is draft until #2233 is merged

It provides the following changes:

TCP/UDP Sources:

Flushing the internal buffer is moved to load() method to tie freeing bytes in internal buffer and loading it with more data, making it more reliable to avoid potential bugs when having an incomplete parser result on last packet before having the internal buffer almost filled.

Producer Loop:

  • Optimize handling parse error by avoiding dropping all loaded bytes. Instead, it will skip them one by one and reapply parse to avoid data loss.
  • Abort the session on initial parse errors to avoid calling parse repeatedly on each byte drop of the source in case the input isn't
    suitable for the selected parser.
  • Extend unit tests for producer with mocks to cover parse error cases

SomeIP parser:

Fix converting errors from SomeIP crate into parser, ensuring Incomplete error type is delivered correctly in producer loop since this error will be handled different than parse errors

Producer:
* Use internal buffers for the parsed items to reuse the memory avoiding
  allocating new memory on each call.
* Removed stream interface to avoid getting into multiple mutable
  reference for producer and its buffer at the same time.
* Fix all unit tests ensuring all of them are passing without any change
  in their core functionality.
* Remove index counter because it's not used anywhere.

All Parsers:
* Parse all provided bytes returning all available items until hitting
  the first error.
* Adjust unit tests on SomeIP ensuring the length of the return items is
  correct.
* Provide new test for string tokenizer to ensure bytes are parsed on
  the first call.

Chipmunk-CLI:
* Integrate changes in producer
* Ensure all external dependencies are used.

Indexer-CLI:
* Integrate changes in producer.

Benchmarks:
* Fix parser multiple time to return an iterator instead vector to avoid
  allocating memory.
Flushing the internal buffer on load is more reliable to avoid potential
bugs when having an incomplete parser result on last packet before having
the internal buffer almost filled.
* Change handling parse error from dropping all loaded bytes to skipping
  them one by one and reapplying parse on the remaining to avoid data
  loss.
* Finish the session on initial parse errors to avoid calling parse
  repeatedly on each byte drop of the source in case the input isn't
  suitable for the selected parser.
Extend unit tests for producer loop with mocks to cover possible
scenarios with parse errors
* Fix mapping error type from SomeIP to parser in case more data are
  needed since this error type is handheld differently in producer
  loop.
* Matching adjustments on unit, snapshot and integration tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant