[Feature] Sink supports LZ4 compression with json format #354

Merged
merged 6 commits into from
Apr 19, 2024

Collaborator

@banmoy banmoy commented Apr 17, 2024

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Which issues does this PR fix:

StarRocks added LZ4 compression support for JSON-format stream load in StarRocks/starrocks#43732. With that in place, the connector can compress JSON data before sending it to StarRocks, which significantly reduces network traffic. In a test loading the ClickBench dataset into StarRocks, the compression ratio was about 8, and load performance degraded by 3.64%, which is acceptable.
You can enable it with the following configuration:

'sink.properties.format' = 'json'
'sink.properties.compression' = 'lz4_frame'
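
For context, here is a rough sketch of where these two options would sit in a Flink SQL sink definition. It only assembles and prints the DDL string, which could then be passed to something like `TableEnvironment.executeSql`. The table name, columns, hosts, and credentials are placeholders invented for illustration; only the two `sink.properties.*` entries come from this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class Lz4SinkDdlSketch {

    /** Connector options. Everything except the two sink.properties.* entries is a placeholder. */
    static Map<String, String> sinkOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("connector", "starrocks");
        options.put("jdbc-url", "jdbc:mysql://fe-host:9030"); // placeholder host
        options.put("load-url", "fe-host:8030");              // placeholder host
        options.put("database-name", "demo_db");              // placeholder
        options.put("table-name", "demo_table");              // placeholder
        options.put("username", "root");                      // placeholder
        options.put("password", "");                          // placeholder
        // The two options from this PR: JSON payloads, compressed as LZ4 frames.
        options.put("sink.properties.format", "json");
        options.put("sink.properties.compression", "lz4_frame");
        return options;
    }

    /** Renders a CREATE TABLE statement with the options in its WITH clause. */
    static String buildDdl() {
        String with = sinkOptions().entrySet().stream()
                .map(e -> "  '" + e.getKey() + "' = '" + e.getValue() + "'")
                .collect(Collectors.joining(",\n"));
        return "CREATE TABLE demo_sink (\n  id INT,\n  name STRING\n) WITH (\n" + with + "\n)";
    }

    public static void main(String[] args) {
        System.out.println(buildDdl());
    }
}
```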

Problem Summary(Required) :

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr will affect users' behaviors
  • This pr needs user documentation (for new or modified features or behaviors)
  • I have added documentation for my new feature or new function

Signed-off-by: PengFei Li <[email protected]>
/** Compress data using LZ4_FRAME. */
public class LZ4FrameCompressionCodec implements CompressionCodec {

public static final String NAME = "LZ4_FRAME";

Is LZ4 compatible?

Collaborator Author

We need streaming compression here, so we use lz4_frame. StarRocks also does not support plain lz4; see StarRocks/starrocks#43732.
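
To illustrate what "streaming compression" means here, below is a minimal sketch of a codec that wraps the output stream so rows are compressed as they are written, rather than after the whole batch has been buffered. It is not the connector's code: the `StreamingCodec` interface and class names are hypothetical, rows are newline-separated purely for illustration, and the JDK's `GZIPOutputStream` stands in for LZ4 frame so the example runs without extra dependencies. With the lz4-java library, the wrapping stream would presumably be `LZ4FrameOutputStream` instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class StreamingCodecSketch {

    /** Hypothetical codec interface: wraps a sink stream so writes are compressed on the fly. */
    interface StreamingCodec {
        OutputStream wrap(OutputStream sink) throws IOException;
    }

    /** Stand-in codec built on the JDK's GZIP stream; the PR uses LZ4 frame instead. */
    static final class GzipStandInCodec implements StreamingCodec {
        @Override
        public OutputStream wrap(OutputStream sink) throws IOException {
            return new GZIPOutputStream(sink);
        }
    }

    /** Writes rows one at a time through the codec; the uncompressed batch is never held whole. */
    static byte[] compressRows(StreamingCodec codec, String[] jsonRows) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = codec.wrap(sink)) {
            for (String row : jsonRows) {
                out.write(row.getBytes(StandardCharsets.UTF_8));
                out.write('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the wrapped stream above flushes the trailer, so the bytes are complete here.
        return sink.toByteArray();
    }

    /** Decompresses the stand-in GZIP output, used only to check the round trip. */
    static String decompress(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] rows = {"{\"id\":1,\"name\":\"a\"}", "{\"id\":2,\"name\":\"b\"}"};
        byte[] compressed = compressRows(new GzipStandInCodec(), rows);
        System.out.print(decompress(compressed));
    }
}
```

A stream-oriented format suits this design because the sink can keep appending rows to an open request body and close the compressed stream only when the batch is flushed.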

@banmoy banmoy merged commit 5c2a334 into StarRocks:main Apr 19, 2024
4 checks passed