Hi all, hope everything is fine.
Currently we are working on a feature that uses this connector to dump Debezium CDC messages. Our deployed connector is running with default values, and as far as I can tell, the only ways to control flush intervals are time-based (offset.flush.interval.ms, which defaults to 60 seconds) and by the number of records per flushed file (file.max.records; I didn't see a default value, so I'm assuming it writes as many records as possible to the file during the flush interval window).
Our current cost for writing these files is pretty high, so I was wondering if you have any best practices for improving the connector's performance and reducing network costs during the writes to GCS.
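For reference, a minimal sketch of the two settings in question (property names as I understand them from the worker and connector docs; values illustrative, not recommendations):

```properties
# worker.properties (Kafka Connect worker level): how often offsets are
# committed, which is when the sink flushes its open files. Default: 60000 ms.
offset.flush.interval.ms=60000

# Connector config: maximum records per output file before it is closed and
# uploaded. 0 is assumed to mean "unlimited", so file size would then be
# bounded only by the flush interval above.
file.max.records=0
```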
Thanks for any help!
Cheers
I am also struggling with flush intervals and file sizes. I want to flush only every ten minutes, or when the heap fills up. I currently have these settings:
But I'm still flushing every couple of minutes even though I have 32 GB of heap... I don't get it.
I'm getting about 250K messages (77 MB) compressed to 7 MB.
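For reference, this is roughly what I'm aiming for (a sketch only; property names assumed from the worker and connector docs, and I haven't verified this fixes the early flushes):

```properties
# worker.properties: commit offsets (and flush sink files) every 10 minutes
# instead of the default 60 seconds.
offset.flush.interval.ms=600000

# Connector config: 0 is assumed to mean no record-count-based rotation,
# so files should be closed only by the flush interval above.
file.max.records=0
```

As far as I can tell, heap size is not a flush trigger for this sink, so raising it alone would not delay flushes.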
Hi all, I also ran into this issue.
After increasing offset.flush.interval.ms in the worker.properties file, the sink speed became normal.
But it significantly affects the other topics/connectors: their throughput also slows down.
So I wonder if there is a better solution, or a plan to improve this. Thanks!
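As far as I understand, offset.flush.interval.ms is a worker-level setting, so it applies to every connector running on that Connect worker, which would explain why the other topics slow down. One possible workaround (a sketch, assuming running a separate Connect cluster is an option) is to isolate the GCS sink on its own worker:

```properties
# worker-gcs.properties: a dedicated Connect worker (its own cluster) for
# the GCS sink, so the long flush interval affects nothing else.
group.id=connect-gcs-sink            # assumed name for the separate cluster
offset.flush.interval.ms=600000      # long interval, local to this worker
# ...plus the usual bootstrap.servers, converters, and dedicated
# config/offset/status storage topics for this cluster.
```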