[FLINK-36517][cdc-connect][paimon] use filterAndCommit API for Avoid commit the same datafile duplicate #3639

beryllw · 2024-10-12T07:52:58Z

The problem reported in https://issues.apache.org/jira/browse/FLINK-35938 still persists.

The storeMultiCommitter.commit API may commit the same data file twice when the job restarts after a failure.
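
To illustrate the failure mode, here is a minimal, self-contained sketch. It is a toy model and not Paimon's implementation: `ToyTable`, `Committable`, and the rule of filtering by checkpoint id are assumptions made for illustration only. It shows why replaying the same committables through a plain commit duplicates data files, while a filter-then-commit step makes the replay a no-op.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: hypothetical names, not the Paimon or Flink CDC API. */
class IdempotentCommitSketch {

    /** Hypothetical stand-in for a committable: the data files written by one checkpoint. */
    static final class Committable {
        final long checkpointId;
        final List<String> dataFiles;

        Committable(long checkpointId, List<String> dataFiles) {
            this.checkpointId = checkpointId;
            this.dataFiles = dataFiles;
        }
    }

    /** Hypothetical table that remembers which checkpoints it has already committed. */
    static final class ToyTable {
        final List<String> committedFiles = new ArrayList<>();
        final Set<Long> committedCheckpoints = new HashSet<>();

        /** Plain commit: appends files unconditionally, so a replay adds them again. */
        void commit(List<Committable> committables) {
            for (Committable c : committables) {
                committedFiles.addAll(c.dataFiles);
                committedCheckpoints.add(c.checkpointId);
            }
        }

        /** Filter-then-commit: skips checkpoints already recorded; returns how many were committed. */
        int filterAndCommit(List<Committable> committables) {
            List<Committable> pending = new ArrayList<>();
            for (Committable c : committables) {
                if (!committedCheckpoints.contains(c.checkpointId)) {
                    pending.add(c);
                }
            }
            commit(pending);
            return pending.size();
        }
    }

    public static void main(String[] args) {
        List<Committable> replayed =
                Arrays.asList(new Committable(1L, Arrays.asList("data-0.orc")));

        ToyTable plain = new ToyTable();
        plain.commit(replayed);
        plain.commit(replayed); // simulated re-commit after restoring from a checkpoint
        System.out.println("plain commit, files: " + plain.committedFiles.size());

        ToyTable filtered = new ToyTable();
        filtered.filterAndCommit(replayed);
        filtered.filterAndCommit(replayed); // the replay is filtered out
        System.out.println("filterAndCommit, files: " + filtered.committedFiles.size());
    }
}
```

The real connector has to derive "already committed" from the table's own snapshot metadata rather than from an in-memory set, since the in-memory state is lost on restart.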

beryllw · 2024-10-12T07:53:18Z

beryllw · 2024-10-12T07:57:05Z

...tor-paimon/src/main/java/org/apache/flink/cdc/connectors/paimon/sink/v2/PaimonCommitter.java

-                            "Commit succeeded for %s with %s committable",
-                            checkpointId, committables.size()));
-        } catch (Exception e) {
-            commitRequests.forEach(CommitRequest::retryLater);


Is there a specific purpose for retrying later in this context? @lvyanquan

beryllw · 2024-10-15T02:57:58Z

Could you please assist in reviewing this PR? Thank you. @lvyanquan

lvyanquan · 2024-10-15T08:01:57Z

I agree that the duplicate-commit issue still exists. Our test coverage for abnormal failover is relatively thin. Could you add a corresponding test case for this?

beryllw · 2024-10-15T10:26:53Z

> I agree that the duplicate-commit issue still exists. Our test coverage for abnormal failover is relatively thin. Could you add a corresponding test case for this?

I will try, thanks.


lvyanquan · 2024-10-17T09:45:06Z

...or-paimon/src/test/java/org/apache/flink/cdc/connectors/paimon/sink/v2/PaimonSinkITCase.java

+        // It's possible that flink job will restore from a checkpoint with only step#1 finished and
+        // step#2 not.
+        // CommitterOperator will try to re-commit recovered transactions.
+        committer.commit(commitRequests);


Thanks for adding this. What about running insert and commit many times (in a for loop) to simulate more complex situations, including ones with compaction?

Considering there is another issue in PaimonWriter (https://issues.apache.org/jira/browse/FLINK-36541), you can skip this for now if adding the loop causes problems.
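
A rough shape for such a loop, written against a toy in-memory table rather than the real PaimonCommitter: every name below (`RepeatedRecommitSketch`, `commitOnce`, `run`) is hypothetical, and compaction is not modeled. Each round commits once and then replays the same commit, as a restored job would, and the file count is expected to match the number of rounds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy loop test: hypothetical names, not the PaimonSinkITCase code. */
class RepeatedRecommitSketch {

    /** Snapshot log keyed by checkpoint id; a stand-in for a real table. */
    private final TreeMap<Long, List<String>> snapshots = new TreeMap<>();

    /** Records the files for a checkpoint unless that checkpoint is already present. */
    boolean commitOnce(long checkpointId, List<String> files) {
        if (snapshots.containsKey(checkpointId)) {
            return false; // replayed commit, nothing to do
        }
        snapshots.put(checkpointId, new ArrayList<>(files));
        return true;
    }

    /** Total number of data files visible in the table. */
    int totalFiles() {
        int count = 0;
        for (List<String> files : snapshots.values()) {
            count += files.size();
        }
        return count;
    }

    /** Runs the given number of insert-and-commit rounds, replaying every commit once. */
    static RepeatedRecommitSketch run(int rounds) {
        RepeatedRecommitSketch table = new RepeatedRecommitSketch();
        for (long checkpointId = 1; checkpointId <= rounds; checkpointId++) {
            List<String> files = new ArrayList<>();
            files.add("data-" + checkpointId + ".orc");
            table.commitOnce(checkpointId, files); // normal commit
            table.commitOnce(checkpointId, files); // re-commit after a simulated restore
        }
        return table;
    }

    public static void main(String[] args) {
        RepeatedRecommitSketch table = run(10);
        System.out.println("rounds=10, files=" + table.totalFiles());
    }
}
```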

github-actions bot added the paimon-pipeline-connector label Oct 12, 2024


beryllw force-pushed the flink-36517 branch from 5088c0c to 4a50a81 Compare October 17, 2024 03:10

[FLINK-36517][cdc-connect][paimon] use filterAndCommit API for Avoid commit the same datafile duplicate

0a7af33

beryllw force-pushed the flink-36517 branch from 4a50a81 to 0a7af33 Compare October 17, 2024 03:12

lvyanquan reviewed Oct 17, 2024
