sync-diff-inspector: Error "context deadline exceeded" #850

mengxian-li · 2025-02-25T07:50:40Z

Bug Report

We use sync-diff-inspector to compare the data of two Aurora clusters.
We have seen the following error in sync-diff-inspector when comparing table rows for large tables, especially when there are a massive number of mismatches.

[2025/02/23 23:21:32.208 +00:00] [ERROR] [report.go:404] ["Set table meet error"] [error="context deadline exceeded"] [stack="github.com/pingcap/tidb-tools/sync_diff_inspector/report.(*Report).SetTableMeetError\n\t/app/sync_diff_inspector/report/report.go:404\nmain.(*Diff).consume\n\t/app/sync_diff_inspector/diff.go:457\nmain.(*Diff).Equal.func2\n\t/app/sync_diff_inspector/diff.go:290\ngithub.com/pingcap/tidb-tools/sync_diff_inspector/utils.(*WorkerPool).Apply.func1\n\t/app/sync_diff_inspector/utils/utils.go:94"]

The context deadline error can occur in the checksum comparison phase when a query takes too long.
We have tried increasing the query timeout and using larger instance types, which reduced the errors but did not fix the underlying problem.

We should consider improving the parallelism of the checksum and row data comparison.

joechenrh · 2025-02-26T05:59:05Z

Hi, in sync-diff-inspector, we have already implemented concurrent data comparison. Here's an overview of the whole process:

1. Chunk Division: Each table is divided into multiple chunks. If the table has any index, we use the index column to split it into chunks. If the chunk size (chunk-size) is not explicitly specified in the configuration file, we calculate the number of chunks as max(rowCount/10000, 10000). However, if the table doesn't have any index, we treat the whole table as one chunk.
2. Concurrent Chunk Checking: sync-diff-inspector checks the chunks concurrently. The concurrency is set by check-thread-count, which defaults to 4.
3. Mismatch Data Check: As in the first step, if the table has indexes, sync-diff-inspector uses them to do a binary search; otherwise, it compares the data in each chunk row by row.

Could you provide the configuration you used and table schema if possible?
