sync-diff-inspector: Error "context deadline exceeded" #850

Open
mengxian-li opened this issue Feb 25, 2025 · 1 comment

mengxian-li commented Feb 25, 2025

Bug Report

Please answer these questions before submitting your issue. Thanks!

We use sync-diff-inspector to compare the data of two Aurora clusters.
We see the following error in sync-diff-inspector when comparing the rows of large tables, especially when there are many mismatches.

[2025/02/23 23:21:32.208 +00:00] [ERROR] [report.go:404] ["Set table meet error"] [error="context deadline exceeded"] [stack="github.com/pingcap/tidb-tools/sync_diff_inspector/report.(*Report).SetTableMeetError\n\t/app/sync_diff_inspector/report/report.go:404\nmain.(*Diff).consume\n\t/app/sync_diff_inspector/diff.go:457\nmain.(*Diff).Equal.func2\n\t/app/sync_diff_inspector/diff.go:290\ngithub.com/pingcap/tidb-tools/sync_diff_inspector/utils.(*WorkerPool).Apply.func1\n\t/app/sync_diff_inspector/utils/utils.go:94"]

The context deadline error can occur during the checksum comparison phase when a query takes too long.
We tried increasing the query timeout and using larger instance types, which reduced the errors but did not fix the underlying problem.

We should consider improving the parallelism of the checksum and row data comparison.

Contributor

joechenrh commented Feb 26, 2025

Hi, in sync-diff-inspector, we have already implemented concurrent data comparison. Here's an overview of the whole process:

  1. Chunk Division: Each table is divided into multiple chunks. If the table has an index, we use the index columns to split it into chunks. If the chunk size (chunk-size) is not explicitly specified in the configuration file, we calculate the number of chunks as max(rowCount/10000, 10000). However, if the table doesn't have any index, we treat the whole table as one chunk.
  2. Concurrent Chunk Checking: sync-diff-inspector checks the chunks concurrently. The number of concurrent checks is set by check-thread-count, which defaults to 4.
  3. Mismatch Data Check: As in the first step, if the table has indexes, sync-diff-inspector uses them to binary-search for the mismatched rows; otherwise, it compares the data in each chunk row by row.
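
As a rough illustration of steps 1 and 2, here is a minimal Go sketch. It is not the actual sync-diff-inspector code: `chunkCount` and `checkChunks` are hypothetical names, the formula is copied from the description above, and the `check` callback stands in for the per-chunk checksum comparison.

```go
package main

import (
	"fmt"
	"sync"
)

// chunkCount applies the rule described above: one chunk when the table
// has no index, otherwise max(rowCount/10000, 10000).
func chunkCount(rowCount int64, hasIndex bool) int64 {
	if !hasIndex {
		return 1
	}
	n := rowCount / 10000
	if n < 10000 {
		n = 10000
	}
	return n
}

// checkChunks runs `check` over chunk IDs [0, n) using at most `threads`
// workers (the role played by check-thread-count) and returns the IDs of
// the chunks for which check reported a mismatch (returned false).
func checkChunks(n int64, threads int, check func(id int64) bool) []int64 {
	jobs := make(chan int64)
	var (
		mu         sync.Mutex
		mismatched []int64
		wg         sync.WaitGroup
	)
	for w := 0; w < threads; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				if !check(id) {
					mu.Lock()
					mismatched = append(mismatched, id)
					mu.Unlock()
				}
			}
		}()
	}
	for id := int64(0); id < n; id++ {
		jobs <- id
	}
	close(jobs)
	wg.Wait()
	return mismatched
}

func main() {
	n := chunkCount(500000000, true)
	// Pretend every chunk whose ID ends in 7 modulo 20000 mismatches.
	bad := checkChunks(n, 4, func(id int64) bool { return id%20000 != 7 })
	fmt.Printf("chunks=%d mismatched=%d\n", n, len(bad))
}
```

In this sketch, a table without an index ends up as a single chunk, so only one worker has anything to do regardless of check-thread-count, and that one chunk's query covers the whole table.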

Could you provide the configuration you used and table schema if possible?
