Found extra duplicate rows after reorg partition #57510

Open
Defined2014 opened this issue Nov 19, 2024 · 3 comments · May be fixed by #57114
Assignees: mjonss
Labels: affects-7.5, affects-8.1, affects-8.5, component/tablepartition, severity/major, type/bug

Comments

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

Run the shell script below against a local TiDB (127.0.0.1:4000). It reports a duplicate-entry error during ADD UNIQUE INDEX.

set -eu

DB="test"
TABLE="t"
TABLE_COUNT=16
ROW_COUNT=100
CONCURRENCY=8

TABLE_COLUMNS='c1 INT, c2 CHAR(255), c3 CHAR(255), c4 CHAR(255), c5 CHAR(255)'

run_sql() {
    mysql --host "127.0.0.1" --port "4000" -u "root" -D "test" -e "$1"
}

insertRecords() {
    for i in $(seq $2 $3); do
        run_sql "INSERT INTO $1 VALUES (\
            $i, \
            REPEAT(' ', 255), \
            REPEAT(' ', 255), \
            REPEAT(' ', 255), \
            REPEAT(' ', 255)\
        );"
    done
}

createTable() {
    run_sql "CREATE TABLE IF NOT EXISTS $DB.$TABLE$1 ($TABLE_COLUMNS) \
        PARTITION BY RANGE(c1) ( \
        PARTITION p0 VALUES LESS THAN (0), \
        PARTITION p1 VALUES LESS THAN ($(expr $ROW_COUNT / 2)));"
    run_sql "ALTER TABLE $DB.$TABLE$1 ADD PARTITION (PARTITION p2z VALUES LESS THAN (MAXVALUE));"
}

echo "load database $DB"
run_sql "CREATE DATABASE IF NOT EXISTS $DB;"
for i in $(seq $TABLE_COUNT); do
  createTable "${i}" &
done
wait


# Per table: start $CONCURRENCY background inserters, then run DDL while they are still inserting.
for i in $(seq $TABLE_COUNT); do
    for j in $(seq $CONCURRENCY); do
        insertRecords $DB.$TABLE${i} $(expr $ROW_COUNT / $CONCURRENCY \* $(expr $j - 1) + 1) $(expr $ROW_COUNT / $CONCURRENCY \* $j) &
    done
    if [ $((i % 4)) -eq 0 ]; then
            run_sql "ALTER TABLE $DB.$TABLE${i} REMOVE PARTITIONING"
    fi
    if [ $((i % 2)) -eq 0 ]; then
            run_sql "ALTER TABLE $DB.$TABLE${i} \
                PARTITION BY RANGE(c1) ( \
                PARTITION p0 VALUES LESS THAN (0), \
                PARTITION p1 VALUES LESS THAN ($(expr $ROW_COUNT / 3)), \
                PARTITION p2y VALUES LESS THAN ($(expr $ROW_COUNT \* 2 / 3)), \
                PARTITION p3 VALUES LESS THAN (MAXVALUE));"
    fi
done
wait

for i in $(seq $TABLE_COUNT); do
    run_sql "ALTER TABLE $DB.$TABLE${i} ADD UNIQUE INDEX idx(c1) GLOBAL" &
done
wait

2. What did you expect to see? (Required)

No errors, and every table has 96 rows.
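
The figure of 96 follows from the script's integer arithmetic: `expr 100 / 8` truncates to 12, so the 8 inserters per table cover ids 1-12, 13-24, and so on up to 85-96, and ids 97-100 are never inserted. A minimal sketch that reproduces the ranges (the helper name `slice_bounds` is hypothetical and not part of the original script):

```shell
#!/bin/sh
ROW_COUNT=100
CONCURRENCY=8

# Print "start end" for inserter $1 (1-based), mirroring the expr calls in the repro script.
slice_bounds() {
    per_worker=$((ROW_COUNT / CONCURRENCY))   # integer division: 100 / 8 = 12
    echo "$(( per_worker * ($1 - 1) + 1 )) $(( per_worker * $1 ))"
}

# Sum the size of every slice to get the expected row count per table.
total=0
for j in $(seq "$CONCURRENCY"); do
    set -- $(slice_bounds "$j")
    total=$(( total + $2 - $1 + 1 ))
done
echo "rows per table: $total"
```

Any table that ends up with more than this total therefore contains at least one duplicated c1 value.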

3. What did you see instead (Required)

ADD UNIQUE INDEX failed with duplicate-entry errors, and some tables had more than 96 rows.

ERROR 1062 (23000) at line 1: Duplicate entry '10' for key 't16.idx'
ERROR 1062 (23000) at line 1: Duplicate entry '11' for key 't8.idx'
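
Both tables named in these sample errors, t8 and t16, have indexes that are multiples of 4, so the script runs REMOVE PARTITIONING and then the PARTITION BY RANGE repartitioning on them while the background inserts are still in flight. Two error lines are too few to draw a conclusion from, but the grouping is easy to print when checking which tables are affected on a given run. A small sketch (the function name `ddl_for_table` is hypothetical):

```shell
TABLE_COUNT=16

# Echo which ALTER TABLE statements the repro loop issues for table index $1.
ddl_for_table() {
    ddl=""
    if [ $(( $1 % 4 )) -eq 0 ]; then ddl="REMOVE PARTITIONING"; fi
    if [ $(( $1 % 2 )) -eq 0 ]; then ddl="${ddl:+$ddl, }PARTITION BY RANGE"; fi
    echo "${ddl:-none}"
}

for i in $(seq "$TABLE_COUNT"); do
    echo "t$i: $(ddl_for_table "$i")"
done
```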

4. What is your TiDB version? (Required)

mysql> select tidb_version();
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tidb_version()                                                                                                                                                                                                                                                          |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Release Version: v8.5.0-alpha-93-ga4faee2ee5
Edition: Community
Git Commit Hash: a4faee2ee591446898c921f564aa0f1a14813371
Git Branch: master
UTC Build Time: 2024-11-12 07:37:18
GoVersion: go1.23.1
Race Enabled: false
Check Table Before Drop: false
Store: unistore |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Defined2014 added the type/bug, severity/major, and component/tablepartition labels on Nov 19, 2024
mjonss self-assigned this on Nov 19, 2024

mjonss commented Nov 19, 2024

I think the concurrent inserts might be the problem: running run_sql "select c1, count(1) from $DB.$TABLE${i} group by c1 having count(1) > 1" before the ADD INDEX also finds duplicates, so the duplicate rows already exist before the index is built.
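
The check described above can be run across all tables before the ADD INDEX step. A sketch, assuming the DB, TABLE and TABLE_COUNT values from the reproduce script; `dup_check_sql` is a hypothetical helper that only builds the query text, so it can be inspected without a running TiDB:

```shell
DB="test"
TABLE="t"
TABLE_COUNT=16

# Build the duplicate-detection query for table index $1.
dup_check_sql() {
    echo "SELECT c1, COUNT(1) FROM $DB.$TABLE$1 GROUP BY c1 HAVING COUNT(1) > 1"
}

# Print one query per table created by the reproduce script.
for i in $(seq "$TABLE_COUNT"); do
    dup_check_sql "$i"
done
```

Each printed statement can be passed to the script's run_sql function, for example run_sql "$(dup_check_sql 8)". An empty result means that table has no duplicated c1 values.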

mjonss commented Nov 19, 2024

Fixed by #57114

mjonss commented Nov 19, 2024

This is a regression from #53770, and is fixed by #57114.
