Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errors while trying to setup mysql replica #329

Open
aszenz opened this issue Dec 26, 2024 · 15 comments
Open

Errors while trying to setup mysql replica #329

aszenz opened this issue Dec 26, 2024 · 15 comments

Comments

@aszenz
Copy link

aszenz commented Dec 26, 2024

Great work on the the project, I'm trying to replicate a mysql db, but got these errors:

the table does not support check constraint operations: order_claims_orderlines
While executing DDL script for `central`.`messenger_messages`: duckdb: Sequence Error: currval: sequence is not yet defined in this session
@fanyang01
Copy link
Collaborator

Hi @aszenz,

The first error message appears to be related to issue #38; I'll investigate it shortly. The second error message seems to involve AUTO_INCREMENT. Could you please share details about the table structure of central.messenger_messages? Specifically, the lines related to AUTO_INCREMENT from the SHOW CREATE TABLE output would be very helpful.

@aszenz
Copy link
Author

aszenz commented Dec 26, 2024

Could you please share details about the table structure of central.messenger_messages? Specifically, the lines related to AUTO_INCREMENT from the SHOW CREATE TABLE output would be very helpful.

Sure:

CREATE TABLE `messenger_messages` (                                
   `id` bigint NOT NULL AUTO_INCREMENT,                             
   `body` longtext NOT NULL,                                        
   `headers` longtext NOT NULL,                                     
   `queue_name` varchar(190) NOT NULL,                              
   `created_at` datetime(6) NOT NULL,                               
   `available_at` datetime(6) NOT NULL,                             
   `delivered_at` datetime(6) DEFAULT NULL,                         
   PRIMARY KEY (`id`),                                              
   KEY `IDX_75EA56E0FB7336F0` (`queue_name`),                       
   KEY `IDX_75EA56E0E3BD61CE` (`available_at`),                     
   KEY `IDX_75EA56E016BA31DB` (`delivered_at`)                      
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

This error also occurs for some other tables which use int as primary key auto_increment

@fanyang01
Copy link
Collaborator

Thank you so much! I’ve addressed the issues with auto-increment columns and check constraints in PRs #333 and #335. The reported errors should now be resolved, though I’m not entirely certain. We would greatly appreciate it if you could try the latest Docker image again and let us know if you encounter any further issues.

@aszenz
Copy link
Author

aszenz commented Dec 28, 2024

We would greatly appreciate it if you could try the latest Docker image again and let us know if you encounter any further issues.

Thanks for the quick fix, just tried with the latest image, it now runs further than before but ends up with these errors:

docker run \
       --network qt_default \
       -p 127.0.0.1:13306:3306 \
       --env=SETUP_MODE=REPLICA \
       --env=SOURCE_DSN="mysql://root@mysql2:3306/demo" \
       apecloud/myduckserver:latest
Unable to find image 'apecloud/myduckserver:latest' locally
latest: Pulling from apecloud/myduckserver
fd674058ff8f: Pull complete 
0765b1d0d4fe: Pull complete 
979e44dddad9: Pull complete 
b24722419621: Pull complete 
c46692f011d8: Pull complete 
d1c965e99ff1: Pull complete 
149c09a4fd78: Pull complete 
4f4fb700ef54: Pull complete 
c19d8de169ab: Pull complete 
2b3d71e2e617: Pull complete 
082505547a7e: Pull complete 
5876914e276c: Pull complete 
2dd9d274b116: Pull complete 
Digest: sha256:22c0c373985b46346c5809da41915639cdd40789579dc5d2d88befa5c3da4921
Status: Downloaded newer image for apecloud/myduckserver:latest
Starting MyDuck Server in REPLICA mode...
SOURCE_TYPE=MYSQL
SOURCE_USER=root
SOURCE_PASSWORD=
SOURCE_HOST=mysql2
SOURCE_PORT=3306
SOURCE_DATABASE=demo
Waiting for MyDuck Server at 127.0.0.1:3306 to be ready...
time="2024-12-28T10:01:17Z" level=info msg="Server ready. Accepting connections."
time="2024-12-28T10:01:17Z" level=warning msg="secure_file_priv is set to \"\", which is insecure."
time="2024-12-28T10:01:17Z" level=warning msg="Any user with GRANT FILE privileges will be able to read any file which the sql-server process can read."
time="2024-12-28T10:01:17Z" level=warning msg="Please consider restarting the server with secure_file_priv set to a safe (or non-existent) directory."
time="2024-12-28T10:01:17Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=1
time="2024-12-28T10:01:17Z" level=info msg=ConnectionClosed connectionID=1
MyDuck Server is ready!
Replicating MySQL primary server: DSN=mysql://root@mysql2:3306/demo ...
mysqlsh is already installed.
Checking if replication has already been started...
Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
time="2024-12-28T10:01:18Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=2
WARNING: Using a password on the command line interface can be insecure.
time="2024-12-28T10:01:18Z" level=info msg=ConnectionClosed connectionID=2
Checking MySQL configuration...
Checking MySQL server parameters...
Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
WARNING: Using a password on the command line interface can be insecure.
GTID_MODE: OFF
MySQL server parameters are correctly configured.
Checking privileges for the current user 'root'...
Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
WARNING: Using a password on the command line interface can be insecure.
Current user 'root' has all required privileges.
Preparing MyDuck Server for replication...
Setting local_infile and server_id...
Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
time="2024-12-28T10:01:21Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=3
WARNING: Using a password on the command line interface can be insecure.
time="2024-12-28T10:01:21Z" level=info msg=ConnectionClosed connectionID=3
Checking if source server supports MySQL Shell...
Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
WARNING: Using a password on the command line interface can be insecure.
Copying a snapshot of the MySQL instance to MyDuck Server...
Detected core count: 8
Thread count set to: 16
Copying data from MySQL to MyDuck...
Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
WARNING: Using a password on the command line interface can be insecure.
time="2024-12-28T10:01:23Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=4
Copying DDL and Data from in-memory FS, source: adad4e57605f:3306, target: :3306.
SRC: Initializing...
SRC: Initializing - done
SRC: Gathering information...
SRC: 2 out of 6 schemas will be dumped and within them 251 tables, 0 views.
SRC: Gathering information - done
WARNING: SRC: The dumped value of gtid_executed is not guaranteed to be consistent
SRC: Writing global DDL files
SRC: Writing schema metadata...
SRC: Writing DDL...
SRC: Writing table metadata...
SRC: Running data dump using 16 threads.
SRC: Dumping data...
TGT: Opening dump...
TGT: Opening dump - done
NOTE: TGT: Dump is still ongoing, data will be loaded as it becomes available.
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=6
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=5
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=8
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=7
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=10
TGT: Target is MySQL 8.0.23. Dump was produced from MySQL 8.4.3
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=13
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=12
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=15
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=11
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=9
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=17
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=14
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=16
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=21
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=19
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=22
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=20
time="2024-12-28T10:01:25Z" level=info msg=NewConnection DisableClientMultiStatements=false connectionID=18
TGT: Scanning metadata...
TGT: Scanning metadata - done
TGT: Waiting for metadata to become available...
NOTE: SRC: Table statistics not available for `central`.`tenant_configuration_options`, chunking operation may be not optimal. Please consider running 'ANALYZE TABLE `central`.`tenant_configuration_options`;' first.
NOTE: SRC: Table statistics not available for `demo`.`accounting_setups`, chunking operation may be not optimal. Please consider running 'ANALYZE TABLE `demo`.`accounting_setups`;' first.
NOTE: SRC: Table statistics not available for `demo`.`datumbeperking`, chunking operation may be not optimal. Please consider running 'ANALYZE TABLE `demo`.`datumbeperking`;' first.
NOTE: SRC: Table statistics not available for `demo`.`artikelgroep`, chunking operation may be not optimal. Please consider running 'ANALYZE TABLE `demo`.`artikelgroep`;' first.
NOTE: SRC: Table statistics not available for `demo`.`accounting_setup_configurations`, chunking operation may be not optimal. Please consider running 'ANALYZE TABLE `demo`.`accounting_setup_configurations`;' first.
NOTE: SRC: Table statistics not available for `demo`.`accounts_subsidiaries`, chunking operation may be not optimal. Please consider running 'ANALYZE TABLE `demo`.`accounts_subsidiaries`;' first.
NOTE: SRC: Table statistics not available for `demo`.`classification`, chunking operation may be not optimal. Please consider running 'ANALYZE TABLE `demo`.`classification`;' first.
NOTE: SRC: Table statistics not available for `demo`.`account`, chunking operation may be not optimal. Please consider running 'ANALYZE TABLE `demo`.`account`;' first.
SRC: Writing schema metadata - done
SRC: Writing DDL - done
SRC: Writing table metadata - done
SRC: Starting data dump
NOTE: SRC: Table statistics not available for `demo`.`user_timezones`, chunking operation may be not optimal. Please consider running 'ANALYZE TABLE `demo`.`user_timezones`;' first.
NOTE: SRC: Table statistics not available for `demo`.`standaardwaarde`, chunking operation may be not optimal. Please consider running 'ANALYZE TABLE `demo`.`standaardwaarde`;' first.
NOTE: SRC: Table statistics not available for `demo`.`log`, chunking operation may be not optimal. Please consider running 'ANALYZE TABLE `demo`.`log`;' first.
NOTE: SRC: Table statistics not available for `demo`.`news`, chunking operation may be not optimal. Please consider running 'ANALYZE TABLE `demo`.`news`;' first.
TGT: Scanning metadata...
TGT: Scanning metadata - done
TGT: Checking for pre-existing objects...
TGT: Checking for pre-existing objects - done
TGT: Executing common preamble SQL...
TGT: Executing common preamble SQL - done
TGT: Executing DDL...
TGT: Executing DDL - done
TGT: Executing view DDL...
TGT: Executing view DDL - done
TGT: Loading data...
TGT: Starting data load
time="2024-12-28T10:02:13Z" level=info msg="Creating temp table central_marketplace_archives_12" connectTime="2024-12-28 10:01:25.678704517 +0000 UTC m=+9.269289515" connectionDb=central connectionID=12 db=central table=marketplace_archives
ERROR: TGT: [Worker001]: central@marketplace_archives@@0.tsv: MySQL Error 1105 (HY000): Binder Error: There are no UNIQUE/PRIMARY KEY Indexes that refer to this table, ON CONFLICT is a no-op: LOAD DATA LOCAL INFILE 'memory/central@marketplace_archives@@0.tsv' REPLACE INTO TABLE `central`.`marketplace_archives` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (@`id`, `archived_for_company`, `archived_date`, `archived_by_user`, @`marketplace_id`) SET `id` = FROM_BASE64(@`id`),`marketplace_id` = FROM_BASE64(@`marketplace_id`)
ERROR: TGT: Aborting load...
time="2024-12-28T10:02:13Z" level=info msg="Creating temp table central_marketplace_status_11" connectTime="2024-12-28 10:01:25.690606024 +0000 UTC m=+9.281191032" connectionDb=central connectionID=11 db=central table=marketplace_status
time="2024-12-28T10:02:13Z" level=info msg="Creating temp table demo_outgoing_product_certificates_8" connectTime="2024-12-28 10:01:25.643999571 +0000 UTC m=+9.234584593" connectionDb=demo connectionID=8 db=demo table=outgoing_product_certificates
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=12
ERROR: TGT: [Worker000]: central@marketplace_status@@0.tsv: MySQL Error 1105 (HY000): Binder Error: There are no UNIQUE/PRIMARY KEY Indexes that refer to this table, ON CONFLICT is a no-op: LOAD DATA LOCAL INFILE 'memory/central@marketplace_status@@0.tsv' REPLACE INTO TABLE `central`.`marketplace_status` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (@`id`, `seen_for_company`, `seen_date`, `seen_by_user`, @`marketplace_id`) SET `id` = FROM_BASE64(@`id`),`marketplace_id` = FROM_BASE64(@`marketplace_id`)
time="2024-12-28T10:02:13Z" level=warning msg="error running query" connectTime="2024-12-28 10:01:25.671708543 +0000 UTC m=+9.262293546" connectionDb=central connectionID=10 error="Conversion Error: CSV Error on Line: 163\nOriginal Line: _ORDER_LOADINGDATE_RESTRICTION\tmessages\t{\"de\": \"Das Ladedatum darf nicht kleiner oder gleich dem eingeschränkten Ladedatensatz im Bildschirm \\\\\"Verschiedenes\\\\\" sein\", \"en\": \"Loading date cannot be less than or equal to the restricted loading date set in root data / date limitation screen\", \"es\": \"Loading Date cannot be less than or equal to the restricted loading date set in miscelaneous screen\", \"fr\": \"Loading Date cannot be less than or equal to the restricted loading date set in miscelaneous screen\", \"nl\": \"Laaddatum mag niet kleiner zijn dan of gelijk zijn aan de beperkte laadgegevensset in het scherm stamgegevens\"}\nError when converting column \"translations\". Could not convert string \"{\"de\": \"Das Ladedatum darf nicht kleiner oder gleich dem eingeschränkten Ladedatensatz im Bildschirm \\\\\"Verschiedenes\\\\\" sein\", \"en\": \"Loading date cannot be less than or equal to the restricted loading date set in root data / date limitation screen\", \"es\": \"Loading Date cannot be less than or equal to the restricted loading date set in miscelaneous screen\", \"fr\": \"Loading Date cannot be less than or equal to the restricted loading date set in miscelaneous screen\", \"nl\": \"Laaddatum mag niet kleiner zijn dan of gelijk zijn aan de beperkte laadgegevensset in het scherm stamgegevens\"}\" to 'JSON'\n\nColumn translations is being converted as type VARCHAR\nThis type was auto-detected from the CSV file.\nPossible solutions:\n* Override the type for this column manually by setting the type explicitly, e.g. types={'translations': 'VARCHAR'}\n* Set the sample size to a larger value to enable the auto-detection to scan more values, e.g. sample_size=-1\n* Use a COPY statement to automatically derive types from an existing table.\n\n  file = pipes/load-data/10.pipe\n  delimiter = \t (Set By User)\n  quote = \\0 (Set By User)\n  escape = \\0 (Set By User)\n  new_line = \\n (Set By User)\n  header = false (Set By User)\n  skip_rows = 0 (Auto-Detected)\n  comment = \\0 (Auto-Detected)\n  date_format =  (Auto-Detected)\n  timestamp_format =  (Auto-Detected)\n  null_padding = 1\n  sample_size = 20480\n  ignore_errors = false\n  all_varchar = 0\n\n"
ERROR: TGT: [Worker005]: demo@outgoing_product_certificates@@0.tsv: MySQL Error 1105 (HY000): Binder Error: There are no UNIQUE/PRIMARY KEY Indexes that refer to this table, ON CONFLICT is a no-op: LOAD DATA LOCAL INFILE 'memory/demo@outgoing_product_certificates@@0.tsv' REPLACE INTO TABLE `demo`.`outgoing_product_certificates` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (@`id`, `uitslag_id`, `quantity`, `pos_number`, `last_supplier`, `origin`, @`feedstock_id`, `country_id`, `produced_location_id`, `gram_co2_eec`, `gram_co2_ep`, `gram_co2_etd`, `energy_content`) SET `feedstock_id` = FROM_BASE64(@`feedstock_id`),`id` = FROM_BASE64(@`id`)
ERROR: TGT: [Worker009]: central@translations@@0.tsv: MySQL Error 1105 (HY000): Conversion Error: CSV Error on Line: 163
Original Line: _ORDER_LOADINGDATE_RESTRICTION	messages	{"de": "Das Ladedatum darf nicht kleiner oder gleich dem eingeschränkten Ladedatensatz im Bildschirm \\"Verschiedenes\\" sein", "en": "Loading date cannot be less than or equal to the restricted loading date set in root data / date limitation screen", "es": "Loading Date cannot be less than or equal to the restricted loading date set in miscelaneous screen", "fr": "Loading Date cannot be less than or equal to t: LOAD DATA LOCAL INFILE 'memory/central@translations@@0.tsv' REPLACE INTO TABLE `central`.`translations` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (`code`, `domain`, `translations`)
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=8
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=10
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=11
time="2024-12-28T10:02:13Z" level=info msg="Creating temp table central_tenant_sessions_22" connectTime="2024-12-28 10:01:25.679284885 +0000 UTC m=+9.269869888" connectionDb=central connectionID=22 db=central table=tenant_sessions
time="2024-12-28T10:02:13Z" level=info msg="Creating temp table central_marketplace_21" connectTime="2024-12-28 10:01:25.685221079 +0000 UTC m=+9.275806084" connectionDb=central connectionID=21 db=central table=marketplace
ERROR: TGT: [Worker012]: central@tenant_sessions@@0.tsv: MySQL Error 1105 (HY000): Binder Error: There are no UNIQUE/PRIMARY KEY Indexes that refer to this table, ON CONFLICT is a no-op: LOAD DATA LOCAL INFILE 'memory/central@tenant_sessions@@0.tsv' REPLACE INTO TABLE `central`.`tenant_sessions` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (@`uuid`, `tenant_name`, `created_at`, `usage_in_seconds`, `details`) SET `uuid` = FROM_BASE64(@`uuid`)
ERROR: TGT: [Worker011]: central@marketplace@@0.tsv: MySQL Error 1105 (HY000): Binder Error: There are no UNIQUE/PRIMARY KEY Indexes that refer to this table, ON CONFLICT is a no-op: LOAD DATA LOCAL INFILE 'memory/central@marketplace@@0.tsv' REPLACE INTO TABLE `central`.`marketplace` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (@`id`, `subsidiary`, `advertisement_title`, `created_at`, `last_updated_at`, `advertise_details`, `company`, `posted_by`, `last_updated_by`, `advertisement_type`, `phone`, `email`) SET `id` = FROM_BASE64(@`id`)
time="2024-12-28T10:02:13Z" level=info msg="Creating temp table central_preview_template_print_preferences_13" connectTime="2024-12-28 10:01:25.681757484 +0000 UTC m=+9.272342488" connectionDb=central connectionID=13 db=central table=preview_template_print_preferences
time="2024-12-28T10:02:13Z" level=info msg="Creating temp table demo_transport_shipping_contract_numbers_7" connectTime="2024-12-28 10:01:25.66564644 +0000 UTC m=+9.256231441" connectionDb=demo connectionID=7 db=demo table=transport_shipping_contract_numbers
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=21
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=22
ERROR: TGT: [Worker008]: demo@transport_shipping_contract_numbers@@0.tsv: MySQL Error 1105 (HY000): Binder Error: There are no UNIQUE/PRIMARY KEY Indexes that refer to this table, ON CONFLICT is a no-op: LOAD DATA LOCAL INFILE 'memory/demo@transport_shipping_contract_numbers@@0.tsv' REPLACE INTO TABLE `demo`.`transport_shipping_contract_numbers` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (@`id`, `transporter_relation_id`, `shipping_line`, `service_contract_number`) SET `id` = FROM_BASE64(@`id`)
ERROR: TGT: [Worker007]: central@preview_template_print_preferences@@0.tsv: MySQL Error 1105 (HY000): Binder Error: There are no UNIQUE/PRIMARY KEY Indexes that refer to this table, ON CONFLICT is a no-op: LOAD DATA LOCAL INFILE 'memory/central@preview_template_print_preferences@@0.tsv' REPLACE INTO TABLE `central`.`preview_template_print_preferences` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (@`id`, @`preview_template_id`, `printer_name`, `paper_type`, `duplex_type`, `copies`, `collate_used`) SET `id` = FROM_BASE64(@`id`),`preview_template_id` = FROM_BASE64(@`preview_template_id`)
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=7
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=13
TGT: Loading data - done
TGT: No data loaded.
TGT: 253 DDL files were executed in 26 sec.
TGT: Total duration: 47 sec
TGT: 8 errors and 0 warnings were reported during the load.
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=18
SRC: Dumping data - done
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=14
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=16
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=19
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=17
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=5
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=9
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=15
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=6
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=20
time="2024-12-28T10:02:13Z" level=info msg=ConnectionClosed connectionID=4
ERROR: MYSQLSH 53005: Error loading dump
BINLOG_FILE or BINLOG_POS is empty, exiting.

For the first issue I found a related issue in duckdb:

duckdb/duckdb-postgres#278

Second issue regarding csv error seems like column type auto-inference is wrong while loading the dump, maybe we can use sample_size=-1

A strange thing I noticed is that mysql db server has two db's (central and demo) but in the SOURCE_DSN I only specified demo still myduckserver tries to replicate both databases.

@fanyang01
Copy link
Collaborator

Hi @aszenz,

Apologies for the delayed reply—I've been traveling.

I’ve addressed the "Binder Error: There are no UNIQUE/PRIMARY KEY Indexes that refer to this table" issue and added support for database/table replication filters in PR #336.

I also tried to find a workaround for replicating your central.translations table. However, this table cannot currently be read by either the CSV reader in DuckDB v1.1 (fast path) or the go-mysql-server framework (slow path). The problematic row is:

_ORDER_LOADINGDATE_RESTRICTION	messages	{"de": "Das Ladedatum darf nicht kleiner oder gleich dem eingeschränkten Ladedatensatz im Bildschirm \\"Verschiedenes\\" sein", "en": "Loading date cannot be less than or equal to the restricted loading date set in root data / date limitation screen", "es": "Loading Date cannot be less than or equal to the restricted loading date set in miscelaneous screen", "fr": "Loading Date cannot be less than or equal to the restricted loading date set in miscelaneous screen", "nl": "Laaddatum mag niet kleiner zijn dan of gelijk zijn aan de beperkte laadgegevensset in het scherm stamgegevens"}

This row contains a JSON field with a string: {"de": "...\\"Verschiedenes\\"..."}. Normally, the double backslashes should be single ones. However, MySQL and MySQL Shell escape single backslashes into doubles by default during export. Parsing such unquoted escaped strings is not supported by DuckDB v1.1.

Fortunately, I’ve contributed a fix for this to DuckDB (duckdb/duckdb#14464), and it will be available in the upcoming DuckDB v1.2 release, scheduled for 2025-01-13. I’ve confirmed that the latest nightly build of DuckDB can read such fields correctly:

➜ duckdb-nightly  
v1.1.4-dev3741 ab8c909857  
Enter ".help" for usage hints.  
Connected to a transient in-memory database.  
Use ".open FILENAME" to reopen on a persistent database.  

D .mode line  
D from read_csv('testdata/issue329.tsv', header = false, columns = {'code': 'VARCHAR', 'domain': 'VARCHAR', 'translations': 'JSON'}, sep = '\t', escape = '\');  

        code = _ORDER_LOADINGDATE_RESTRICTION  
      domain = messages  
translations = {"de": "Das Ladedatum darf nicht kleiner oder gleich dem eingeschränkten Ladedatensatz im Bildschirm \"Verschiedenes\" sein", "en": "Loading date cannot be less than or equal to the restricted loading date set in root data / date limitation screen", "es": "Loading Date cannot be less than or equal to the restricted loading date set in miscelaneous screen", "fr": "Loading Date cannot be less than or equal to the restricted loading date set in miscelaneous screen", "nl": "Laaddatum mag niet kleiner zijn dan of gelijk zijn aan de beperkte laadgegevensset in het scherm stamgegevens"}  

To work around this before the new release, I’ve implemented database and table replication filtering. You can now specify ?skip-tables=central.translations in the DSN to exclude this table from replication. Feel free to give it a try!

A strange thing I noticed is that the MySQL database server has two databases (central and demo), but in the SOURCE_DSN, I only specified demo. Still, MyDuckServer tries to replicate both databases.

Thank you for pointing this out! With the changes in PR #336, the database name specified in the DSN is now automatically added to the database filter. As a result, only data from this database (and any other databases explicitly specified via ?schemas=xxx,yyy) will be copied and replicated.

Let me know if you have any further questions or issues!

@aszenz
Copy link
Author

aszenz commented Jan 5, 2025

@fanyang01

Great work on fixing the previous issues so quickly, I now manage to get the central db replica up and running after ignoring translations table but for the demo db I get new errors:

time="2025-01-05T21:00:24Z" level=warning msg="error running query" connectTime="2025-01-05 21:00:01.644917556 +0000 UTC m=+7.451557494" connectionDb=demo connectionID=16 error="Parser Error: syntax error at or near \"phprapporten\""
ERROR: TGT: [Worker014]: demo@[email protected]: MySQL Error 1105 (HY000): Parser Error: syntax error at or near "phprapporten": LOAD DATA LOCAL INFILE 'memory/demo@[email protected]' REPLACE INTO TABLE `demo`.`peildatum` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (`datum`, `vanaf`, `tot`, `doel`)

The schema for this table peildatum:

CREATE TABLE `peildatum` (                                                                                                  
  `datum` date DEFAULT NULL,                                                                                                
  `vanaf` date DEFAULT NULL,                                                                                                
  `tot` date DEFAULT NULL,                                                                                                  
  `doel` enum('phprapporten','excelrapporten','opslagkosten') CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci 

It seems to fail here at the enum value phprapporten for the doel column

Similar errors occur for other tables with enum columns.

Note that the table peildatum contains no records.

@fanyang01
Copy link
Collaborator

Thank you so much for your continued testing and verification of our bug fixes! I'm glad to hear you were able to get the central db replica up and running after excluding the translations table.

Regarding the new errors with the demo db, the issue appears to stem from the fact that duckdb does not accept columns={..., doel: ENUM('xxx', 'yyy', 'zzz')} as a type hint for its read_csv function. The problem has been identified and addressed in #348. Updating to the latest Docker image should resolve the issue.

We appreciate your detailed reports — they help us improve quickly! Let me know if you encounter anything else or need further assistance.

@aszenz
Copy link
Author

aszenz commented Jan 6, 2025

Thanks for the update, I'm facing another enum related issue:

ERROR: TGT: [Worker013]: demo@werkmaatschappij@@0.tsv: MySQL Error 1105 (HY000): Conversion Error: Unimplemented type for cast (BIGINT -> ENUM('cbs_idep_nl', 'destatis_idev_de')): LOAD DATA LOCAL INFILE 'memory/demo@werkmaatschappij@@0.tsv' REPLACE INTO TABLE `demo`.`werkmaatschappij` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (`code`, `naam`, `adres`, `postcode`, `woonplaats`, `land`, `kvknummer`, `kvkplaats`, `btwnummer`, `prioriteit`, `administratie`, `accountingcurrency_id`, `icra_key`, @`accountingLinkDetails`, `accountingsetup`, `abbr_name`, `abbr_name_files`, `website`, `DunsNumber`, `statisticalexporttype`, `custom_fields`, `inline_attachments`, `documents`, `email_signature`, `automatically_send_to_accounting_software`, `base_url`) SET `accountingLinkDetails` = FROM_BASE64(@`accountingLinkDetails`)
ERROR: TGT: Aborting load...

@fanyang01
Copy link
Collaborator

I have successfully reproduced and resolved the new ENUM issue in #350. Thank you for reporting it!

@aszenz
Copy link
Author

aszenz commented Jan 12, 2025

I have successfully reproduced and resolved the new ENUM issue in #350. Thank you for reporting it!

Great now the replica is up and running at least on the demo db.

I'm still exploring working with myduck, so far I see that views/functions are not replicated, and when i try to recreate them I see syntax errors. My understanding was it supports full mysql syntax, so not sure what's the issue.

This is what happens when i connect to the replica via mycli and try to create a view:

CREATE OR REPLACE SQL SECURITY INVOKER VIEW log_report_count AS (
    SELECT
        report_code,
        report_name,
        report_count
    FROM report_logs
    WHERE report_count > 0
);

I get this:

(1105, "syntax error at position 190 near '0'")

Running the query directly works. Do i need to use duckdb syntax for creating views/functions?

Views are quite important for me since the whole idea of trying out myduck is that I can shift running heavy views to the replica without changing their definitions and get a perf lift.

I also faced some errors related to not using mysql strict mode, unfortunately mysql is quite accepting like accepting empty string for enums, not requiring full group by etc. Not sure myduck wants to support these, which in my experience are quite common in mysql installations.

A related question:

  • Is it possible to issue duckdb syntax queries over mysql connection or do i need to connect via postgres connection for duckdb queries as indicated in the docs

@aszenz
Copy link
Author

aszenz commented Jan 12, 2025

EDIT:

so far I see that views/functions are not replicated

Seems like demo db didn't have any views/functions when I try a real_db with views and functions, i get this error on starting replica:

TGT: Executing DDL...
time="2025-01-12T12:10:49Z" level=warning msg="error running query" connectTime="2025-01-12 12:10:26.709684742 +0000 UTC m=+13.783456517" connectionDb=real_db connectionID=12 error="syntax error at position 23 near 'FUNCTION'"
ERROR: TGT: While processing schema `real_db`: MySQL Error 1105 (HY000): syntax error at position 23 near 'FUNCTION': /*!50003 DROP FUNCTION IF EXISTS `toDate` */
ERROR: TGT: Error processing schema `real_db`: syntax error at position 23 near 'FUNCTION'
ERROR: TGT: [Worker007]: While executing DDL for schema real_db: syntax error at position 23 near 'FUNCTION'
ERROR: TGT: Aborting load...

Here's the output from show create function toDate:

| Function | sql_mode               | Create Function                                            | character_set_client | collation_connection | Database Collation |
|----------|------------------------|------------------------------------------------------------|----------------------|----------------------|--------------------|
| toDate   | NO_ENGINE_SUBSTITUTION | CREATE DEFINER=`root`@`%` FUNCTION `toDate`() RETURNS date | utf8mb4              | utf8mb4_0900_ai_ci   | utf8mb4_0900_ai_ci |
|          |                        |     DETERMINISTIC                                          |                      |                      |                    |
|          |                        |     SQL SECURITY INVOKER                                   |                      |                      |                    |
|          |                        | return IFNULL(@todate, '2999-12-31')                       |                      |                      |                    |

@fanyang01
Copy link
Collaborator

Creating views via a MySQL connection should be supported, as we have test cases for it. Let me investigate what might be going wrong.

Stored routines/functions are not currently considered due to the lack of support for them in DuckDB. I will verify if they are implemented in the go-mysql-server framework. If they are, it should be feasible to implement them; otherwise, it may be challenging due to the complexity involved.

Is it possible to issue DuckDB syntax queries over a MySQL connection?

This is not supported at the moment. I recommend sending raw DuckDB queries via a PostgreSQL connection for now. However, this is a feature I have considered and would like to implement if it is a must-have for you. It will take some time, but it would be a valuable addition.

I also faced some errors related to not using MySQL strict mode.

Yes, non-strict SQL modes are still quite common in real-world systems. We can report this issue to SQLGlot to see if it is possible to resolve it there.

@aszenz
Copy link
Author

aszenz commented Jan 12, 2025

Stored routines/functions are not currently considered due to the lack of support for them in DuckDB. I will verify if they are implemented in the go-mysql-server framework. If they are, it should be feasible to implement them; otherwise, it may be challenging due to the complexity involved.

If routines/procedures/functions are not supported, then myduck shouldn't try to replicate them, of course some views may reference these functions and would fail to be created as well.

@aszenz
Copy link
Author

aszenz commented Jan 12, 2025

This is not supported at the moment. I recommend sending raw DuckDB queries via a PostgreSQL connection for now. However, this is a feature I have considered and would like to implement if it is a must-have for you. It will take some time, but it would be a valuable addition

This feature is not important for me, just nice to have, primary importance is to get views working, these views have queries with many joins and aggregations often written in non-strict sql (maybe we can use duckdb's GROUP BY ALL feature to support them).

@aszenz
Copy link
Author

aszenz commented Jan 12, 2025

Creating views via a MySQL connection should be supported, as we have test cases for it. Let me investigate what might be going wrong.

Okay looks like it works without brackets surrounding the view:

Works

CREATE OR REPLACE SQL SECURITY INVOKER VIEW log_report_count AS 
    SELECT
        report_code,
        report_name,
        report_count
    FROM report_logs
    WHERE report_count > 0

But this doesn't

CREATE OR REPLACE SQL SECURITY INVOKER VIEW log_report_count AS (
 -- same query
)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants