mydumper OOMs in deterministic fashion for large database #8718

Closed
timsehn opened this issue Jan 7, 2025 · 2 comments · Fixed by dolthub/go-mysql-server#2811
Labels
bug Something isn't working integrations Issues with tools connecting to/querying Dolt

Comments

@timsehn
Contributor

timsehn commented Jan 7, 2025

Repro:

  1. Clone: https://www.dolthub.com/repositories/dolthub/transparency-in-pricing
  2. cd transparency-in-pricing
  3. Make this config.yaml
$ cat config.yaml 
log_level: info

behavior:
  read_only: false
  autocommit: true

user:
  name: root
  password: ""

listener:
  host: localhost
  port: 4000
  max_connections: 100
  read_timeout_millis: 28800000
  write_timeout_millis: 28800000
  tls_key: "/Users/timsehn/dolthub/git/dolt/go/libraries/doltcore/servercfg/testdata/chain_key.pem"	
  tls_cert: "/Users/timsehn/dolthub/git/dolt/go/libraries/doltcore/servercfg/testdata/chain_cert.pem"
  require_secure_transport: true
  4. $ dolt sql-server --config=config.yaml

  5. In another shell run: mydumper -h 127.0.0.1 -P 4000 -u root -p '' --protocol tcp --database transparency-in-pricing

I've run it three times and got the following errors at different positions:

** (mydumper:20378): CRITICAL **: 14:01:17.311: Thread 1: Could not read data from transparency-in-pricing.rate to write on export-20250106-133614/transparency-in-pricing.rate.00000.sql at byte 62970454016: TLS/SSL error: unexpected eof while reading
** (mydumper:83008): CRITICAL **: 15:45:33.797: Thread 3: Could not read data from transparency-in-pricing.rate to write on export-20250106-153337/transparency-in-pricing.rate.00000.sql at byte 36137476096: TLS/SSL error: unexpected eof while reading
** (mydumper:90304): CRITICAL **: 15:58:41.096: Thread 4: Could not read data from transparency-in-pricing.rate to write on export-20250106-154712/transparency-in-pricing.rate.00000.sql at byte 36305829888: TLS/SSL error: unexpected eof while reading
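
For scale, the byte offsets in the three errors above can be converted to GiB. This is only a small arithmetic sketch; `toGiB` is a hypothetical helper, and the offsets are positions in mydumper's output file, not server memory figures.

```go
package main

import "fmt"

// toGiB converts a byte count to gibibytes (2^30 bytes).
// Hypothetical helper for illustration; not part of mydumper or Dolt.
func toGiB(n int64) float64 {
	return float64(n) / (1 << 30)
}

func main() {
	// Output-file byte offsets copied from the three mydumper errors.
	offsets := []int64{62970454016, 36137476096, 36305829888}
	for i, off := range offsets {
		fmt.Printf("run %d: failed after writing %.1f GiB\n", i+1, toGiB(off))
	}
}
```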

All three runs failed at different file positions. The server also crashed:

INFO[0025] ConnectionClosed                              connectionID=4
INFO[0025] ConnectionClosed                              connectionID=5
INFO[0025] ConnectionClosed                              connectionID=3
zsh: killed     dolt sql-server --config=config.yaml
@timsehn timsehn added bug Something isn't working integrations Issues with tools connecting to/querying Dolt labels Jan 7, 2025
@timsehn
Contributor Author

timsehn commented Jan 7, 2025

This is almost certainly an OOM error. We need to turn trace on and debug where the leak is.

I was watching my resource consumption. It held pretty steady at about 55GB and then all of a sudden spiked to over 100GB of RAM used.

Here's a sample of the trace log before it died. Nothing strange. We must not be freeing memory when we should:

TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620200445531048052) NULL VARCHAR("COPANLISIB 1(60)MG/4ML INJ") NULL NULL NULL NULL NULL NULL VARCHAR("J9057") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("MedStar Carefirst PPO Negotiated Charge") NULL DECIMAL(124.22) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620205148627744493) NULL VARCHAR("REAMER CANN PROX 22MM") VARCHAR("0272") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("United Healthcare Medicare Advantage Negotiated Charge") NULL DECIMAL(1046.90) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620208572974097406) NULL VARCHAR("WRENCH COMBINATION 7MM") VARCHAR("0272") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("Amerihealth Alliance Negotiated Charge") NULL DECIMAL(927.66) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620213460660554911) NULL VARCHAR("STEM FEM POR REG 12.5X145MM") NULL NULL NULL NULL NULL NULL VARCHAR("C1776") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("Amerihealth Negotiated Charge") NULL DECIMAL(3013.68) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620213477318834680) NULL VARCHAR("DRL IJS-E CANND DIS CUT 2.7X70") VARCHAR("0272") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("Tricare for Life Negotiated Charge") NULL DECIMAL(1111.36) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620216121916708997) NULL VARCHAR("MATRIX STRIP CONDUCT 10CC") VARCHAR("0278") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("United Healthcare All Savers Health Plan Negotiated Charge") NULL DECIMAL(2867.97) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620216452006148517) NULL VARCHAR("NAIL HUM CANN TI STER 9X220MM") NULL NULL NULL NULL NULL NULL VARCHAR("C1713") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("Blue Cross Blue Shield National Account Service Company Negotiated Charge") NULL DECIMAL(2834.72) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620219700915300342) NULL VARCHAR("IMPLANT ARIA 0DE 18X40 8MM") NULL NULL NULL NULL NULL NULL VARCHAR("C1713") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("HSCSN Negotiated Charge") NULL DECIMAL(5389.79) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620221142958936535) NULL VARCHAR("NAIL TIB HOW T2 STD 11X300") NULL NULL NULL NULL NULL NULL VARCHAR("C1713") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("Kaiser Medicare Advantage Negotiated Charge") NULL DECIMAL(3226.20) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620226130529402637) NULL VARCHAR("DISC ALLOGRAFT PARTIAL 5X75MM") NULL NULL NULL NULL NULL NULL VARCHAR("C1713") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("United Healthcare Medicare Advantage Negotiated Charge") NULL DECIMAL(5159.54) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620226498604343355) NULL VARCHAR("STENT GFTMASTER REX 4.0X19MM") NULL NULL NULL NULL NULL NULL VARCHAR("C1874") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("Tricare for Life Negotiated Charge") NULL DECIMAL(3700.85) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620227971173907830) NULL VARCHAR("INSERT KNEE SZ6 12MM LT EMPWR") NULL NULL NULL NULL NULL NULL VARCHAR("C1776") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("ChoiceCare Negotiated Charge") NULL DECIMAL(3985.69) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
TRAC[10826] spooling result row [VARCHAR("210015") UINT64(14620228003566537260) NULL VARCHAR("PEG SCR DEPU 2.5X28MM") NULL NULL NULL NULL NULL NULL VARCHAR("C1713") NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL ENUM("negotiated") VARCHAR("Cigna Point of Service  Negotiated Charge") NULL DECIMAL(241.87) NULL NULL NULL NULL]  connectTime="2025-01-06 16:05:19.340031 -0800 PST m=+5.388440792" connectionDb=transparency-in-pricing connectionID=2 query="SELECT /*!40001 SQL_NO_CACHE */ * FROM `transparency-in-pricing`.`rate` "
zsh: killed     dolt sql-server --config=config.yaml

@timsehn timsehn changed the title from "mydumper fails in deterministic fashion for large database" to "mydumper OOMs in deterministic fashion for large database" Jan 7, 2025
@jycor
Contributor

jycor commented Jan 10, 2025

Large result sets were not being cleared from the buffer as those rows were being spooled. The linked PR, dolthub/go-mysql-server#2811, fixes that.
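
To illustrate the kind of failure described above, here is a minimal sketch and not the actual go-mysql-server code. `Row`, `spool`, and `generate` are hypothetical names, and the batch size is an arbitrary assumption. It compares a spooler that keeps every row in its buffer after sending it with one that clears the buffer after each flush, and reports the peak number of rows held in each case.

```go
package main

import "fmt"

// Row is a stand-in for one result row (hypothetical type for this sketch).
type Row []string

// spool sends rows to the client in batches of batchSize and returns the
// largest number of rows the buffer held at any one time. With reset=false
// the buffer is never cleared after a flush, so it grows with the result set.
// send must not retain the slice it is given.
func spool(rows <-chan Row, batchSize int, reset bool, send func([]Row)) (peak int) {
	buf := make([]Row, 0, batchSize)
	sent := 0 // index of the first row in buf that has not been sent yet
	for r := range rows {
		buf = append(buf, r)
		if len(buf) > peak {
			peak = len(buf)
		}
		if len(buf)-sent < batchSize {
			continue
		}
		send(buf[sent:])
		if reset {
			// Drop the row references so the garbage collector can free
			// them, then reuse the same backing array.
			for i := range buf {
				buf[i] = nil
			}
			buf, sent = buf[:0], 0
		} else {
			sent = len(buf) // rows already sent stay referenced by buf
		}
	}
	if len(buf) > sent {
		send(buf[sent:]) // flush the final partial batch
	}
	return peak
}

// generate produces n small rows on a channel, simulating a table scan.
func generate(n int) <-chan Row {
	ch := make(chan Row)
	go func() {
		defer close(ch)
		for i := 0; i < n; i++ {
			ch <- Row{fmt.Sprintf("row-%d", i)}
		}
	}()
	return ch
}

func main() {
	const total, batch = 1000, 128
	count := 0
	send := func(b []Row) { count += len(b) }
	leaky := spool(generate(total), batch, false, send)
	fixed := spool(generate(total), batch, true, send)
	fmt.Printf("rows sent: %d, peak buffered without reset: %d, with reset: %d\n",
		count, leaky, fixed)
}
```

Without the reset, peak buffered rows equals the size of the whole result set; with it, the peak is bounded by the batch size regardless of how large the table is.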
