Help: in 2.0.x, multi-table routine load tasks do not finish normally #24958
PliskinZhang
started this conversation in
General
Replies: 2 comments 1 reply
-
@PliskinZhang Please refer to #25056
-
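I tried both 2.0.0 and 2.0.1. When the routine load job is in simple single-table mode, it runs normally.
In multi-table mode, however, the corresponding task never writes a finish log, while new tasks keep being submitted. The threads are quickly exhausted and "too many tasks" is reported.
At the beginning a little data was still written, but later nothing is written at all.
After I pause the job, the logs of the other, normal jobs still show that the pool remains occupied and is not released.
In short: a multi-table routine load task has not finished when the next task is submitted, so "current tasks num" keeps growing until the log finally reports too many tasks.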
normal routine load task log:
I0927 14:12:39.462114 351714 routine_load_task_executor.cpp:267] submit a new routine load task: id=5f31ac24344e4c47-abfd3f3efe9b2c58, job_id=160686, txn_id=82139, label=health-160686-5f31ac24344e4c47-abfd3f3efe9b2c58-82139, elapse(s)=0, current tasks num: 2
I0927 14:12:39.462155 159997 routine_load_task_executor.cpp:285] begin to execute routine load task: id=5f31ac24344e4c47-abfd3f3efe9b2c58, job_id=160686, txn_id=82139, label=health-160686-5f31ac24344e4c47-abfd3f3efe9b2c58-82139, elapse(s)=0
I0927 14:12:39.462463 159997 stream_load_executor.cpp:71] begin to execute job. label=health-160686-5f31ac24344e4c47-abfd3f3efe9b2c58-82139, txn_id=82139, query_id=5f31ac24344e4c47-abfd3f3efe9b2c58
I0927 14:12:39.462525 159997 fragment_mgr.cpp:689] query_id: 5f31ac24344e4c47-abfd3f3efe9b2c58 coord_addr TNetworkAddress(hostname=192.168.1.37, port=9020) total fragment num on current host: 0
I0927 14:12:39.462541 159997 fragment_mgr.cpp:758] Register query/load memory tracker, query/load id: 5f31ac24344e4c47-abfd3f3efe9b2c58 limit: 2.00 GB
I0927 14:12:39.463045 159997 data_consumer_group.cpp:111] start consumer group: 0e4573b9099d6821-ea027b26758a5bbd. max time(ms): 60000, batch rows: 200000, batch size: 209715200. id=5f31ac24344e4c47-abfd3f3efe9b2c58, job_id=160686, txn_id=82139, label=health-160686-5f31ac24344e4c47-abfd3f3efe9b2c58-82139, elapse(s)=0
I0927 14:12:39.463085 159961 fragment_mgr.cpp:528] PlanFragmentExecutor::_exec_actual|query_id=5f31ac24344e4c47-abfd3f3efe9b2c58|instance_id=5f31ac24344e4c47-abfd3f3efe9b2c59|pthread_id=140232152180480
I0927 14:12:39.463495 160763 tablets_channel.cpp:103] open tablets channel: (load_id=5f31ac24344e4c47-abfd3f3efe9b2c58, index_id=160425), tablets num: 128, timeout(s): 120
I0927 14:13:40.305856 159997 data_consumer_group.cpp:131] consumer group done: 0e4573b9099d6821-ea027b26758a5bbd. consume time(ms)=60842, received rows=30, received bytes=3211, eos: 1, left_time: -842, left_rows: 199970, left_bytes: 209711989, blocking get time(us): 60842733, blocking put time(us): 2, id=5f31ac24344e4c47-abfd3f3efe9b2c58, job_id=160686, txn_id=82139, label=health-160686-5f31ac24344e4c47-abfd3f3efe9b2c58-82139, elapse(s)=60
I0927 14:13:40.306563 159961 vtablet_sink.cpp:894] VNodeChannel[160425-130002], load_id=5f31ac24344e4c47-abfd3f3efe9b2c58, txn_id=82139, node=192.168.1.37:8060 mark closed, left pending batch size: 1
I0927 14:13:40.307304 160596 tablets_channel.cpp:145] close tablets channel: (load_id=5f31ac24344e4c47-abfd3f3efe9b2c58, index_id=160425), sender id: 0, backend id: 130002
I0927 14:13:40.307919 161034 vtablet_sink.cpp:1121] all node channels are stopped(maybe finished/offending/cancelled), sender thread exit. 5f31ac24344e4c47-abfd3f3efe9b2c58
I0927 14:13:40.324090 160596 load_channel.cpp:46] load channel removed. mem peak usage=0, info=label: LoadChannel#senderIp=192.168.1.37#loadID=5f31ac24344e4c47-abfd3f3efe9b2c58; consumption: 0; peak_consumption: 0; , load_id=5f31ac24344e4c47-abfd3f3efe9b2c58, is high priority=1, sender_ip=192.168.1.37
I0927 14:13:40.325825 159961 vtablet_sink.cpp:1528] total mem_exceeded_block_ns=0, total queue_push_lock_ns=0, total actual_consume_ns=115047, load id=5f31ac24344e4c47-abfd3f3efe9b2c58
I0927 14:13:40.325846 159961 vtablet_sink.cpp:1572] finished to close olap table sink. load_id=5f31ac24344e4c47-abfd3f3efe9b2c58, txn_id=82139, node add batch time(ms)/wait execution time(ms)/close time(ms)/num: {130002:(17)(0)(19)(1)}
I0927 14:13:40.326172 159961 query_context.h:69] Deregister query/load memory tracker, queryId=5f31ac24344e4c47-abfd3f3efe9b2c58, Limit=2.00 GB, CurrUsed=-24.48 KB, PeakUsed=25.31 MB
I0927 14:13:40.365257 159997 routine_load_task_executor.cpp:257] finished routine load task id=5f31ac24344e4c47-abfd3f3efe9b2c58, job_id=160686, txn_id=82139, label=health-160686-5f31ac24344e4c47-abfd3f3efe9b2c58-82139, elapse(s)=60, status: [OK], current tasks num: 2
multi-table routine load task log:
I0927 14:21:22.827615 351714 routine_load_task_executor.cpp:267] submit a new routine load task: id=71ae9956abab417a-a9d80e07f0150f21, job_id=161003, txn_id=82149, label=perf-161003-71ae9956abab417a-a9d80e07f0150f21-82149, elapse(s)=0, current tasks num: 4
I0927 14:21:22.827625 159997 routine_load_task_executor.cpp:285] begin to execute routine load task: id=71ae9956abab417a-a9d80e07f0150f21, job_id=161003, txn_id=82149, label=perf-161003-71ae9956abab417a-a9d80e07f0150f21-82149, elapse(s)=0
I0927 14:21:22.828194 159997 routine_load_task_executor.cpp:296] recv single-stream-multi-table request, ctx=id=71ae9956abab417a-a9d80e07f0150f21, job_id=161003, txn_id=82149, label=perf-161003-71ae9956abab417a-a9d80e07f0150f21-82149, elapse(s)=0
I0927 14:21:22.828397 159997 data_consumer_group.cpp:111] start consumer group: c74840f8c925b3cf-de3a780dee8213a6. max time(ms): 30000, batch rows: 200000, batch size: 209715200. id=71ae9956abab417a-a9d80e07f0150f21, job_id=161003, txn_id=82149, label=perf-161003-71ae9956abab417a-a9d80e07f0150f21-82149, elapse(s)=0
I0927 14:21:52.831889 159997 data_consumer_group.cpp:131] consumer group done: c74840f8c925b3cf-de3a780dee8213a6. consume time(ms)=30003, received rows=0, received bytes=0, eos: 1, left_time: -3, left_rows: 200000, left_bytes: 209715200, blocking get time(us): 30003469, blocking put time(us): 0, id=71ae9956abab417a-a9d80e07f0150f21, job_id=161003, txn_id=82149, label=perf-161003-71ae9956abab417a-a9d80e07f0150f21-82149, elapse(s)=30
I searched the log by task id. There appears to be no further output for this task after "consumer group done".
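For anyone who wants to check the same thing on their own BE log, here is a minimal Python sketch that lists task ids with a "submit a new routine load task" line but no matching "finished routine load task" line. The two message patterns are copied from the log excerpts above. The function name `unfinished_tasks` and the log file name in the usage note are my own assumptions, not part of Doris.

```python
import re

# Patterns taken from the routine_load_task_executor.cpp lines quoted above.
SUBMIT = re.compile(r"submit a new routine load task: id=([0-9a-f]+-[0-9a-f]+)")
FINISH = re.compile(r"finished routine load task id=([0-9a-f]+-[0-9a-f]+)")


def unfinished_tasks(lines):
    """Return ids of tasks that were submitted but never logged as finished.

    Hypothetical helper for illustration only; ids are returned in
    submission order.
    """
    pending = {}  # task id -> submit line (dicts keep insertion order)
    for line in lines:
        match = SUBMIT.search(line)
        if match:
            pending[match.group(1)] = line
            continue
        match = FINISH.search(line)
        if match:
            # A finish line clears the task; unknown ids are ignored.
            pending.pop(match.group(1), None)
    return list(pending)
```

Usage, assuming the BE log is in a file called `be.INFO` (adjust the path for your deployment): `print(unfinished_tasks(open("be.INFO", errors="replace")))`. Tasks that are still legitimately running at the end of the file will also show up, so compare the result against each task's "max time(ms)" before treating an id as stuck.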