Summary
Uploaded file chunks are being routed to different pods, which gives us invalid chunk number errors.
If we have autoscaling, does this mean that each of those 10 chunks gets sent to a different pod?
What is the current bug behavior?
When you upload files to WIPP, the file is cut into 1MB chunks rather than sent as a single continuous stream.
So for a 10MB file, 10 (1MB) chunks are sent to the backend.
Nginx is routing chunks of the same file to different pods.
The attached logs show invalid flow chunk errors.
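For context on the routing: if the nginx in front of the backend is the ingress-nginx controller, cookie-based session affinity is one common way to keep all chunks of a given upload going to the same pod. The host, service name, and port below are hypothetical placeholders, not the actual WIPP ingress definition.

```yaml
# Sketch only: cookie-based session affinity with ingress-nginx, so that all
# chunk requests from one browser session land on the same backend pod.
# Host, service name, and port are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: wipp-backend                 # hypothetical name
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/affinity-mode: "persistent"
    nginx.ingress.kubernetes.io/session-cookie-name: "WIPP_AFFINITY"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"
spec:
  rules:
    - host: wipp.example.org         # placeholder host
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: wipp-backend   # placeholder service name
                port:
                  number: 8080       # placeholder port
```

If a plain nginx reverse proxy (rather than ingress-nginx) is doing the routing, the equivalent would be a hash-based upstream (e.g. ip_hash) in its configuration.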
What is the expected correct behavior?
No chunk errors should occur when the file conversion happens.
Steps to reproduce
@Nicholas-Schaub uploading 1500 images
Kubernetes deployment with 5 replicas running, scaled using the Horizontal Pod Autoscaler (HPA)
Min pods: 1, max pods: 5
CPU requests: 1
CPU limits: 2
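A minimal sketch of the scaling setup described above, assuming a standard Deployment plus HPA. Only the replica bounds, the CPU requests/limits, and the image tag come from this report; the object names and the CPU utilization target are placeholders.

```yaml
# Sketch of the described setup; names and the utilization target are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wipp-backend                 # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wipp-backend
  template:
    metadata:
      labels:
        app: wipp-backend
    spec:
      containers:
        - name: wipp-backend
          image: labshare/wipp-backend:3.0.0-generic
          resources:
            requests:
              cpu: "1"               # CPU requests: 1
            limits:
              cpu: "2"               # CPU limits: 2
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: wipp-backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: wipp-backend
  minReplicas: 1                     # Min pods 1
  maxReplicas: 5                     # Max pods 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80     # assumed target, not from the thread
```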
Relevant screenshots and/or logs
pod1 was running initially, and pod2 & pod3 were started by the autoscaling activity.
pod1.txt
pod2.txt
pod3.txt
Environment info
labshare/wipp-backend:3.0.0-generic
Possible fixes
Not sure.
cc: @Nicholas-Schaub
Hi @tejavegesna, is it still happening after the 2 additional pods have been running for a while? Or when you were hard-coding the number of replicas (as opposed to autoscaling)?
Looking at the timestamps in the logs, I am wondering if it might be an issue with the readiness of the pod/app (since we don't have a readiness probe for wipp-backend, the pod might be marked ready before the app actually is). So some of the chunks would have been sent to pods 2 and 3 right when the autoscaling kicked in (before they were actually ready to receive them), and that would mess up the whole chunk registration and image conversion.
Not saying this is the only issue here, but I just wanted to check that first, if you get a chance to test it.
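A readiness probe along those lines could look roughly like the pod-spec fragment below; the endpoint path, port, and timings are assumptions and would need to match whatever health endpoint wipp-backend actually exposes.

```yaml
# Sketch only: container snippet with a readiness probe, so the pod is not added
# to the Service endpoints until the app can actually accept chunk uploads.
# The endpoint path, port, and timings are assumptions.
containers:
  - name: wipp-backend
    image: labshare/wipp-backend:3.0.0-generic
    readinessProbe:
      httpGet:
        path: /api          # placeholder; use whatever health endpoint the app exposes
        port: 8080          # placeholder port
      initialDelaySeconds: 30
      periodSeconds: 10
      failureThreshold: 3
```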
@tejavegesna thanks for testing. I checked the backend chunk upload code and there is a ConcurrentMap there that I am afraid might not be playing well with the pod replication... but I will investigate a bit more to make sure this is the issue here. In the meantime, can you guys go back to 1 for the number of replicas and do some scaling with the ome.converter.threads value?
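A minimal sketch of that experiment on the deployment side, assuming the property can be passed through JAVA_OPTS as a JVM system property; check how the actual wipp-backend deployment injects configuration, and note that the HPA would need to be removed or capped so it does not scale the Deployment back up.

```yaml
# Sketch only: pin the Deployment to a single replica and pass an
# ome.converter.threads value to the app. Whether the image honors JAVA_OPTS
# is an assumption; adapt to how the real wipp-backend manifests set config.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wipp-backend               # hypothetical name
spec:
  replicas: 1                      # single replica while testing
  selector:
    matchLabels:
      app: wipp-backend
  template:
    metadata:
      labels:
        app: wipp-backend
    spec:
      containers:
        - name: wipp-backend
          image: labshare/wipp-backend:3.0.0-generic
          env:
            - name: JAVA_OPTS                        # assumption: image honors JAVA_OPTS
              value: "-Dome.converter.threads=4"     # example value
```

The idea is to keep all chunks on a single pod (so the in-memory chunk map sees every chunk of a file) while scaling the conversion work through ome.converter.threads instead of through replicas.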