Bug? Same message is received over and over #1967
Comments
I debugged it. The workaround is to set:
This sounds like a serious bug.
My colleague reported something similar during debugging, but that could have different causes, like timeouts due to debugger pauses. I don't know any other way to reproduce it without high load. The setup is pretty simple: 3 virtual actor types, but only one is actually used. This actor has 3000 instances, and requests are spread among them. The actor handles two types of requests: place order and cancel order. The benchmark sends up to 256 requests simultaneously without waiting for results (the number in flight is capped at 256). My two colleagues were able to reproduce it using the same code.
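For illustration, a minimal sketch of that load pattern in plain .NET (not the actual benchmark code; the identity scheme, loop count, and `SendAsync` are placeholders, while the cap of 256 and the 3000 instances come from the description above):

```csharp
using System.Threading;
using System.Threading.Tasks;

class LoadPatternSketch
{
    static async Task Main()
    {
        // At most 256 requests in flight at once, as described above.
        var inFlight = new SemaphoreSlim(256);

        for (var i = 0; i < 1_000_000; i++)
        {
            await inFlight.WaitAsync();

            // Requests are spread across 3000 virtual actor instances.
            var identity = $"order-actor-{i % 3000}";

            // Fire and forget: the result is not awaited inline.
            _ = SendAsync(identity).ContinueWith(_ => inFlight.Release());
        }
    }

    // Stand-in for the real cluster request, e.g. something like
    // cluster.RequestAsync<PlaceOrderResponse>(identity, "OrderActor", msg, ct).
    static Task SendAsync(string identity) => Task.Delay(1);
}
```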
Many thanks for the report. I'm still trying to figure out what causes this, and whether it is inside the shared-future implementation itself.
I am not sure how much it relates to this, but we get these exceptions every now and then on our production servers:
@rogeralsing I hope the call stacks help in some way. We have been ignoring these, as they didn't result in messages being processed again and again for us. One thing to note is that we get these exceptions even though our per-actor throughput doesn't exceed 200 msgs/sec.
cc @mhelleborg
@AqlaSolutions Could you share a reproducing example? Trying to reproduce with just high load on shared futures has come up empty here, no issues.
@mhelleborg I'm not sure, because of the NDA, you know... Also, it will take some time to prepare and minimize it. I will let you know if I get permission.
I wouldn't need your internal project, but a minimal reproducing example would be a great help.
I am having a similar problem. The OnReceive method keeps getting the same message over and over again if I send the message as follows:
However, if I send it with the MethodIndex, the problem does not occur:
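For reference, a hypothetical reconstruction of the two send styles being contrasted (the identity "order-1", kind "OrderActor", and method index 0 are placeholders; this assumes the GrainRequestMessage wrapper that Proto.Cluster's generated grain clients use internally):

```csharp
using System.Threading;
using System.Threading.Tasks;
using Google.Protobuf;
using Proto.Cluster;

static class SendStyles
{
    // Style 1: send the protobuf request message directly.
    // This is the variant that reportedly makes OnReceive fire repeatedly.
    public static Task<TResponse> SendRaw<TResponse>(
        Cluster cluster, IMessage request, CancellationToken ct) =>
        cluster.RequestAsync<TResponse>("order-1", "OrderActor", request, ct);

    // Style 2: wrap the request in GrainRequestMessage with the method index,
    // the way the generated grain clients do. Reportedly this variant does
    // not show the problem. The index 0 is a placeholder.
    public static Task<TResponse> SendWithMethodIndex<TResponse>(
        Cluster cluster, IMessage request, CancellationToken ct) =>
        cluster.RequestAsync<TResponse>(
            "order-1", "OrderActor", new GrainRequestMessage(0, request), ct);
}
```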
The source code to reproduce the problem described by @AqlaSolutions is in the attachment.
Just run the benchmarks/PrototypeBenchmark project from the BEP.sln solution, without a debugger and in the Release configuration. After a few minutes (it may require several runs) you will see the console message:
I'm running the example right now, and the first thing that comes to mind is that you are probably queueing up a lot of fire-and-forget tasks on the thread pool. The increasing latency might be because the thread pool is busy with other tasks. Eventually, the entire thread-pool queue might be filled with this kind of task. I'll dig deeper later today, but the increasing latency is very suspicious.
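A self-contained illustration of that hypothesis (plain .NET, nothing Proto-specific): once the pool's queue is full of fire-and-forget work items, even a trivial new work item takes much longer to start.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class ThreadPoolSaturation
{
    static async Task Main()
    {
        // Baseline: how quickly does a single work item start on an idle pool?
        var sw = Stopwatch.StartNew();
        await Task.Run(() => { });
        Console.WriteLine($"Idle pool, queue-to-start: {sw.ElapsedMilliseconds} ms");

        // Flood the pool with cheap fire-and-forget tasks.
        for (var i = 0; i < 200_000; i++)
            _ = Task.Run(() => Thread.SpinWait(50_000));

        // The same trivial work item now waits behind the flood.
        sw.Restart();
        await Task.Run(() => { });
        Console.WriteLine($"Flooded pool, queue-to-start: {sw.ElapsedMilliseconds} ms");
    }
}
```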
@rogeralsing In this issue's repro we don't have
yes 👍🏻
We have a strange issue where, under load above 200K RPS, one of the grain instances starts receiving the same request over and over. I'm sure that we don't send this request multiple times. And of course, responses from such repeated requests never reach the original request sender. We run everything on the same machine, without remoting. The issue usually appears after a few minutes of load when I use the Release configuration and run without debugging (though I attach a debugger later, after the issue reproduces).
Can you suggest where to start debugging it?
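One way to start, sketched below, is to confirm the duplication from inside the actor itself; this sketch assumes each request carries a unique id (PlaceOrder and RequestId are hypothetical names standing in for the real request type):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Proto;

// Hypothetical request type carrying a unique id per logical request.
record PlaceOrder(string RequestId);

// Logs whenever the same logical request is delivered more than once.
class DuplicateDetectingActor : IActor
{
    private readonly HashSet<string> _seen = new();

    public Task ReceiveAsync(IContext context)
    {
        if (context.Message is PlaceOrder order && !_seen.Add(order.RequestId))
            Console.WriteLine($"Duplicate delivery: {order.RequestId}");

        return Task.CompletedTask;
    }
}
```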