ORA-50000: Connection request timed out - On Async Operations #411
Comments
I ran regression tests against the driver versions available on nuget.org, and these are the results:
Something seems to have changed between versions 3.21.150 and 23.4.0 that broke async operations in my scenario. |
Prior to ODP.NET version 23.x, the async APIs called the sync behavior. Async APIs are part of the ADO.NET spec and are required of all ADO.NET providers, whether or not they actually implement async behavior. If async implementations do not exist, providers should map their async calls to the traditional sync behavior. ODP.NET 23.x started supporting a true async implementation. The difference in behavior between versions 21 and 23 is likely explained by this difference in implementation. With respect to the ORA-50000 error, async can open connections much faster than sync behavior. After all, that's the main reason developers want to use async: less latency. If opening connections is faster, it's possible to exhaust the connection pool faster. This could be hitting the maximum number of connections in Max Pool Size, requesting one more, and getting the timeout. Something like increasing the Max Pool Size value could resolve this. It could also be that your app creates connections faster than the DB can dispense them. The connection dispensation falls behind until finally one request hits the timeout. Something like increasing the Connect Timeout would provide more time for ODP.NET to get a connection dispensed. If you need to warm up the pool faster, you could set a higher Min Pool Size and/or Incr Pool Size. An ODP.NET trace would provide info on the timeout reason. |
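For readers following along, here is a minimal sketch of how the pool-related attributes mentioned above might appear in a connection string. The credentials, data source, and all numeric values are placeholders chosen for illustration, not recommendations from the thread, and the class name is hypothetical. It assumes the Oracle.ManagedDataAccess.Core package and a reachable database.

```csharp
using System;
using System.Threading.Tasks;
using Oracle.ManagedDataAccess.Client;

static class PoolSettingsSketch
{
    // Placeholder credentials and data source; replace with your own.
    // The pool sizes and timeout below are illustrative values only.
    const string ConnectionString =
        "User Id=app_user;Password=app_password;Data Source=dbhost:1521/service_name;" +
        "Min Pool Size=50;"   +   // connections kept ready in the pool
        "Max Pool Size=200;"  +   // upper bound before requests start waiting
        "Incr Pool Size=20;"  +   // connections added each time the pool grows
        "Connection Timeout=60";  // seconds to wait for a connection

    static async Task Main()
    {
        // Open one connection asynchronously and report its state.
        using var connection = new OracleConnection(ConnectionString);
        await connection.OpenAsync();
        Console.WriteLine($"Connection state: {connection.State}");
    }
}
```

Which attribute helps depends on where the bottleneck is: a larger Max Pool Size only matters if the pool is actually exhausted, while a longer timeout only buys time for a slow database.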
Okay, that makes sense, but even when I set Pooling=false in the connection string, the code you provided (https://github.com/oracle/dotnet-db-samples/blob/master/samples/async/async.cs) fails in the stress test scenario with "One or more errors occurred. (ORA-50000: Connection request timed out)". Our DBA tells us that no connection was established on the Oracle side; nothing leaves the app server. Again, this is with the sample you provided. Even with the configuration Min Pool Size=512;Max Pool Size=1000;Incr Pool Size=20;Decr Pool Size=10;Connection Timeout=60, no connection was opened on the server side and all of them receive ORA-50000: Connection request timed out. Again, this is with the same code provided in the examples.
Is there anything I can do to help troubleshoot the problem? |
If no connection is established, that could mean that the DB is so overloaded with requests that it can't even deliver one connection. That's really unusual. How many connection requests are you trying to make? If you're going to warm up the pool, you shouldn't immediately start requesting lots of connections. ODP.NET will start to populate your pool to the Min Pool Size after the first Open() call. The warm up period works like this:
Fundamentally, this is a performance and throughput issue, not a functional issue. Those types of issues depend on hardware, software, configuration, network resources, and the load. For example, if the DB is slow issuing connections, you get a timeout. In that case, you have to look for ways to speed up the DB so it can keep up with how quickly connection requests are coming in. If that's not possible, admins can set an Oracle listener rate limiter, which will slow down how quickly connection requests come in to prevent the DB from being swamped. Note that setting this rate limit could itself lead to timeouts. To start troubleshooting, generate an ODP.NET trace at level 7. The trace will have more ODP.NET status as each connection request comes in. |
The log is quite big; this is a section of what I got:
|
@WalterDias Can you email the full trace and all the trace files to dotnet_us(at)oracle.com? I see the trace jump 30 seconds from 2024-09-26 18:45:08.477994 to 2024-09-26 18:45:38.475726 with nothing written to the trace in between. Since it's async, activity could be occurring on different threads, which means that if you set TraceOption to 1, the other threads will be written to other trace files. |
How do I enable this logging? |
@CavidH To enable the log, just use the following statements:
I based my configuration on this: https://github.com/oracle/dotnet-db-samples/blob/master/samples/configuration-api/configuration-class.cs @alexkeh Indeed, the option was set to 1. I changed it to 0 as suggested, and here is the log: WEBTESTCONNECTION.DLL_PID_2615_DATE_2024_09_27_TIME_09_28_02_279799.zip |
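As a rough sketch of what enabling the trace discussed above can look like with the ODP.NET configuration API: the directory path and the class and method names are hypothetical, and the assumption here is that the settings are applied before the first connection is opened. Level 7 and TraceOption 0 (a single trace file) are the values mentioned in this thread.

```csharp
using Oracle.ManagedDataAccess.Client;

static class TraceSetupSketch
{
    // Call once at application startup, before opening any connection.
    public static void EnableOdpNetTrace()
    {
        // Hypothetical directory; it should exist and be writable by the app.
        OracleConfiguration.TraceFileLocation = "/tmp/odp-trace";

        // Level 7, as requested for troubleshooting in this thread.
        OracleConfiguration.TraceLevel = 7;

        // 0 writes a single trace file; 1 writes separate files per thread.
        OracleConfiguration.TraceOption = 0;
    }
}
```

Level 7 traces grow quickly under load, so this is best left on only for the duration of a test run.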
This has been very helpful to me, thank you. I should mention that I encountered this error while load testing my services (after async migration). I think that opening async connections happens too quickly and the Oracle TNS listener can't handle it. Maybe I’m wrong :) |
Thanks @WalterDias! We'll review the trace. @CavidH We'll see what the trace has. You can set an Oracle listener connection rate limit if you need to slow down the requests coming into the listener so that the DB is not overwhelmed by connection requests if that is the bottleneck. |
How exactly will it limit the connections? Will requests exceeding the limit wait, or will it throw an exception? |
Yes, the listener will prevent more than the defined rate of connections from coming in. If those connections have to wait beyond the timeout value, they will get a connection timeout error. If it's a problem, consider solutions similar to dealing with any other connection storm. Figure out where your bottleneck is and increase capacity for that bottleneck. For example, if the DB requires more CPUs to handle a load surge, increase CPUs in real time, which can be done with Oracle Autonomous Database. Another alternative is to spread the load so that connection opening isn't concentrated at one time. For example, if you know that all your users log in at 9 AM, create a larger pool of connections ready to dispense before 9 AM so that fewer new connections need to be created. This can be done with a high Min Pool Size and with a larger Incr Pool Size so that your pool can grow fast. |
Thank you for the valuable information |
Thanks for the support |
@alexkeh any thoughts about the problem? |
@WalterDias I haven't heard back from my dev team yet. We are planning to ship ODP.NET 23.6 soon. So, they are preoccupied with getting that release out. |
We've been able to reproduce this issue in house and are analyzing it further. Bug 37144163 has been filed to track the issue and eventual resolution. |
Thanks for the support, @alexkeh. I performed the same tests with 23.6 and the issue is still there. If you need any further tests, let me know. Regards. |
@WalterDias
This will create 512 worker threads available for creating the connections. You can adjust this level up or down based on the number of users you're creating in rapid succession. In our tests, when we increased the thread pool, it resolved the timeout problem. If it does the same for you, then we can be fairly sure the lack of threads is the problem in your case as well. |
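A minimal sketch of the kind of thread pool adjustment described above, using the standard `ThreadPool.SetMinThreads` API. The helper name is hypothetical, 512 is simply the figure quoted in this thread, and leaving the I/O completion-port minimum unchanged is an assumption of this sketch rather than something the thread specifies.

```csharp
using System;
using System.Threading;

static class ThreadPoolWarmup
{
    // Raises the minimum number of worker threads to at least minWorkers,
    // keeping the current I/O completion-port minimum as it is.
    // Returns false if the runtime rejects the requested value.
    public static bool RaiseMinWorkerThreads(int minWorkers)
    {
        ThreadPool.GetMinThreads(out int currentWorkers, out int currentIo);
        if (minWorkers <= currentWorkers)
        {
            return true; // the minimum is already high enough
        }
        return ThreadPool.SetMinThreads(minWorkers, currentIo);
    }

    static void Main()
    {
        // 512 is the value discussed in this thread; tune it to your burst size.
        bool applied = RaiseMinWorkerThreads(512);
        ThreadPool.GetMinThreads(out int workers, out int io);
        Console.WriteLine($"applied={applied} minWorkers={workers} minIo={io}");
    }
}
```

This would normally run once at startup, before the burst of `OpenAsync` calls. A higher minimum lets the pool create threads on demand without its usual throttling, which is the trade-off the Microsoft documentation warns about.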
Hi @alexkeh, I ran the test and it works fine, but as you can see in the video, it took 7 seconds for responses to start coming in (#412). My only concern about changing this value from the default is the unknown impact mentioned at https://learn.microsoft.com/en-us/dotnet/api/system.threading.threadpool.setminthreads?view=net-8.0 2024-10-08.10-48-18.mp4 Feel free to reach out for any further information. All the tests were performed with version 23.6.0. Driver log from the tests: this file is compressed with 7z, but since GitHub only allows .zip, it was compressed twice, lol. Uploading trace.7z.zip… |
@WalterDias That's interesting! Thanks for trying it out. The link to the 7z.zip isn't working. Can you fix it? Or you can email the trace to dotnet_us(at)oracle.com if it's not too big to send via email. |
@alexkeh The file is a zip, and inside it there is a 7z file. |
@WalterDias The link itself is not valid. There is no download for me to look at. The link leads to an invalid GitHub page: https://github.com/oracle/dotnet-db-samples/issues/Logs |
@WalterDias I reviewed your trace. The overall test run took just over 11 seconds using ODP.NET. As I scroll through the trace, I don't see any single operation with a 7-second delay like the one in issue #412. In that issue, we see that when ODP.NET tries the DB IP, it takes over 7 seconds. For the same operation here, ODP.NET executes quickly:
|
Okay @alexkeh, I will add SetMinThreads as a workaround as suggested and wait for the fix. I did this stress test to simulate what occasionally happens in our production environment: we are a retailer with more than 80 stores and an online marketplace, so we have some bursts. |
There will be a new release (23.6.1) that will fix this bug. |
We are using In this case |
We should have ODP.NET 23.6.1 available shortly to likely resolve this scalability issue. |
Hello everyone.
I'm running into an ORA-50000: Connection request timed out error when using async operations with Oracle.ManagedDataAccess.Core version 23.5.1 on Docker Desktop 4.34.2 (167172), using the .NET 8 image (mcr.microsoft.com/dotnet/aspnet:8.0) and Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Version 19.22.0.0.0.
My code is based on the example https://github.com/oracle/dotnet-db-samples/blob/master/samples/async/async.cs but without the Thread.Sleep(1000); present on line 26.
Below is the code I use; just change the parameters User, Password, Host, and Database:
In stress test scenarios that I wrote in k6, ORA-50000: Connection request timed out is raised. The same does not happen when I use the synchronous operations.
This is the k6 script that I use, run with the following command: k6 run .\script-orders.js -u 512 -d 60s