Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ORA-50000: Connection request timed out - On Async Operations #411

Open
WalterDias opened this issue Sep 25, 2024 · 28 comments
Open

ORA-50000: Connection request timed out - On Async Operations #411

WalterDias opened this issue Sep 25, 2024 · 28 comments
Labels

Comments

@WalterDias
Copy link

WalterDias commented Sep 25, 2024

Hello every one.

I'm running into a ORA-50000: Connection request timed out when use Async operations on Oracle.ManagedDataAccess.Core Version 23.5.1. o Docker Desktop 4.34.2 (167172) using .Net 8 Image (mcr.microsoft.com/dotnet/aspnet:8.0) and Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Version 19.22.0.0.0

My code was based on the example https://github.com/oracle/dotnet-db-samples/blob/master/samples/async/async.cs but without Thread.Sleep(1000); present on line 26

Below is the code I use, just change the parameters User, Password, Host, Database:

 ConnectionString = "User Id=<User>;Password=<Password>;Data Source=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=<Host>)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=<Database>)));";
 using (var connection = new OracleConnection(ConnectionString))
 {
     Task task = connection.OpenAsync();

     using (OracleCommand cmd = new OracleCommand(query, connection))
     {
         await task;
         using (var rdr = await cmd.ExecuteReaderAsync())
         {
             while (await rdr.ReadAsync()) { }
         }
     }
 } 

In stress test scenarios that I wrote in K6, ORA-50000: Connection request timed out is raised. The same does not happen when I use non-asynchronous ones.

k6 script that I use, with the following command: k6 run .\script-orders.js -u 512 -d 60s

import http from 'k6/http';
import { check } from 'k6';

export default function () {
    const url = 'https://localhost:32777/weatherforecast';
  const params = {
    headers: {
      'accept': 'application/json'
      },
  };
  
  const res = http.get(url, params);

  check(res, {
    'status was 200': (r) => r.status === 200,
  });
}
@WalterDias
Copy link
Author

I did some regression on the drivers present on the nuger.org and these are the results:

Version Status
23.5.1 Fails
23.5.0 Fails
23.4.0 Fails
3.21.150 works

Something seems to change from version 3.21.150 to 23.4.0 that broke async operations on my cenario

@alexkeh
Copy link
Member

alexkeh commented Sep 25, 2024

Prior to ODP.NET version 23.x, the async APIs called the sync behavior. Async APIs are part of the ADO.NET spec and are required of all ADO.NET providers whether they implement actually async behavior or not. If async implementations did not exist, then providers should map their async calls to the traditional sync behavior.

In ODP.NET 23.x started supported a true async implementation. The difference in behavior between versions 21 and 23 are likely explained by this difference in implementation.

With respect to the ORA-50000 error, async can open connections much faster than sync behavior. After all, that's main reason developers want to use async: less latency.

If opening connections is faster, it's possible to exhaust the connection pool faster. This could be hitting the maximum number of connections in Max Pool Size, requesting one more, and getting the timeout. Something like increasing the Max Pool Size value could resolve this.

It could be your app creates connections faster than the DB can dispense them. The connection dispensation falls behind until finally one request hits the timeout. Something like increasing the Connect Timeout would provide more time for ODP.NET to get a connection dispensed.

If you need to warm up the pool faster, you could set a higher Min Pool Size and/or Incr Pool Size.

An ODP.NET trace would provide info on the timeout reason.

@WalterDias
Copy link
Author

WalterDias commented Sep 25, 2024

Okay make sense, but....

even if I disable the Pooling = false on the connection string when using the code provided by you guys (https://github.com/oracle/dotnet-db-samples/blob/master/samples/async/async.cs) on the stress test scenario the One or more errors occurred. (ORA-50000: Connection request timed out) and our DBA tell us that no connection was established on the oracle side all of it does not leave the app server, and again using the sample provided by you guys.

Even puting the following configuration Min Pool Size=512;Max Pool Size=1000;Incr Pool Size=20;Decr Pool Size=10;Connection Timeout=60 no connection was open on the server side and all of then recieves ORA-50000: Connection request timed out. Again using the same code provided on the examples

using (OracleConnection con = new OracleConnection(ConnectionString))
{
    Task task = con.OpenAsync();

    string cmdText = "SELECT * FROM TELE FETCH FIRST 100 ROWS ONLY";
    using (OracleCommand cmd = new OracleCommand(cmdText, con))
    {
        // Retrieve open connection
        await task;
        using (OracleDataReader reader = await cmd.ExecuteReaderAsync())
        {
            await reader.ReadAsync();
        }
    }
}

There is anythink that I can do to help troubleshoot the problem?

@alexkeh
Copy link
Member

alexkeh commented Sep 26, 2024

If no connection is established, that could mean that the DB is so overloaded with requests that it can't even deliver one connection. That's really unusual. How many connection requests are you trying to make?

If you're going to warm up the pool, you shouldn't immediately start requesting lots of connections. ODP.NET will start to populate your pool to the Min Pool Size after the first Open() call. The warm up period works like this:

  1. Call Open()
  2. Give ODP.NET time to build up to your Min Pool Size
  3. Start requesting connections once the pool is populated

Fundamentally, this is a performance and throughput issue, not a functional issue. Those type of issues are dependent on hardware, software, configuration, network resources, and the load.

For example, if the DB is slow issuing connections, you get a timeout. In that case, you have to look for ways to speed up the DB to serve how quickly connections are coming in. If that's not possible, admins can set an Oracle listener rate limiter, which will slow down how quickly connection requests are coming in to prevent the DB from being swamped. Now, it's possible setting this rate limit could lead to timeouts.

To start troubleshooting, generate an ODP.NET trace at level 7. The trace will have more ODP.NET status as each connection request comes in.

@WalterDias
Copy link
Author

The log is quite big this is a section from what I get from it:

2024-09-26 18:45:08.477922 TID:20  (PUB) (ENT) OracleConnection.OpenAsync()
2024-09-26 18:45:08.477926 TID:20  (PUB) (ENT) OracleConnection.OpenAsyncHelper()
2024-09-26 18:45:08.477932 TID:20  (PUB) (ENT) OracleConnection.OpenInternal() (conid=17882254) (state=Connecting) (sessid=0) (implid=0) (pooling=T) (txnid=n/a) 
2024-09-26 18:45:08.477936 TID:20  (PRI) (ENT) (CP) OracleConnectionDispenser.Get()
2024-09-26 18:45:08.477941 TID:20  (PRI) (ENT) (CP) OraclePoolManager.Get()
2024-09-26 18:45:08.477949 TID:20  (PRI) (ENT) (CP) PoolManager.Get() (txnid=n/a) (bForceMatch=F)
2024-09-26 18:45:08.477955 TID:20  (PRI) (ENT) (CP) PoolManager.Get() MultiTenant : Searching for a idle connection, retryCountWithoutAffinity: 0 (pm.get.loop.count=0)
2024-09-26 18:45:08.477959 TID:20  (PRI) (ENT) (CP) PoolManager.Get() WaitHandle.WaitAny(m_IdleAndMaxHandles) : (bIdleSemObtained=False) (bMaxSemObtained=False) (bIncrSemObtained=False) (m_pmListPR.Count=28)
2024-09-26 18:45:08.477966 TID:20  (PRI) (EXT) (CP) PoolManager.Get() WaitHandle.WaitAny(m_IdleAndMaxHandles) : (hIndex = 1) (max semaphore) : (bIdleSemObtained=False) (bMaxSemObtained=True) (bIncrSemObtained=False) (m_pmListPR.Count=28)
2024-09-26 18:45:08.477972 TID:20  (PRI) (ENT) PoolManager.ProcessCriteriaCtx_NonEnlistedConnection()
2024-09-26 18:45:08.477976 TID:20  (PRI) (EXT) PoolManager.ProcessCriteriaCtx_NonEnlistedConnection()
2024-09-26 18:45:08.477979 TID:20  (PRI) (ENT) (CP) PoolManager.Get() WaitOne(m_semIncrPoolSize) : (bIdleSemObtained=False) (bMaxSemObtained=True) (bIncrSemObtained=False) (m_pmListPR.Count=28)
2024-09-26 18:45:08.477984 TID:20  (PRI) (EXT) (CP) PoolManager.Get() WaitOne(m_semIncrPoolSize) : (bIdleSemObtained=False) (bMaxSemObtained=True) (bIncrSemObtained=False) (m_pmListPR.Count=28)
2024-09-26 18:45:08.477990 TID:20  (PRI) (ENT) (CP) PoolManager.Get() MultiTenant : Searching for a idle connection, retryCountWithoutAffinity: 0 (pm.get.loop.count=1)
2024-09-26 18:45:08.477994 TID:20  (PRI) (ENT) (CP) PoolManager.Get() WaitAny(m_IdleAndIncrHandles) : (bIdleSemObtained=False) (bMaxSemObtained=True) (bIncrSemObtained=False) (m_pmListPR.Count=28)
2024-09-26 18:45:38.475726 TID:20  (PRI) (ENT) PoolManager.ProcessCriteriaCtx_NonEnlistedConnection()
2024-09-26 18:45:38.475816 TID:20  (PRI) (EXT) PoolManager.ProcessCriteriaCtx_NonEnlistedConnection()
2024-09-26 18:45:38.475852 TID:20  (PRI) (EXT) (CP) PoolManager.Get() WaitHandle.WaitAny() : (hIndex = WaitHandle.WaitTimeout) (TIMEOUT) (B) : (bIdleSemObtained=False) (bMaxSemObtained=True) (bIncrSemObtained=False) (m_pmListPR.Count=28)
2024-09-26 18:45:38.475865 TID:20  (PRI) (CP)  PoolManager.Get() [MAX:REL:21] PM.Get() : Max Sempahore count=865
2024-09-26 18:45:38.475880 TID:20  (PRI) (CP)  PoolManager.Get() [MAX:REL:12] PM.Get() : Max Sempahore count=865
2024-09-26 18:45:38.475884 TID:20  (PRI) (ENT) (CP) PoolManager.Get() (TIMEOUT) (C)
2024-09-26 18:45:38.475974 TID:20  (PRI) (CP)  PoolManager.Get() (DUMP)
(CP) =================================
(CP) Data Source=oracle;Pooling=true;User Id=test;Password=;Min Pool Size=64;Max Pool Size=1000;Incr Pool Size=20;Decr Pool Size=10;Connection Timeout=30
(CP) pmid=39774547
(CP) =================================
(CP) dbprderp , DBPRDERP1:
(CP)   list  (count:10) : 
(2196:20485:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(2451:32396:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(1965:9346:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(496:45665:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(1986:25899:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(510:29255:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(1333:40137:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(1317:39357:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(1324:54939:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(2393:31637:T:F:F:F:False:N:dbprderp::DBPRDERP1)

(CP)   queue (count:0) : 
(CP) dbprderp , DBPRDERP2:
(CP)   list  (count:18) : 
(1105:53623:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(2002:33363:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(2886:50980:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(2014:30062:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1114:27252:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(2224:60978:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1538:21604:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(479:11529:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1976:44857:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1112:54965:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(2179:58272:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1807:694:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(227:33403:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(675:44164:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(483:56542:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1545:33881:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1742:14104:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(20:46000:T:F:F:F:False:N:dbprderp::DBPRDERP2)

(CP)   queue (count:0) : 
get::timeout::2
2024-09-26 18:45:38.475995 TID:20  (PRI) (ENT) OracleException.ctor()
2024-09-26 18:45:38.476005 TID:20  (PRI) (ENT) OracleError.ctor()
2024-09-26 18:45:38.476016 TID:20  (PRI) (EXT) OracleError.ctor()
2024-09-26 18:45:38.476021 TID:20  (PRI) (ENT) OracleError.ctor()
2024-09-26 18:45:38.476024 TID:20  (PRI) (EXT) OracleError.ctor()
2024-09-26 18:45:38.476028 TID:20  (PRI) (EXT) OracleException.ctor()
2024-09-26 18:45:38.476146 TID:20  (PRI) (EXT) (CP) PoolManager.Get() (txnid=n/a) PM.Get(aff=;force=F) returning (null)
2024-09-26 18:45:38.476284 TID:20  (PRI) (ENT) OracleException.HandleError() from OraclePoolManager.Get()
2024-09-26 18:45:38.476544 TID:20  (PRI) (ERR) (CP) OracleException.HandleError() from OraclePoolManager.Get()(txnid=n/a) Oracle.ManagedDataAccess.Client.OracleException (0x80004005): ORA-50000: Connection request timed out
   at OracleInternal.ConnectionPool.PoolManager`3.GetAsync(ConnectionString csWithDiffOrNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OraclePoolManager.GetAsync(ConnectionString csWithNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
2024-09-26 18:45:38.476557 TID:20  (PRI) (EXT) OracleException.HandleError() from OraclePoolManager.Get()
2024-09-26 18:45:38.476603 TID:20  (PRI) (EXT) (CP) OraclePoolManager.Get() (txnid=n/a) 
2024-09-26 18:45:38.476720 TID:20  (PRI) (ENT) OracleException.HandleError() from OracleConnectionDispenser.Get()
2024-09-26 18:45:38.476964 TID:20  (PRI) (ERR) (CP) OracleException.HandleError() from OracleConnectionDispenser.Get()(txnid=n/a) Oracle.ManagedDataAccess.Client.OracleException (0x80004005): ORA-50000: Connection request timed out
   at OracleInternal.ConnectionPool.PoolManager`3.GetAsync(ConnectionString csWithDiffOrNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OraclePoolManager.GetAsync(ConnectionString csWithNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OracleConnectionDispenser`3.GetAsync(ConnectionString cs, PM conPM, ConnectionString pmCS, OracleOpaqueString securedPassw, OracleOpaqueString securedProxyPassw, OracleConnection connRefForCriteria, Boolean bAsync)
2024-09-26 18:45:38.476975 TID:20  (PRI) (EXT) OracleException.HandleError() from OracleConnectionDispenser.Get()
2024-09-26 18:45:38.477004 TID:20  (PRI) (EXT) (CP) OracleConnectionDispenser.Get() (txnid=n/a) 
2024-09-26 18:45:38.477047 TID:20  (PRI) (ENT) OracleException.HandleError() from OracleConnection.OpenInternal()
2024-09-26 18:45:38.477338 TID:20  (PUB) (ERR) OracleException.HandleError() from OracleConnection.OpenInternal()(txnid=n/a) Oracle.ManagedDataAccess.Client.OracleException (0x80004005): ORA-50000: Connection request timed out
   at OracleInternal.ConnectionPool.PoolManager`3.GetAsync(ConnectionString csWithDiffOrNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OraclePoolManager.GetAsync(ConnectionString csWithNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OracleConnectionDispenser`3.GetAsync(ConnectionString cs, PM conPM, ConnectionString pmCS, OracleOpaqueString securedPassw, OracleOpaqueString securedProxyPassw, OracleConnection connRefForCriteria, Boolean bAsync)
   at Oracle.ManagedDataAccess.Client.OracleConnectionInternal.OpenInternalAsync(Boolean bAsync)
2024-09-26 18:45:38.477369 TID:20  (PRI) (EXT) OracleException.HandleError() from OracleConnection.OpenInternal()
2024-09-26 18:45:38.477411 TID:20  (PUB) (EXT) OracleConnection.OpenInternal() (conid=17882254) (state=Connecting) (sessid=0) (implid=0) (pooling=T) (txnid=n/a) 
2024-09-26 18:45:38.477488 TID:20  (PUB)       Task Faulted for OracleConnectionHashCode: 17882254
2024-09-26 18:45:38.477924 TID:20  (PUB)       Set Exception on Task: Oracle.ManagedDataAccess.Client.OracleException (0x80004005): ORA-50000: Connection request timed out
   at OracleInternal.ConnectionPool.PoolManager`3.GetAsync(ConnectionString csWithDiffOrNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OraclePoolManager.GetAsync(ConnectionString csWithNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OracleConnectionDispenser`3.GetAsync(ConnectionString cs, PM conPM, ConnectionString pmCS, OracleOpaqueString securedPassw, OracleOpaqueString securedProxyPassw, OracleConnection connRefForCriteria, Boolean bAsync)
   at Oracle.ManagedDataAccess.Client.OracleConnectionInternal.OpenInternalAsync(Boolean bAsync)
   at Oracle.ManagedDataAccess.Client.OracleConnectionInternal.OpenAsyncHelper(CancellationToken cancellationToken)
   at Oracle.ManagedDataAccess.Client.OracleConnectionInternal.OpenAsyncHelper(CancellationToken cancellationToken)

@alexkeh
Copy link
Member

alexkeh commented Sep 26, 2024

@WalterDias Can you email the full trace and all the trace files to dotnet_us(at)oracle.com?

I see the trace jump 30 seconds from 2024-09-26 18:45:08.477994 to 2024-09-26 18:45:38.475726 with nothing written to the trace in between. Since it's async, activity could be occurring on different threads, which means if you set up TraceOption of 1, the other threads will be written to another trace file.

@CavidH
Copy link

CavidH commented Sep 27, 2024

The log is quite big this is a section from what I get from it:

2024-09-26 18:45:08.477922 TID:20  (PUB) (ENT) OracleConnection.OpenAsync()
2024-09-26 18:45:08.477926 TID:20  (PUB) (ENT) OracleConnection.OpenAsyncHelper()
2024-09-26 18:45:08.477932 TID:20  (PUB) (ENT) OracleConnection.OpenInternal() (conid=17882254) (state=Connecting) (sessid=0) (implid=0) (pooling=T) (txnid=n/a) 
2024-09-26 18:45:08.477936 TID:20  (PRI) (ENT) (CP) OracleConnectionDispenser.Get()
2024-09-26 18:45:08.477941 TID:20  (PRI) (ENT) (CP) OraclePoolManager.Get()
2024-09-26 18:45:08.477949 TID:20  (PRI) (ENT) (CP) PoolManager.Get() (txnid=n/a) (bForceMatch=F)
2024-09-26 18:45:08.477955 TID:20  (PRI) (ENT) (CP) PoolManager.Get() MultiTenant : Searching for a idle connection, retryCountWithoutAffinity: 0 (pm.get.loop.count=0)
2024-09-26 18:45:08.477959 TID:20  (PRI) (ENT) (CP) PoolManager.Get() WaitHandle.WaitAny(m_IdleAndMaxHandles) : (bIdleSemObtained=False) (bMaxSemObtained=False) (bIncrSemObtained=False) (m_pmListPR.Count=28)
2024-09-26 18:45:08.477966 TID:20  (PRI) (EXT) (CP) PoolManager.Get() WaitHandle.WaitAny(m_IdleAndMaxHandles) : (hIndex = 1) (max semaphore) : (bIdleSemObtained=False) (bMaxSemObtained=True) (bIncrSemObtained=False) (m_pmListPR.Count=28)
2024-09-26 18:45:08.477972 TID:20  (PRI) (ENT) PoolManager.ProcessCriteriaCtx_NonEnlistedConnection()
2024-09-26 18:45:08.477976 TID:20  (PRI) (EXT) PoolManager.ProcessCriteriaCtx_NonEnlistedConnection()
2024-09-26 18:45:08.477979 TID:20  (PRI) (ENT) (CP) PoolManager.Get() WaitOne(m_semIncrPoolSize) : (bIdleSemObtained=False) (bMaxSemObtained=True) (bIncrSemObtained=False) (m_pmListPR.Count=28)
2024-09-26 18:45:08.477984 TID:20  (PRI) (EXT) (CP) PoolManager.Get() WaitOne(m_semIncrPoolSize) : (bIdleSemObtained=False) (bMaxSemObtained=True) (bIncrSemObtained=False) (m_pmListPR.Count=28)
2024-09-26 18:45:08.477990 TID:20  (PRI) (ENT) (CP) PoolManager.Get() MultiTenant : Searching for a idle connection, retryCountWithoutAffinity: 0 (pm.get.loop.count=1)
2024-09-26 18:45:08.477994 TID:20  (PRI) (ENT) (CP) PoolManager.Get() WaitAny(m_IdleAndIncrHandles) : (bIdleSemObtained=False) (bMaxSemObtained=True) (bIncrSemObtained=False) (m_pmListPR.Count=28)
2024-09-26 18:45:38.475726 TID:20  (PRI) (ENT) PoolManager.ProcessCriteriaCtx_NonEnlistedConnection()
2024-09-26 18:45:38.475816 TID:20  (PRI) (EXT) PoolManager.ProcessCriteriaCtx_NonEnlistedConnection()
2024-09-26 18:45:38.475852 TID:20  (PRI) (EXT) (CP) PoolManager.Get() WaitHandle.WaitAny() : (hIndex = WaitHandle.WaitTimeout) (TIMEOUT) (B) : (bIdleSemObtained=False) (bMaxSemObtained=True) (bIncrSemObtained=False) (m_pmListPR.Count=28)
2024-09-26 18:45:38.475865 TID:20  (PRI) (CP)  PoolManager.Get() [MAX:REL:21] PM.Get() : Max Sempahore count=865
2024-09-26 18:45:38.475880 TID:20  (PRI) (CP)  PoolManager.Get() [MAX:REL:12] PM.Get() : Max Sempahore count=865
2024-09-26 18:45:38.475884 TID:20  (PRI) (ENT) (CP) PoolManager.Get() (TIMEOUT) (C)
2024-09-26 18:45:38.475974 TID:20  (PRI) (CP)  PoolManager.Get() (DUMP)
(CP) =================================
(CP) Data Source=oracle;Pooling=true;User Id=test;Password=;Min Pool Size=64;Max Pool Size=1000;Incr Pool Size=20;Decr Pool Size=10;Connection Timeout=30
(CP) pmid=39774547
(CP) =================================
(CP) dbprderp , DBPRDERP1:
(CP)   list  (count:10) : 
(2196:20485:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(2451:32396:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(1965:9346:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(496:45665:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(1986:25899:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(510:29255:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(1333:40137:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(1317:39357:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(1324:54939:T:F:F:F:False:N:dbprderp::DBPRDERP1)
(2393:31637:T:F:F:F:False:N:dbprderp::DBPRDERP1)

(CP)   queue (count:0) : 
(CP) dbprderp , DBPRDERP2:
(CP)   list  (count:18) : 
(1105:53623:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(2002:33363:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(2886:50980:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(2014:30062:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1114:27252:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(2224:60978:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1538:21604:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(479:11529:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1976:44857:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1112:54965:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(2179:58272:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1807:694:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(227:33403:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(675:44164:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(483:56542:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1545:33881:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(1742:14104:T:F:F:F:False:N:dbprderp::DBPRDERP2)
(20:46000:T:F:F:F:False:N:dbprderp::DBPRDERP2)

(CP)   queue (count:0) : 
get::timeout::2
2024-09-26 18:45:38.475995 TID:20  (PRI) (ENT) OracleException.ctor()
2024-09-26 18:45:38.476005 TID:20  (PRI) (ENT) OracleError.ctor()
2024-09-26 18:45:38.476016 TID:20  (PRI) (EXT) OracleError.ctor()
2024-09-26 18:45:38.476021 TID:20  (PRI) (ENT) OracleError.ctor()
2024-09-26 18:45:38.476024 TID:20  (PRI) (EXT) OracleError.ctor()
2024-09-26 18:45:38.476028 TID:20  (PRI) (EXT) OracleException.ctor()
2024-09-26 18:45:38.476146 TID:20  (PRI) (EXT) (CP) PoolManager.Get() (txnid=n/a) PM.Get(aff=;force=F) returning (null)
2024-09-26 18:45:38.476284 TID:20  (PRI) (ENT) OracleException.HandleError() from OraclePoolManager.Get()
2024-09-26 18:45:38.476544 TID:20  (PRI) (ERR) (CP) OracleException.HandleError() from OraclePoolManager.Get()(txnid=n/a) Oracle.ManagedDataAccess.Client.OracleException (0x80004005): ORA-50000: Connection request timed out
   at OracleInternal.ConnectionPool.PoolManager`3.GetAsync(ConnectionString csWithDiffOrNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OraclePoolManager.GetAsync(ConnectionString csWithNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
2024-09-26 18:45:38.476557 TID:20  (PRI) (EXT) OracleException.HandleError() from OraclePoolManager.Get()
2024-09-26 18:45:38.476603 TID:20  (PRI) (EXT) (CP) OraclePoolManager.Get() (txnid=n/a) 
2024-09-26 18:45:38.476720 TID:20  (PRI) (ENT) OracleException.HandleError() from OracleConnectionDispenser.Get()
2024-09-26 18:45:38.476964 TID:20  (PRI) (ERR) (CP) OracleException.HandleError() from OracleConnectionDispenser.Get()(txnid=n/a) Oracle.ManagedDataAccess.Client.OracleException (0x80004005): ORA-50000: Connection request timed out
   at OracleInternal.ConnectionPool.PoolManager`3.GetAsync(ConnectionString csWithDiffOrNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OraclePoolManager.GetAsync(ConnectionString csWithNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OracleConnectionDispenser`3.GetAsync(ConnectionString cs, PM conPM, ConnectionString pmCS, OracleOpaqueString securedPassw, OracleOpaqueString securedProxyPassw, OracleConnection connRefForCriteria, Boolean bAsync)
2024-09-26 18:45:38.476975 TID:20  (PRI) (EXT) OracleException.HandleError() from OracleConnectionDispenser.Get()
2024-09-26 18:45:38.477004 TID:20  (PRI) (EXT) (CP) OracleConnectionDispenser.Get() (txnid=n/a) 
2024-09-26 18:45:38.477047 TID:20  (PRI) (ENT) OracleException.HandleError() from OracleConnection.OpenInternal()
2024-09-26 18:45:38.477338 TID:20  (PUB) (ERR) OracleException.HandleError() from OracleConnection.OpenInternal()(txnid=n/a) Oracle.ManagedDataAccess.Client.OracleException (0x80004005): ORA-50000: Connection request timed out
   at OracleInternal.ConnectionPool.PoolManager`3.GetAsync(ConnectionString csWithDiffOrNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OraclePoolManager.GetAsync(ConnectionString csWithNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OracleConnectionDispenser`3.GetAsync(ConnectionString cs, PM conPM, ConnectionString pmCS, OracleOpaqueString securedPassw, OracleOpaqueString securedProxyPassw, OracleConnection connRefForCriteria, Boolean bAsync)
   at Oracle.ManagedDataAccess.Client.OracleConnectionInternal.OpenInternalAsync(Boolean bAsync)
2024-09-26 18:45:38.477369 TID:20  (PRI) (EXT) OracleException.HandleError() from OracleConnection.OpenInternal()
2024-09-26 18:45:38.477411 TID:20  (PUB) (EXT) OracleConnection.OpenInternal() (conid=17882254) (state=Connecting) (sessid=0) (implid=0) (pooling=T) (txnid=n/a) 
2024-09-26 18:45:38.477488 TID:20  (PUB)       Task Faulted for OracleConnectionHashCode: 17882254
2024-09-26 18:45:38.477924 TID:20  (PUB)       Set Exception on Task: Oracle.ManagedDataAccess.Client.OracleException (0x80004005): ORA-50000: Connection request timed out
   at OracleInternal.ConnectionPool.PoolManager`3.GetAsync(ConnectionString csWithDiffOrNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OraclePoolManager.GetAsync(ConnectionString csWithNewPassw, Boolean bGetForApp, OracleConnection connRefForCriteria, String affinityInstanceName, Boolean bForceMatch, Boolean bAsync)
   at OracleInternal.ConnectionPool.OracleConnectionDispenser`3.GetAsync(ConnectionString cs, PM conPM, ConnectionString pmCS, OracleOpaqueString securedPassw, OracleOpaqueString securedProxyPassw, OracleConnection connRefForCriteria, Boolean bAsync)
   at Oracle.ManagedDataAccess.Client.OracleConnectionInternal.OpenInternalAsync(Boolean bAsync)
   at Oracle.ManagedDataAccess.Client.OracleConnectionInternal.OpenAsyncHelper(CancellationToken cancellationToken)
   at Oracle.ManagedDataAccess.Client.OracleConnectionInternal.OpenAsyncHelper(CancellationToken cancellationToken)

how to enable this logging

@WalterDias
Copy link
Author

how to enable this logging

@CavidH to enable the the log just use the following statments:

OracleConfiguration.OracleDataSources.Add("oracle", "<YOUR_DATA_SOURCE>");

// Set tracing options
OracleConfiguration.TraceFileLocation = @"/tmp/oracle";
OracleConfiguration.TraceLevel = 7;

I'm base my configuration on this: https://github.com/oracle/dotnet-db-samples/blob/master/samples/configuration-api/configuration-class.cs

@alexkeh indeed the options is was 1 change to 0 as sugested, and here is the log

WEBTESTCONNECTION.DLL_PID_2615_DATE_2024_09_27_TIME_09_28_02_279799.zip

@CavidH
Copy link

CavidH commented Sep 27, 2024

@alexkeh indeed the options is was 1 change to 0 as sugested, and here is the log

This has been very helpful to me, thank you. I should mention that I encountered this error while load testing my services (after async migration). I think that opening async connections happens too quickly and the Oracle TNS listener can't handle it. Maybe I’m wrong :)

@alexkeh
Copy link
Member

alexkeh commented Sep 27, 2024

Thanks @WalterDias! We'll review the trace.

@CavidH We'll see what the trace has. You can set an Oracle listener connection rate limit if you need to slow down the requests coming into the listener so that the DB is not overwhelmed by connection requests if that is the bottleneck.

@CavidH
Copy link

CavidH commented Sep 27, 2024

@CavidH We'll see what the trace has. You can set an Oracle listener connection rate limit if you need to slow down the requests coming into the listener so that the DB is not overwhelmed by connection requests if that is the bottleneck.

How exactly will it limit the connections? Will requests exceeding the limit wait, or will it throw an exception?

@alexkeh
Copy link
Member

alexkeh commented Sep 27, 2024

Yes, the listener will prevent more than defined rate of connections from coming in. If those connections have to wait beyond the timeout value they will have connection timeout error.

If it's a problem, consider solutions similar to dealing with any other connection storms. Figure out where your bottleneck is and increase capacity for that bottleneck. For example, if the DB requires more CPUs to handle a load surge, increase CPUs in real time, which can be done with Oracle Autonomous Database.

Another alternative is to spread the load so that connection opening isn't concentrated at one time. For example, if you know that all your users log in at 9 AM in the morning, create a larger pool of connections ready to dispense before 9 AM so that fewer new connections need to be created. This can be done with a high Min Pool Size and with a larger Incr Pool Size so that your pool can grow fast.

@CavidH
Copy link

CavidH commented Sep 27, 2024

Yes, the listener will prevent more than defined rate of connections from coming in. If those connections have to wait beyond the timeout value they will have connection timeout error.

If it's a problem, consider solutions similar to dealing with any other connection storms. Figure out where your bottleneck is and increase capacity for that bottleneck. For example, if the DB requires more CPUs to handle a load surge, increase CPUs in real time, which can be done with Oracle Autonomous Database.

Another alternative is to spread the load so that connection opening isn't concentrated at one time. For example, if you know that all your users log in at 9 AM in the morning, create a larger pool of connections ready to dispense before 9 AM so that fewer new connections need to be created. This can be done with a high Min Pool Size and with a larger Incr Pool Size so that your pool can grow fast.

Thank you for the valuable information

@WalterDias
Copy link
Author

Thanks @WalterDias! We'll review the trace.

@CavidH We'll see what the trace has. You can set an Oracle listener connection rate limit if you need to slow down the requests coming into the listener so that the DB is not overwhelmed by connection requests if that is the bottleneck.

Thanks for the support

@WalterDias
Copy link
Author

@alexkeh any thoughts about the problem?

@alexkeh
Copy link
Member

alexkeh commented Sep 30, 2024

@WalterDias I haven't heard back from my dev team yet. We are planning to ship ODP.NET 23.6 soon. So, they are preoccupied with getting that release out.

@alexkeh
Copy link
Member

alexkeh commented Oct 7, 2024

We've been able to reproduce this issue in house and are analyzing it further. Bug 37144163 has been filed to track the issue and eventual resolution.

@WalterDias
Copy link
Author

Thx for the support @alexkeh, I perform the same tests with 23.6 and de issue still there. I you guys need any further test let me know.

At.
Walter Dias

@alexkeh
Copy link
Member

alexkeh commented Oct 7, 2024

@WalterDias
After some investigation, the problem appears to be not enough threads available in the T=thread pool to finish opening connections when many are being requested near-simultaneously. Can add the following code change before opening a connection to see if it resolves the timeout issue?

ThreadPool.SetMinThreads(512, 10);

This will create 512 worker threads available for creating the connections. You can adjust this level up or down based on the number of users you're creating in rapid succession.

In our tests, when we increased the thread pool, it resolved the timeout problem. If it does the same for you, then we can be fairly sure the lack of threads is the problem in your case as well.

@WalterDias
Copy link
Author

WalterDias commented Oct 8, 2024

Hi @alexkeh I did the test and work fine, but as you can se on the video it took 7 seconds to response start to come (#412).

My only concern to change this value from the standard one was the unknown impact mentioned on https://learn.microsoft.com/en-us/dotnet/api/system.threading.threadpool.setminthreads?view=net-8.0

image

2024-10-08.10-48-18.mp4

feel free to reach out to any further information all the tests were performed at version 23.6.0, driver log from tests.

This file is compress with 7z but since github only allows .zip was compressed 2x lol Uploading trace.7z.zip…

@alexkeh
Copy link
Member

alexkeh commented Oct 8, 2024

@WalterDias That's interesting! Thanks for trying it out.

The link to the 7z.zip isn't working. Can you fix it? Or you can email the trace to dotnet_us(at)oracle.com if it's not too big to send via email.

@WalterDias
Copy link
Author

@alexkeh the file was zip and inside has a 7z file.

@alexkeh
Copy link
Member

alexkeh commented Oct 8, 2024

@WalterDias The link itself is not valid. There is no download for me to look at. The link leads to an invalid GitHub page: https://github.com/oracle/dotnet-db-samples/issues/Logs

@alexkeh
Copy link
Member

alexkeh commented Oct 8, 2024

@WalterDias I reviewed your trace. The overall test run took just over 11 seconds using ODP.NET. As I scroll through the trace, I don't see any single operation in which there is 7 second delay as in issue #412. In that issue, we see that as ODP.NET tries the DB IP takes over 7 seconds. For the same operation, ODP.NET executes quickly:

2024-10-08 13:42:34.613154 TID:245 (NET)       (CID2) Trying (host=10.97.0.160) (port=1521)
2024-10-08 13:42:34.613203 TID:245 (NET) (EXT) (CID2) TcpTransportAdapter.Connect()
2024-10-08 13:42:34.613178 TID:246 (NET)       (CID5) Trying (host=10.97.0.160) (port=1521)
2024-10-08 13:42:34.613198 TID:246 (NET) (EXT) (CID5) TcpTransportAdapter.Connect()

@WalterDias
Copy link
Author

Okay @alexkeh I will add SetMinThreads as workaround as suggested and await for the fix.

I did this stress test to simulate what happen eventualy in our production environment, since we are a retail if more than 80 stores and a online marketplace so we have some bursts.

@alexkeh alexkeh added this to the ODP.NET 23.6.1 milestone Oct 16, 2024
@alexkeh
Copy link
Member

alexkeh commented Oct 16, 2024

There will be a new release (23.6.1) that will fix this bug.

@mihaimyh
Copy link

mihaimyh commented Nov 5, 2024

We are using Oracle.EntityFrameworkCore and experiencing the same problem.

In this case ThreadPool.SetMinThreads(512, 10); cannot be used since the connection opening is handled by the EF Core DbContext, isn't it?

@alexkeh
Copy link
Member

alexkeh commented Nov 5, 2024

We should have ODP.NET 23.6.1 available shortly to likely resolve this scalability issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants