2

Our customer is having a very peculiar problem with our C#/ASP.NET application when they issue requests that access their Oracle XE 11g database. This happens via plain SQL over ADO.NET using the ODP.NET NuGet package.

Our application is hosted on a public server, while this particular database is hosted privately on customer's premises. I.e., we've been given a connection string with credentials and they've whitelisted our public IP in order for the application to see and communicate with their database.

The scheme works perfectly fine most of the time. There is, however, a certain set of queries (a single base query with slightly different parameters, actually) that cause a hang on the following line of C# code:

var reader = cmd.ExecuteReader(CommandBehavior.Default);

The SELECT is not trivial, but it isn't terribly complex, either:

SELECT "T00_".*
FROM (
 SELECT ROWNUM "__ROWNUM",
  "T0_".*
 FROM (
  SELECT "T1_"."FIELD1" "F1",
   "T2_"."FIELD2" "F2",
   "T3_"."FIELD3" "F3",
   "T4_"."FIELD4" "F4"
  FROM "TABLE1" "T1_"
  LEFT OUTER JOIN "TABLE2" "T2_" ON ("T1_"."KEY2" = "T2_"."KEY")
  LEFT OUTER JOIN "TABLE3" "T3_" ON ("T1_"."KEY3" = "T3_"."KEY")
  LEFT OUTER JOIN "TABLE4" "T4_" ON ("T1_"."KEY4" = "T4_"."KEY")
  ORDER BY "T1_"."KEY2",
   "T1_"."KEY3"
  ) "T0_"
 ) "T00_"
WHERE "T00_"."__ROWNUM" <= 50

There are several hundred rows in TABLE1 and the above SELECT returns 50 of them, as expected. But changing it only slightly, e.g. requesting 62 rows instead of 50, or adding a field from one of the JOIN-ed tables to ORDER BY, is enough to cause a hang. No error, no timeout - the calling thread simply hangs forever on the ExecuteReader line.

We've experimented quite a lot, but the changes that cause the hang seem to make no particular sense.

The customer claims their other (local) applications have no problems with this database server and therefore the problem must be in our application. I'm inclined to agree, because all queries run fine when executed via TOAD. They only hang when executed from our application. I haven't been able to connect the dots yet...

The confusing part is, our ASP.NET application successfully retrieves data for our customers from a number of connected database engines, including other Oracle versions and I've seen no such problems there.

Has anyone else experienced a similar behavior with Oracle XE via ADO.NET? What do you suggest I try on our end? Telling the customer to switch databases is not an option at the moment.

aoven
  • 2,248
  • 2
  • 25
  • 36
  • Have you checked the indexes that are being used? – JuanR Jan 03 '17 at 16:05
  • Can you post the execution plans? – Wernfried Domscheit Jan 03 '17 at 16:07
  • Also, it's possible you are getting reader locks. Are you able to run the reader with NOLOCK? Try that and see if it solves the problem. – JuanR Jan 03 '17 at 16:09
  • 2
    Inspecting the execution plan didn't reveal anything suspicious. The indexes were used as expected and TOAD managed to execute the same SELECT speedily and without a single problem. It turns out our application stops exhibiting the problem if I change the FetchSize property on the OracleCommand instance before calling ExecuteReader. The default value of 128 kb appears to be problematic, which leads me to believe we've found a bug in the ODP.NET library. Cannot be 100% sure yet. – aoven Jan 04 '17 at 15:46
  • I have this problem too with very large resultsets (25M records). When I set the FetchSize manually to 10000 it solves the problem. There is probably an optimal FetchSize for large queries, I will experiment some more and report on which size works best for me. – Martien de Jong Jun 26 '19 at 08:49

1 Answers1

0

There seems to be an issue with the default value of the FetchSize property on de OracleCommand class that results in a lock after using the ExecuteReader() function. Manually changing the FetchSize property solved this issue for me:

((OracleCommand)cmd).FetchSize = 1000000;

The value 1000000 enables me to write 100K records to a file in about 20 seconds. With lower FetchSize values (1000-10000) this increases to about 60-100 seconds.

Martien de Jong
  • 731
  • 1
  • 7
  • 19
  • 1
    FWIW, I never really solved this one, at least not by code. I've toyed with the FetchSize setting among many other things, yes. Unfortunately, after some time, the problem always came back, no matter what I did. In the end, we attributed the problem to that specific customer's combination of Oracle and network hardware, because it was the only one among many other Oracle-using customers that was experiencing these unpredictable hangs. Requesting an occasional restart of their Oracle database service was as good a solution as any wild configuration tweak I could come up in code. HTH. – aoven Jun 28 '19 at 12:04
  • For me it seems to act properly now. Have you looked at the version of the Oracle client libaries? I have had some weird issues with different versions of the Oracle client, it seems that very specific configurations are needed to be able to use integration with other software like .NET. If minor settings in the installation are different I get incomprehensible errors. – Martien de Jong Jun 30 '19 at 15:01
  • 1
    I'm using managed ODP.NET, which doesn't require Oracle Client installation and has no other special configuration beside the connection string and what the code specifies. I'm glad the change worked for you! – aoven Jun 30 '19 at 16:17
  • My apologies, I am using the Oracle Managed Data Access NuGet package, I was using the regular Data Access package in the past, which requires the Oracle client. I hadn't noticed that I don't need the Oracle drivers anymore. https://stackoverflow.com/questions/7819861/does-odp-net-require-oracle-client-installation/18204459 – Martien de Jong Jul 01 '19 at 08:05