I have a VB.NET program using .NET Framework 3.5 and Oracle.DataAccess 19.1 that connects to a Oracle remote database server 11g (pooling=false) to query data.
The program running fine but once in a blue moon (not fix timing, can be as soon as 2 weeks or few month later), the program just hang. After few incidents (adding log after every incidents), we're able to nail down and identify the line of program which causing the hang, which is oracleDataAdapter.Fill(dataset) as shown below code snippet.
Private Function FetchData(connection As OracleConnection, str_SQL As String, dataset As DataSet) As Integer
Dim oracleDataAdapter As New OracleDataAdapter(connection.CreateCommand())
Try
If connection.State <> ConnectionState.Open Then
connection.Open()
End If
oracleDataAdapter.SelectCommand.AddToStatementCache = False
oracleDataAdapter.SelectCommand.NotificationAutoEnlist = False
oracleDataAdapter.SelectCommand.CommandTimeout = 60
oracleDataAdapter.SelectCommand.CommandText = str_SQL
Dim int_Count As Integer = oracleDataAdapter.Fill(dataset)
Return int_Count
Catch ex As Exception
LogError(ex)
Finally
oracleDataAdapter.Dispose()
End Try
End Function
Even thought we're able to identify the code that hang, but we have no idea why is hang. We did try to break the query by invoking OracleCommand.Cancel but didn't helped.
We though of network issue, thus we try to simulate oracle client and db server communication packet loss by unplugging the network cable at the app server during executing of the query and plug it back few second later. In some cases, the program just hang at the line of data fetching infinitely (as highlighted above code snippet).
We did same test by pointing database to XE 18c. We can't simulate the above hang issue but rather ORA 03113 error was thrown when unplugging and plunging back of network cable during executing query (usually it happened within 60 second). This is still ok as the program able to continue instead of hanging and error handling can do a retry and resume gracefully.
ADDED Aug 7 We developed program (in .NET) to try to reproduce the hang issue. The program just keep fetching data from database. In normal execution, the query will take ~4 seconds to fetch ~140,000 records. The hang able to simulate 4 out of 15 times (~30%).
Below the steps to reproduce the hang
- Run the application. The app just keep querying data from the database server. The function as show below.
- Observe the query execution, at ~ 2nd or 3rd second (based on the log display on the screen), unplug the network cable
- Plug back the network cable at ~15th second
- The program occasionally just hang at the line
oracleDataAdapter.Fill(dataset)
We tested against different version of unmanaged and managed Oracle Client 11, 12 and 19 and database server 11 XE and 18 XE, hang issue able to reproduced.
Is this the behavior or the Oracle Client? Any suggestion to break the execution? Or any suggestion to handle this behavior?