My .NET application receives a stream of data that I need to dump into a few tables in an Oracle database. I have an internal queue that buffers the data and a few threads that read from the queue and convert each item into the corresponding insert statements. I receive roughly 10 data items per second on average, but there can be bursts of over 100 data items per second.
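In simplified form, the pipeline looks something like this (DataItem and WriteToDatabase are stand-ins for my actual types and logic):

using System.Collections.Concurrent;
using System.Threading;

var queue = new BlockingCollection<DataItem>();

// A few background threads drain the queue and write to Oracle.
for (int i = 0; i < 4; i++)
{
    new Thread(() =>
    {
        foreach (DataItem item in queue.GetConsumingEnumerable())
            WriteToDatabase(item);   // placeholder for the insert logic below
    }) { IsBackground = true }.Start();
}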
My first approach was to take each data item, convert it into the corresponding insert statement, and execute the statements one by one. However, this turned out to be too slow, because each insert required a round trip to the database.
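A simplified sketch of that first approach, assuming the managed ODP.NET driver (Oracle.ManagedDataAccess.Client); the column list and DataItem fields are placeholders:

using Oracle.ManagedDataAccess.Client;

void InsertOne(OracleConnection conn, DataItem item)
{
    using (var cmd = conn.CreateCommand())
    {
        // Literal SQL built per item: one round trip each, and every
        // statement text is different from the last.
        cmd.CommandText = string.Format(
            "insert into MyTable (col1, col2, col3) values ({0}, '{1}', '{2}')",
            item.Id, item.Data1, item.Data2);
        cmd.ExecuteNonQuery();
    }
}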
My next approach was to batch the inserts together into groups of up to 50, depending on how many items were in the queue, wrap them in a begin-end block, and push the whole block to the database in one call, like this:
begin
insert into MyTable (col1, col2, col3...) values (123, 'data1', 'data2', ...);
insert into MyTable (col1, col2, col3...) values (456, 'dataX', 'dataY', ...);
insert into MyTable (col1, col2, col3...) values (789, 'dataA', 'dataB', ...);
-- a variable number of inserts...
end;
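On the C# side I simply concatenate the literal statements into one block, roughly like this (again simplified, with the same placeholder names as above):

using System.Text;

var sb = new StringBuilder("begin\n");
foreach (DataItem item in batch)   // batch = up to 50 dequeued items
{
    sb.AppendFormat(
        "insert into MyTable (col1, col2, col3) values ({0}, '{1}', '{2}');\n",
        item.Id, item.Data1, item.Data2);
}
sb.Append("end;");

using (var cmd = conn.CreateCommand())
{
    cmd.CommandText = sb.ToString();   // one round trip for the whole batch
    cmd.ExecuteNonQuery();
}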
This improved performance significantly and I was happy. Then our Oracle guy came to me and told me I was killing his database, because I was sending huge numbers of unique SQL statements, each of which Oracle had to parse and cache. Eventually, Oracle crashed. His suggestion was to always use the same SQL statement with bind variables, so that it doesn't have to be parsed each time.
However, that would bring me back to my original problem: to keep the statement text identical, I would have to run each insert one at a time, with bind variables.
insert into MyTable (col1, col2, col3...) values (:val1, :val2, :val3, ...);
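With ODP.NET, that single parameterized insert would look roughly like this (a sketch; the parameter types are guesses standing in for my real columns):

using (var cmd = conn.CreateCommand())
{
    cmd.BindByName = true;   // bind by name rather than by position
    cmd.CommandText =
        "insert into MyTable (col1, col2, col3) values (:val1, :val2, :val3)";
    cmd.Parameters.Add("val1", OracleDbType.Int32).Value    = item.Id;
    cmd.Parameters.Add("val2", OracleDbType.Varchar2).Value = item.Data1;
    cmd.Parameters.Add("val3", OracleDbType.Varchar2).Value = item.Data2;
    cmd.ExecuteNonQuery();   // statement text never changes, but it is
                             // still one round trip per data item
}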
I could try to combine multiple inserts into a begin-end block again, but then every block would still be a unique SQL statement, since the number of inserts, and therefore the set of bind variable names, varies from batch to batch.
begin
insert into MyTable (col1, col2, col3...) values (:val11, :val12, :val13, ...);
insert into MyTable (col1, col2, col3...) values (:val21, :val22, :val23, ...);
insert into MyTable (col1, col2, col3...) values (:val31, :val32, :val33, ...);
...
end;
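Generating such a block means generating the bind variable names as well, so a batch of N items carries 3*N distinct parameters, and every batch size produces a different statement text. A sketch, with the same placeholder types as above:

var sb = new StringBuilder("begin\n");
using (var cmd = conn.CreateCommand())
{
    cmd.BindByName = true;
    int i = 1;
    foreach (DataItem item in batch)
    {
        // Parameter names must be unique per insert, so they encode the
        // item index -- which is exactly what makes each block unique SQL.
        sb.AppendFormat(
            "insert into MyTable (col1, col2, col3) " +
            "values (:val{0}1, :val{0}2, :val{0}3);\n", i);
        cmd.Parameters.Add("val" + i + "1", OracleDbType.Int32).Value    = item.Id;
        cmd.Parameters.Add("val" + i + "2", OracleDbType.Varchar2).Value = item.Data1;
        cmd.Parameters.Add("val" + i + "3", OracleDbType.Varchar2).Value = item.Data2;
        i++;
    }
    sb.Append("end;");
    cmd.CommandText = sb.ToString();
    cmd.ExecuteNonQuery();
}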
What should I be doing? Should I insert statements one by one with bind variables, but with a huge number of threads? Should I write the data to a CSV file and use SQL*Loader, and if so, how would I deal with CLOB columns? Should I wrap the inserts in a stored procedure and keep my batching approach?
I feel that this must be a very common problem, and that there must be established best practices for this situation.