I have to implement an algorithm on data which is (for good reasons) stored inside SQL server. The algorithm does not fit SQL very well, so I would like to implement it as a CLR function or procedure. Here's what I want to do:
Execute several queries (usually 20-50, but up to 100-200) which all have the form
select a,b,... from some_table order by xyz
. There's an index which fits that query, so the result should be available more or less without any calculation.Consume the results step by step. The exact stepping depends on the results, so it's not exactly predictable.
Aggregate some result by stepping over the results. I will only consume the first parts of the results, but cannot predict how much I will need. The stop criteria depends on some threshold inside the algorithm.
My idea was to open several SqlDataReader, but I have two problems with that solution:
You can have only one SqlDataReader per connection and inside a CLR method I have only one connection - as far as I understand.
I don't know how to tell SqlDataReader how to read data in chunks. I could not find documentation how SqlDataReader is supposed to behave. As far as I understand, it's preparing the whole result set and would load the whole result into memory. Even if I would consume only a small part of it.
Any hint how to solve that as a CLR method? Or is there a more low level interface to SQL server which is more suitable for my problem?
Update: I should have made two points more explicit:
I'm talking about big data sets, so a query might result in 1 mio records, but my algorithm would consume only the first 100-200 ones. But as I said before: I don't know the exact number beforehand.
I'm aware that SQL might not be the best choice for that kind of algorithm. But due to other constraints it has to be a SQL server. So I'm looking for the best possible solution.