I need to perform a large number of inserts into my database. I could implement the code using multithreading, with a throttled scheduler that limits the number of concurrent operations. Every M rows form a block that is inserted into the database as a single atomic operation. Multiple operations must run concurrently because the database is slower than reading and parsing the data file. I often implement this model using multithreading.
If instead I decide to implement my code using async/await (Entity Framework supports asynchronous programming), how can I make sure that no more than N tasks execute (i.e. go to the database) at the same time?
In my initial design, I instantiated a `List<Task>`, added a new task as soon as I had read a block of data to be inserted atomically, and had my method return after `await`ing all of the tasks. The problem with this design is that the number of concurrent `Task`s (and thus the memory footprint) explodes, because with big data files tasks are created faster than they complete.
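To make the problem concrete, this is roughly what my current (unbounded) design looks like. `ReadBlocks` and `InsertBlockAsync` are placeholder names for my file-parsing loop and the EF call that saves one block:

```csharp
// Unbounded design: every block spawns a task immediately.
var tasks = new List<Task>();
foreach (var block in ReadBlocks(path, blockSize: M))
{
    // Nothing limits how many inserts are in flight at once,
    // so pending tasks (and their data) pile up in memory.
    tasks.Add(InsertBlockAsync(block));
}
await Task.WhenAll(tasks);
```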
I was thinking about using a `SemaphoreSlim`, but I have little experience with asynchronous programming (unlike multithreading), so I am asking this question to get feedback on best practices, if there are any.
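For reference, this is the kind of throttling I have in mind; it is only a sketch, with `ReadBlocks` and `InsertBlockAsync` again standing in for my own code, and N being the concurrency limit:

```csharp
// Sketch: use SemaphoreSlim to cap the number of in-flight inserts at N.
var throttle = new SemaphoreSlim(N);
var tasks = new List<Task>();
foreach (var block in ReadBlocks(path, blockSize: M))
{
    // Asynchronously wait until fewer than N inserts are in flight,
    // which also pauses reading the file (back-pressure on the producer).
    await throttle.WaitAsync();
    tasks.Add(InsertAndReleaseAsync(block));
}
await Task.WhenAll(tasks);

async Task InsertAndReleaseAsync(Block block)
{
    try
    {
        await InsertBlockAsync(block);
    }
    finally
    {
        throttle.Release(); // free a slot even if the insert fails
    }
}
```

Is this the idiomatic way to do it, or is there a better-suited primitive?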