I am working with threads and have a huge amount of data in my database, which I access through Entity Framework. When I run a select against the database, I need to process the data as soon as it arrives, and I have to repeat this until the data is exhausted. As each piece of data comes in (like with reader.ReadLine()), I want to hand it to worker threads while the rest keeps arriving.

For example, say there are 1 million rows in the database. It takes 10 seconds to fetch all of them and 10 seconds to process them. If I process the data as it arrives, I think the total will be less than 20 seconds, so I can save time.

How can I do this?

Cycerman
  • 1M rows isn't a lot of data. But reading 1M rows and trying to process it on the client suggests some really, really inefficient coding. Apart from that, you haven't explained what the problem is or posted any code. There's no `ReadLine` in a DbDataReader – Panagiotis Kanavos Jun 04 '21 at 13:56
  • What are you trying to do? Import a file into the database? Extract query results to a file? Process and update the table rows? EF isn't needed in *any* of these cases – Panagiotis Kanavos Jun 04 '21 at 13:59
  • Are you able to express the data that are retrieved from the database as an `IEnumerable`? If yes, then the PLINQ library is a low-entry solution to this problem. You can see an example [here](https://stackoverflow.com/questions/62656615/how-to-enforce-a-sequence-of-ordered-execution-in-parallel-for/62662369#62662369). For more flexibility and power you can look at the [TPL Dataflow](https://docs.microsoft.com/en-us/dotnet/standard/parallel-programming/dataflow-task-parallel-library) library, but you'll have to study it for a day or two before being able to be productive with it. – Theodor Zoulias Jun 04 '21 at 14:08
  • Actually, 1M rows is just an example; what I'm trying to do is producer/consumer. Let me explain: worker threads wait to pull records from a queue and process them. I want to send records to the queue as they are fetched from the database, so that the workers do not sit idle during the select. For example, if the total fetch time for 1M records is 1 minute, then as I fetch row by row I want to enqueue each batch as soon as 1000 records have arrived, so that records reach the queue before the select completes. – Cycerman Jun 04 '21 at 15:02
  • The TPL Dataflow is the ultimate tool for implementing the producer/consumer pattern in .NET. [Here](https://stackoverflow.com/questions/62602684/c-sharp-process-files-concurrently-and-asynchronously/62613098#62613098) is an example of using this library, and [here](https://stackoverflow.com/questions/60929044/c-sharp-parallel-foreach-memory-usage-keeps-growing/60930992#60930992) is another one. You could try to use this library to solve your problem, and if you get stuck then you can post the code of your incomplete attempt, and ask for specific help. – Theodor Zoulias Jun 04 '21 at 18:22
  • Do you have real timings for the loading and processing of your data? You say 10 seconds and 10 seconds respectively, but they sound suspiciously made up. – Enigmativity Jun 05 '21 at 07:07
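
The producer/consumer setup described in the comments above (enqueue a batch every 1000 records while worker threads drain the queue) could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation, and it uses `BlockingCollection<T>` from the base class library rather than the TPL Dataflow or PLINQ approaches the comments recommend. `ReadRows`, `ProcessRow`, and `RunPipeline` are hypothetical names: `ReadRows` stands in for a query that is enumerated lazily (assumed to stream rows rather than materialize them with `ToList()`), and `ProcessRow` stands in for the real per-record work.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a streaming database query: rows are yielded
// one at a time instead of being loaded into a list first.
static IEnumerable<int> ReadRows(int count)
{
    for (int i = 1; i <= count; i++)
        yield return i;
}

// Hypothetical per-row work; replace with the real processing.
static long ProcessRow(int row) => (long)row * 2;

// One producer groups rows into batches and enqueues them while
// workerCount consumers process batches concurrently.
static long RunPipeline(IEnumerable<int> rows, int batchSize, int workerCount)
{
    // A bounded queue makes the producer wait when the consumers fall
    // behind, which keeps memory use limited.
    using var queue = new BlockingCollection<List<int>>(boundedCapacity: 10);
    long total = 0;

    var consumers = Enumerable.Range(0, workerCount)
        .Select(_ => Task.Run(() =>
        {
            long local = 0;
            // Blocks until a batch is available; the loop ends once
            // CompleteAdding has been called and the queue is empty.
            foreach (var batch in queue.GetConsumingEnumerable())
                foreach (var row in batch)
                    local += ProcessRow(row);
            Interlocked.Add(ref total, local);
        }))
        .ToArray();

    try
    {
        var current = new List<int>(batchSize);
        foreach (var row in rows)
        {
            current.Add(row);
            if (current.Count == batchSize)
            {
                queue.Add(current);               // hand off a full batch
                current = new List<int>(batchSize);
            }
        }
        if (current.Count > 0)
            queue.Add(current);                   // final partial batch
    }
    finally
    {
        queue.CompleteAdding();                   // tell consumers no more data is coming
    }

    Task.WaitAll(consumers);
    return total;
}
```

With these stand-ins, `RunPipeline(ReadRows(2500), batchSize: 1000, workerCount: 4)` enqueues batches of 1000, 1000, and 500 rows and returns 6252500. Whether this overlap actually saves time depends on real measurements of fetch and processing cost, as the last comment points out.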

0 Answers