3

I just finished up my most complex and feature-laden WinForms application to date. It loads a list of any number of HTML files, then loads the content of one, uses some RegEx to match some tags and remove or replace them (yes, yes, I've seen this. It works just fine, thanks Cthulhu), then writes it to disk.

However, I noticed that ~200 files take roughly 30 seconds to process, and after the first 5-10 seconds the program is reported as "Not Responding". I'm assuming it's not wise to do something like this guy did, as the hard drive is a bottleneck.

Perhaps it'd be possible to load as many as possible into memory, then process each one with a thread, write those, then load some more into memory?

At the very least, would creating a worker thread separate from the UI thread prevent the "Not Responding" issue? (This MSDN article covers what I was considering.)

I guess I'm asking if multithreading will offer any sort of speed improvement, and if so, what would be the best way of going about it?

Any help or advice is much appreciated!

Omega192

5 Answers

3

Yes, you should start by using a BackgroundWorker to decouple your work from the GUI. Handling a GUI event should never take too much time. Aim for 20ms, not 20s.
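A minimal sketch of that decoupling, under a few assumptions: `HtmlCleaner`, `FileCleaner`, and the `<font>` regex are hypothetical stand-ins for the question's clean-up logic, and overwriting each file in place is a guess at where the output goes. Only the `BackgroundWorker` members are the real API.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the question's tag clean-up.
public static class HtmlCleaner
{
    // Illustrative only: strips <font ...> and </font> tags.
    public static string Clean(string html)
    {
        return Regex.Replace(html, @"</?font\b[^>]*>", "", RegexOptions.IgnoreCase);
    }
}

// Hypothetical wrapper; in a real form the worker would usually be a field on the Form.
public sealed class FileCleaner
{
    private readonly BackgroundWorker worker = new BackgroundWorker();

    public FileCleaner(Action<int> onProgress, Action<Exception> onDone)
    {
        worker.WorkerReportsProgress = true;

        // DoWork runs on a thread-pool thread, so the UI thread stays free.
        worker.DoWork += (s, e) =>
        {
            var files = (string[])e.Argument;
            for (int i = 0; i < files.Length; i++)
            {
                string html = File.ReadAllText(files[i]);
                // Assumption: the cleaned file replaces the original.
                File.WriteAllText(files[i], HtmlCleaner.Clean(html));
                worker.ReportProgress((i + 1) * 100 / files.Length);
            }
        };

        // When RunWorkerAsync is called from a WinForms event handler, these two
        // events are raised back on the UI thread, so touching controls is safe here.
        worker.ProgressChanged += (s, e) => onProgress(e.ProgressPercentage);
        // e.Error is non-null if DoWork threw an exception.
        worker.RunWorkerCompleted += (s, e) => onDone(e.Error);
    }

    public void Start(string[] files)
    {
        worker.RunWorkerAsync(files);
    }
}
```

From a button click handler this would be used roughly as `new FileCleaner(p => progressBar.Value = p, err => { /* show result */ }).Start(paths);`, where `progressBar` and `paths` are again placeholders.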

Then as a bonus you could see if the processing (CPU intensive part) can be split into independent jobs and execute them as TPL Tasks.

There is insufficient information to say if or how you should do that.
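If the per-file processing does turn out to be independent, the TPL part could look something like the sketch below. `ParallelCleanup` and `Transform` are made-up names, and the string replacement is only a placeholder for the real regex work; it assumes the file contents are already in memory.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelCleanup
{
    // Hypothetical stand-in for the CPU-bound clean-up of one document.
    public static string Transform(string html)
    {
        return html.Replace("<b>", "<strong>").Replace("</b>", "</strong>");
    }

    // Starts one Task per document and returns the results in input order.
    public static string[] TransformAll(string[] documents)
    {
        Task<string>[] tasks = documents
            .Select(doc => Task.Factory.StartNew(() => Transform(doc)))
            .ToArray();

        // Blocks the calling thread until every task has finished,
        // so call this from a worker thread, not from the UI thread.
        Task.WaitAll(tasks);

        return tasks.Select(t => t.Result).ToArray();
    }
}
```

The scheduler decides how many of these tasks actually run at once, so creating one task per file does not mean one thread per file.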

H H
  • I did read of that 20ms rule in the MSDN article. This is the first time I've ever written an application that takes this long, so the concept of a Background worker is entirely new to me. I'll look into TPL, that does indeed sound great. Thanks! – Omega192 Jun 08 '11 at 14:43
  • Take a look at [this page](http://msdn.microsoft.com/en-us/library/system.componentmodel.backgroundworker.aspx), especially the if/else if/... in the Completed event. – H H Jun 08 '11 at 18:07
2

Moving the work into threads, tasks, etc. will, in most cases, prevent the primary (main) thread from becoming non-responsive. Do not create multiple threads for disk IO (obviously). I would dedicate a single worker thread to taking your files off a queue and processing the disk IO. Beyond that, 1 or 2 worker threads for the in-memory processing should be sufficient, and your main thread can remain responsive.
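One way to sketch that queue-based layout is with `BlockingCollection<T>` from .NET 4. Note this is a variation, not exactly the single IO thread described above: it uses one thread for reading and one for writing, with the in-memory workers in between. `FilePipeline`, the queue capacity of 16, and overwriting files in place are all assumptions, and error handling and cancellation are left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class FilePipeline
{
    // paths: files to process; transform: the in-memory clean-up; workerCount: e.g. 1 or 2.
    public static void Run(IEnumerable<string> paths, Func<string, string> transform, int workerCount)
    {
        // Bounded queues so the reader cannot run far ahead and fill memory.
        var loaded = new BlockingCollection<KeyValuePair<string, string>>(16);
        var processed = new BlockingCollection<KeyValuePair<string, string>>(16);

        // Single reader: the only thread that reads from disk.
        Task reader = Task.Factory.StartNew(() =>
        {
            try
            {
                foreach (string path in paths)
                    loaded.Add(new KeyValuePair<string, string>(path, File.ReadAllText(path)));
            }
            finally
            {
                loaded.CompleteAdding(); // lets the workers' loops end
            }
        }, TaskCreationOptions.LongRunning);

        // In-memory workers: take loaded text, transform it, pass it on.
        var workers = new Task[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            workers[i] = Task.Factory.StartNew(() =>
            {
                foreach (var item in loaded.GetConsumingEnumerable())
                    processed.Add(new KeyValuePair<string, string>(item.Key, transform(item.Value)));
            }, TaskCreationOptions.LongRunning);
        }

        // Single writer: the only thread that writes to disk.
        Task writer = Task.Factory.StartNew(() =>
        {
            foreach (var item in processed.GetConsumingEnumerable())
                File.WriteAllText(item.Key, item.Value); // assumption: overwrite in place
        }, TaskCreationOptions.LongRunning);

        Task.WaitAll(workers);
        processed.CompleteAdding(); // no more results are coming
        Task.WaitAll(reader, writer);
    }
}
```

`Run` blocks until everything is written, so it belongs on a background thread rather than in a UI event handler.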

IAbstract
2

First of all, if you want the program to remain responsive, move the calculations to a separate thread (remove them from the UI thread).

The actual performance improvement depends on the number of processors you have, not the number of threads.

So if you have P processors, you can divide the work into P work items and get some performance improvement. (Amdahl's Law)
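For reference, Amdahl's Law in its usual form, where f is the fraction of the run time that can be parallelized and P is the number of processors. The values f = 0.9 and P = 4 in the worked line are made up for illustration, not measured from the question's program.

```latex
S(P) = \frac{1}{(1 - f) + \dfrac{f}{P}}
\qquad\text{e.g. } f = 0.9,\; P = 4:\quad
S(4) = \frac{1}{0.1 + 0.225} \approx 3.08
\qquad\text{and}\qquad
\lim_{P \to \infty} S(P) = \frac{1}{1 - f} = 10
```

If the disk really is the bottleneck, as the question suspects, the serial part (1 - f) is large and extra threads will buy correspondingly little.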

You can use BackgroundWorker to divide the work properly: C# BackgroundWorker Tutorial

Yochai Timmer
0

Why not use File.ReadAllLines() to read each file into an array, and then process each element of the array?

MGZero
  • I'm currently using StreamReader.ReadToEnd() to read the file contents into a single string. Are you supposing that I should create a few threads and have each one work on a single element of aforementioned array instead? How would I synchronize the writing of the corrected elements to disk? – Omega192 Jun 08 '11 at 14:39
-1

If you do all your processing in the GUI thread, your application will show as 'not responding' if the work takes very long. In my opinion, you should try to never do (extensive) processing in the same thread as your GUI. In addition, you could even just create a thread for each file to be processed. This will most likely speed things up, as long as the separate threads do not need any data from each other.

Vincent Koeman