
I have created a .NET application for optimizing PDF files. I have to optimize many files, so I have started a thread like this:

CheckForIllegalCrossThreadCalls = false;
thOptimize = new Thread(csCommon.pdfFilesCompressAndMove);
thOptimize.Start();

I have also found the number of processors and cores using this:

int processors = Environment.ProcessorCount;
int coreCount = 0;
foreach (var item in new System.Management.ManagementObjectSearcher("Select * from Win32_Processor").Get())
{
    coreCount += int.Parse(item["NumberOfCores"].ToString());
}

On my machine this gives 4 processors and 2 cores.

Now my problem is that I want to use the function pdfFilesCompressAndMove on all the processors, i.e. I want to optimize multiple files at the same time. In other words, I want to keep all the processors busy with optimization.

Please guide me: how is this possible?

Kandpal Lalit
  • FYI, the cost of parallelizing might outweigh the benefits. Read up on the costs of parallelizing work [here](http://stackoverflow.com/questions/6036120/parallel-foreach-slower-than-foreach) – Wim Ombelets Dec 06 '12 at 08:24
  • Start a new thread for every single file. Use multithreading concepts like the task factory or Parallel.ForEach; see the sketch below. – varun257 Dec 06 '12 at 08:24
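
A minimal sketch of the Parallel.ForEach suggestion from the comment above (requires .NET 4; the folder path and the per-file call to csCommon.pdfFilesCompressAndMove are assumptions for illustration):

using System;
using System.IO;
using System.Threading.Tasks;

static void CompressAllPdfs(string inputFolder)
{
    // Cap the parallelism at the number of logical processors so every core
    // stays busy without oversubscribing the machine.
    var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };

    Parallel.ForEach(Directory.GetFiles(inputFolder, "*.pdf"), options,
        file => csCommon.pdfFilesCompressAndMove(file));     // assumed per-file overload
}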

4 Answers


What you want is a producer/consumer queue.

What happens here is that the producer creates work-items for the consumer to process. This works well when the producer can create the work for the consumer much faster than the consumer can process it. You then have one or more consumers processing this queue of work.

Here's a producer/consumer class I use for this kind of thing:

using System;
using System.Collections.Generic;
using System.Threading;

public class ProducerConsumer<T> : IDisposable
{
    private int _consumerThreads;
    private readonly Queue<T> _queue = new Queue<T>();
    private readonly object _queueLocker = new object();
    private readonly AutoResetEvent _queueWaitHandle = new AutoResetEvent(false);
    private readonly Action<T> _consumerAction;
    private readonly log4net.ILog _log4NetLogger =
        log4net.LogManager.GetLogger(System.Reflection.MethodBase.GetCurrentMethod().DeclaringType);
    private volatile bool _isProcessing = true;

    public ProducerConsumer(Action<T> consumerAction, int consumerThreads, bool isStarted)
    {
        if (consumerAction == null)
        {
            throw new ArgumentNullException("consumerAction");
        }
        _consumerAction = consumerAction;
        _consumerThreads = consumerThreads;

        if (isStarted)
            Start();
    }

    public ProducerConsumer(Action<T> consumerAction, int consumerThreads)
        : this(consumerAction, consumerThreads, true)
    {
    }

    public void Dispose()
    {
        _isProcessing = false;
        lock (_queueLocker)
        {
            _queue.Clear();
        }
        // Wake a consumer blocked in WaitOne so it can observe the shutdown and exit;
        // each exiting consumer passes the signal on (see ConsumeItems).
        _queueWaitHandle.Set();
    }

    public void Start()
    {
        // Just in case the configured value is missing or is set to 0 -
        // we don't want the queue to build up with no consumers.
        if (_consumerThreads == 0)
            _consumerThreads = 2;

        for (var loop = 0; loop < _consumerThreads; loop++)
            ThreadPool.QueueUserWorkItem(ConsumeItems);
    }

    public void Enqueue(T item)
    {
        lock (_queueLocker)
        {
            _queue.Enqueue(item);
            // After enqueuing the item, signal the consumer thread.
            _queueWaitHandle.Set();
        }
    }

    private void ConsumeItems(object state)
    {
        while (_isProcessing)
        {
            try
            {
                var nextItem = default(T);
                bool doesItemExist;
                lock (_queueLocker)
                {
                    int queueCount = _queue.Count;
                    doesItemExist = queueCount > 0;
                    if (doesItemExist)
                    {
                        nextItem = _queue.Dequeue();
                    }
                    if (queueCount > 0 && queueCount % 50 == 0)
                        _log4NetLogger.Warn(String.Format("Queue is/has been growing. Queue size now: {0}",
                                                          queueCount));
                }
                if (doesItemExist)
                {
                    // Run the consumer action outside the lock so the other
                    // consumers can keep dequeuing while this item is processed.
                    _consumerAction(nextItem);
                }
                else
                {
                    // Queue is empty - block until a producer (or Dispose) signals.
                    _queueWaitHandle.WaitOne();
                }
            }
            catch (Exception ex)
            {
                _log4NetLogger.Error(ex);
            }
        }

        // Pass the shutdown signal on so any other consumer still waiting can exit too.
        _queueWaitHandle.Set();
    }
}

It's a generic class, so T is the type of object you're giving it to process. You also provide it with an Action<T>, which is the method that does the actual processing. This should allow you to process multiple PDF files at once in a clean way.
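
For example, a minimal usage sketch might look like this (the folder path and the assumption that csCommon.pdfFilesCompressAndMove can be called for a single file at a time are mine; adapt it to your real method signature):

using System;
using System.IO;

static void QueuePdfWork(string inputFolder)
{
    // One consumer thread per logical processor keeps all cores busy.
    var pdfQueue = new ProducerConsumer<string>(
        path => csCommon.pdfFilesCompressAndMove(path),      // assumed per-file overload
        Environment.ProcessorCount);

    // Producer side: hand every PDF in the folder to the queue.
    foreach (var file in Directory.GetFiles(inputFolder, "*.pdf"))
        pdfQueue.Enqueue(file);
}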

Faster Solutions

Check this thread: Optimal number of threads per core

If your thread method csCommon.pdfFilesCompressAndMove is very CPU-intensive (which I can guess from its name), you should start one thread per core. You are better off using ThreadPool.QueueUserWorkItem than creating threads manually; the thread pool will take care of managing the worker threads for you. In your case, as I understand it, you have 4 logical processors, so you can call ThreadPool.QueueUserWorkItem(csCommon.pdfFilesCompressAndMove) 4 times, and call it again whenever one of your work items finishes, maintaining a total of 4 running work items.
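
As a rough sketch of that pattern (the folder path, the SemaphoreSlim throttle and the per-file call to csCommon.pdfFilesCompressAndMove are my assumptions for illustration; requires .NET 4):

using System;
using System.IO;
using System.Threading;

static void CompressWithThreadPool(string inputFolder)
{
    int maxParallel = Environment.ProcessorCount;            // 4 logical processors here
    var throttle = new SemaphoreSlim(maxParallel, maxParallel);

    foreach (string file in Directory.GetFiles(inputFolder, "*.pdf"))
    {
        throttle.Wait();                                     // block until a slot is free
        string path = file;                                  // local copy for the closure
        ThreadPool.QueueUserWorkItem(_ =>
        {
            try { csCommon.pdfFilesCompressAndMove(path); }  // assumed per-file overload
            finally { throttle.Release(); }
        });
    }

    // Re-acquire every slot so the method only returns once all files are done.
    for (int i = 0; i < maxParallel; i++)
        throttle.Wait();
}

The semaphore caps the number of in-flight work items, which is what keeps roughly one busy work item per logical processor.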

Alexander Bortnik

I would use the ThreadPool because, as far as I know, it is managed by the .NET Framework and the OS in a way that creates an appropriate number of threads for the target system.

Thorsten Dittmar

I think your best bet would be to start off with something simple which will allow you to understand the performance characteristics of your problem.

List<string> items = GetListOfPdfFilesToProcess();
int numCores = 4; // matches the 4 logical processors reported above
int maxListChunkSize = (int)Math.Ceiling(items.Count / (double)numCores);
ManualResetEvent[] events = new ManualResetEvent[numCores];

for (int i = 0; i < numCores; i++)
{
    // Each chunk gets its own event so we can wait on all of them below.
    events[i] = new ManualResetEvent(false);

    ThreadPool.QueueUserWorkItem(ProcessFiles, new object[]
    {
        items.Skip(i * maxListChunkSize).Take(maxListChunkSize).ToList(), events[i]
    });
}

WaitHandle.WaitAll(events);

....

private static void ProcessFiles(object state)
{
    object[] stateArray = (object[])state;
    List<string> filePaths = (List<string>)stateArray[0];
    ManualResetEvent completeEvent = (ManualResetEvent)stateArray[1];

    for (int i = 0; i < filePaths.Count; i++)
    {
        csCommon.pdfFilesCompressAndMove(/* your parameters */);
    }

    completeEvent.Set();
}

The main thing here is to split the work up into numCores chunks. This way you should be able to make good use of all CPU cores while keeping a pretty simple programming model.

Keep in mind that this does not do any error handling - you'll want to take care of that. It might also pay to put some thought into what to do if csCommon.pdfFilesCompressAndMove fails to process a file. The simplest approach would be to log the error and inspect it later, though you could attempt to reprocess the file if you felt it would succeed the next time.
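
For instance, the loop inside ProcessFiles could be wrapped like this (the Console.Error call is just a placeholder; use whatever logging you already have):

for (int i = 0; i < filePaths.Count; i++)
{
    try
    {
        csCommon.pdfFilesCompressAndMove(/* your parameters */);
    }
    catch (Exception ex)
    {
        // Log and move on; record filePaths[i] somewhere if you want to retry it later.
        Console.Error.WriteLine("Failed to process {0}: {1}", filePaths[i], ex);
    }
}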

You'll notice that the state object is just an array; if you need to pass in lots of parameters to ProcessFiles then it may be simpler to wrap those parameters up into a single object and pass that in as the state.

Edit:

To use from your Tick event:

private void TimerTick(object sender, EventArgs e)
{
    //Disabling the timer will ensure the `TimerTick` method will not try to run
    //while we are processing the files. This covers the case where processing takes
    //longer than 2 minutes.
    timer.Enabled = false;

    //Run the first block of code in my answer.

    //Reenabling the timer will start the polling back up.
    timer.Enabled = true;
}

I would also recommend checking the number of files you have to process: if there are none, reenable the timer and return. This will avoid queuing up a bunch of operations that don't actually do anything.
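
That check could look something like this (the folder path is an assumption; use wherever your files actually live):

private void TimerTick(object sender, EventArgs e)
{
    timer.Enabled = false;

    string[] pdfFiles = System.IO.Directory.GetFiles(@"C:\PdfInbox", "*.pdf");
    if (pdfFiles.Length == 0)
    {
        // Nothing to process - start polling again and return early.
        timer.Enabled = true;
        return;
    }

    // ...run the first block of code in my answer against pdfFiles...

    timer.Enabled = true;
}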

nick_w
  • Thank you nick_w. I have implemented this code alongside my own code. I am refreshing the application to check for new files in the folder every 2 minutes using a timer. How will I call the above method in my timer_tick event? – Kandpal Lalit Dec 07 '12 at 08:25