
I have a C# .NET Framework application that calls an unmanaged C++ DLL via DllImport to parse a set of files chosen by the user and perform some operations on them.

I would like to have a progress bar on my C# application, as this file parsing might be a long process. The best way I have thought to do this is to parse a couple of files at a time, and then return to the C# code so I can update the progress bar.

However, this requires me to allocate some memory on the heap of the C++ DLL so I don't have to pass all the 10,000+ file paths as arguments each time I have to call the C++ function again.

I am not sure if closing the DLL after I have completed my parsing is possible. Is there any other way I can accomplish this task without having to keep all of the paths in memory allocated for the entirety of the time that the application is running?

Remy Lebeau
brickbobed
  • I'm so confused by this description. Please show your actual code, and explain what you are having trouble with in it. Before you added the progress bar, how were you tracking the files needing to be processed? Why would that change at all just because you added a progress bar? As for closing the C++ DLL, see [Unload a DLL loaded using DllImport](https://stackoverflow.com/questions/2445536/) – Remy Lebeau Jan 14 '21 at 22:24
  • Usually when P/Invoking you provide a function to allocate a resource and another to deallocate if it needs to hold a resource. Or the called code should clean up after itself before returning. – Alejandro Jan 14 '21 at 22:24
  • Where are you loading the paths from anyway? Can you not just lazy-load each one and pass it through to the DLL when ready? – Charlieface Jan 14 '21 at 22:24
  • For a progress bar, however, the best thing you can do is to provide a callback to the unmanaged code, where the managed side updates the display accordingly. – Alejandro Jan 14 '21 at 22:25
  • @brickbobed There are a few separate issues here: 1) updating the progress bar; 2) unloading the unmanaged DLL; 3) breaking the large batch into smaller pieces. Does your application need to process the entire 10k set of files at once, or are those files independent? To unload the DLL from memory, use FreeLibrary or similar. As long as your code doesn't hold references into unmanaged memory, it's usually safe. – Roman Polunin Jan 14 '21 at 22:26
  • The files are independent, and the paths are actually retrieved from a file on the hard drive. Retrieving them takes 100 ms or so though, so I don't want to do this every time I update the progress bar. I'm assuming passing the paths back and forth as arguments is just as inefficient. – brickbobed Jan 14 '21 at 22:30
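
The callback approach suggested in the comments could look roughly like this on the native side. This is only a sketch under assumptions: `ParseFiles`, `ProgressCallback`, `ParseOneFile`, and the `PARSER_API` macro are hypothetical names, not part of the asker's DLL, and the per-file work is a placeholder.

```cpp
#include <cstdint>

#if defined(_WIN32)
#define PARSER_API extern "C" __declspec(dllexport)
#else
#define PARSER_API extern "C"
#endif

// Hypothetical callback type: invoked after each file with (done, total).
// Returning 0 asks the native loop to stop early.
typedef int (*ProgressCallback)(int32_t done, int32_t total);

// Placeholder for the real per-file work; it only checks for a null path.
static bool ParseOneFile(const wchar_t* path) {
    return path != nullptr;
}

// Hypothetical export: parses every path in a single call and reports
// progress through the callback, so the list crosses the boundary once.
// Returns the number of files parsed successfully.
PARSER_API int32_t ParseFiles(const wchar_t* const* paths, int32_t count,
                              ProgressCallback onProgress) {
    int32_t parsed = 0;
    for (int32_t i = 0; i < count; ++i) {
        if (ParseOneFile(paths[i])) {
            ++parsed;
        }
        if (onProgress != nullptr && onProgress(i + 1, count) == 0) {
            break;  // the caller requested cancellation
        }
    }
    return parsed;
}
```

The managed side would pass a delegate for `onProgress` and keep it alive for the duration of the call; the delegate and the DllImport declaration need a calling convention that matches the native build.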

1 Answer


Since you've confirmed that the files are independent, you should try to parallelize the operation.

On the producer side of the process, set up a List or an array with the file names. If the list gets too big (really, REALLY big, so that it stresses available RAM), you may want to replace it with a combination of BlockingCollection and a queue, so the producer can be throttled until the number of items in the processing pipeline drops below a threshold.

On the other side of this pipeline, start a worker (inline or as a separate worker thread) that reads items from the collection, either sequentially or via .AsParallel(), and passes them to the unmanaged library for processing.

Since this is a UI application, I guess your worker should run on a separate thread. After each processed file, call .Invoke() on the UI form to update the progress bar.
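
For the unmanaged library, this design only needs an export that handles one file per call and keeps no state between calls. A minimal sketch, assuming hypothetical names (`ParseFile`, `ParseStatus`, `PARSER_API`) and with the actual parsing left as a placeholder:

```cpp
#include <cstdint>

#if defined(_WIN32)
#define PARSER_API extern "C" __declspec(dllexport)
#else
#define PARSER_API extern "C"
#endif

// Hypothetical status codes returned to the managed caller.
enum ParseStatus : int32_t {
    PARSE_OK = 0,
    PARSE_BAD_PATH = 1
};

// Hypothetical export: handles exactly one file and stores nothing between
// calls, so no path list has to stay allocated on the native heap.
PARSER_API int32_t ParseFile(const wchar_t* path) {
    if (path == nullptr || path[0] == L'\0') {
        return PARSE_BAD_PATH;
    }
    // Placeholder: the real parsing of the file at `path` would go here.
    return PARSE_OK;
}
```

The worker thread would call this once per path and update the progress bar between calls, so the list of paths lives only on the managed side.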

Roman Polunin
  • So in case somebody's wondering why this is an answer: once parallelized, each file name only has to be sent over the managed -> native transition once. – Joshua Jan 14 '21 at 22:40
  • Parallelization is not necessary though; the key is to only send one file per call from a separate worker thread, and inform the UI thread via .Invoke(). – Roman Polunin Jan 14 '21 at 22:41