
I have some sample code.

        // currentDirectory is assumed to hold the *.txt file paths, e.g. from Directory.GetFiles
        foreach (var currentFile in currentDirectory)
        {
            // "MYNICKNAME % A LOT OF DATA" splits into contents[0] (name) and contents[1] (data)
            string[] contents = File.ReadAllText(currentFile).Split('%');
            using (XmlWriter writer = XmlWriter.Create(currentFile.Replace(".txt", ".xml")))
            {
                writer.WriteStartDocument();
                writer.WriteStartElement("INFO");
                writer.WriteStartElement("INFO");
                writer.WriteElementString("USER", contents[0]);
                writer.WriteElementString("USERDATA", contents[1]);
                writer.WriteEndElement();
                writer.WriteEndElement();
                writer.WriteEndDocument();
            }
        }

I have text files with content like this: MYNICKNAME % A LOT OF DATA. Now I am trying to make this program multithreaded. Which direction should I dig in? async/await? I am really new to multithreading.

  • We might need a little bit more info on what you are trying to do. Do you just need to make a single pass over all of the files in the directory and turn each one from text into xml? One thing to take into consideration is whether or not you are the ONLY one accessing these files. – CM0491 Mar 24 '16 at 17:43
  • Parallelism will only benefit the CPU-intensive portion of your process. It looks like a lot of your process is I/O which will not benefit from parallelism. So processing 3 files in the same time it takes to process 1 is not a realistic goal. – D Stanley Mar 24 '16 at 17:46
  • Yes, I pass a directory with file(s) and then process them, making XML files. So each text file in the directory will have an XML. I'm trying to make it 2-3 times faster by multithreading –  Mar 24 '16 at 17:46
  • In this case, being disk I/O bound, having parallel tasks might even make things worse. – Luc Morin Mar 24 '16 at 17:46
  • And `async/await` will allow you to "fire and forget", so you don't have to wait for everything to complete before moving on, but it will not make the process faster overall (see the sketch after these comments). – D Stanley Mar 24 '16 at 17:50
  • I think I agree with Luc Morin. This likely will be I/O bound and separating each file-write-unit-of-work into separate tasks is just going to have a sequence of tasks that are all waiting for the I/O bus. However, (this may be overkill for your purposes), you could get a significant speedup if you spread the work over multiple compute nodes on separate machines if you have access to HPC. – CM0491 Mar 24 '16 at 17:51
  • Is there no CPU-bound work? –  Mar 24 '16 at 18:06
  • There is, but given how pathetic the IO performance of hard discs is - particularly when multiple parallel operations happen - compared to the CPU performance, you need a LOT of processing to make the CPU an issue. Most people totally ignore how slow IO really is compared to CPU performance. Unless you run a RAID 10 of SSDs, obviously. – TomTom Mar 24 '16 at 18:28
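
For reference, here is a minimal sketch of the async/await route mentioned in the comments above, assuming .NET 4.5+; `ConvertAllAsync`, `ConvertOneAsync`, `directoryPath`, and the `Directory.GetFiles` filter are illustrative assumptions, not part of the original code. The awaits let the reads overlap, but as the comments point out, the disk stays the bottleneck, so this mainly keeps threads free rather than making the whole job faster.

    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Threading.Tasks;
    using System.Xml;

    static class TxtToXmlAsync
    {
        // Start one conversion task per file, then wait for all of them to finish.
        public static Task ConvertAllAsync(string directoryPath)
        {
            IEnumerable<Task> tasks = Directory.GetFiles(directoryPath, "*.txt")
                                               .Select(ConvertOneAsync)
                                               .ToList(); // start them all eagerly
            return Task.WhenAll(tasks);
        }

        static async Task ConvertOneAsync(string currentFile)
        {
            string text;
            using (var reader = new StreamReader(currentFile))
                text = await reader.ReadToEndAsync(); // thread is released while the disk works

            string[] contents = text.Split('%');
            using (XmlWriter writer = XmlWriter.Create(currentFile.Replace(".txt", ".xml")))
            {
                writer.WriteStartDocument();
                writer.WriteStartElement("INFO");
                writer.WriteStartElement("INFO");
                writer.WriteElementString("USER", contents[0]);
                writer.WriteElementString("USERDATA", contents[1]);
                writer.WriteEndElement();
                writer.WriteEndElement();
                writer.WriteEndDocument();
            }
        }
    }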

1 Answer


You could change the `foreach` to `Parallel.ForEach`.
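
A minimal sketch of that change, keeping the body of the original loop as-is; the `Directory.GetFiles` call and `directoryPath` are assumptions about where the file list comes from:

    using System.IO;
    using System.Threading.Tasks;
    using System.Xml;

    // Each file is independent, so the iterations can safely run on different threads.
    Parallel.ForEach(Directory.GetFiles(directoryPath, "*.txt"), currentFile =>
    {
        string[] contents = File.ReadAllText(currentFile).Split('%');
        using (XmlWriter writer = XmlWriter.Create(currentFile.Replace(".txt", ".xml")))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("INFO");
            writer.WriteStartElement("INFO");
            writer.WriteElementString("USER", contents[0]);
            writer.WriteElementString("USERDATA", contents[1]);
            writer.WriteEndElement();
            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    });

`Parallel.ForEach` partitions the file list across worker threads; since each iteration touches a different file, no locking is needed.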

Sample: How to: Write a Simple Parallel.ForEach Loop

And a similar question: Read and process files in parallel C#

Chris
  • `Parallel.ForEach` doesn't make it faster on a small amount of files; it's better on a big amount of files –  Mar 24 '16 at 17:47
  • Multithreading adds overhead, so this is normal. And as `Luc Morin` has commented: you are reading files from a disk, so your threads are competing to read from the same disk. Multithreading isn't something magical that will cut the time of your work by the number of CPUs you have. There are other factors involved; see the sketch after these comments. – Chris Mar 24 '16 at 18:20
  • There is no CPU-bound work to make it faster, right? –  Mar 24 '16 at 18:21
  • I've added a link to a similar question. – Chris Mar 24 '16 at 18:26
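
Following up on the disk-contention point above, a small sketch of capping the parallelism with `ParallelOptions`; the value 2 is an arbitrary example, so measure on your own hardware:

    // MaxDegreeOfParallelism limits how many iterations run at once,
    // which can reduce contention when all threads hit a single spinning disk.
    var options = new ParallelOptions { MaxDegreeOfParallelism = 2 }; // arbitrary example value
    Parallel.ForEach(Directory.GetFiles(directoryPath, "*.txt"), options, currentFile =>
    {
        // ... same per-file conversion body as in the answer above ...
    });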