0

I have a large collection of elements. I want to call ToString for each element and build one string. My first approach was to slow

        string str = "";
        list.ForEach( g => {

            string s = g.ToString();
            if(s != "")
                str = str + g.ToString() + "\n";
        }); 

I tried using the Parallel class and PLINQ as shown below but then the order of the elements in the final string was not like in the original.

Parallel

        System.Threading.Tasks.Parallel.ForEach(list, g => {

            string s = g.ToString();
            if(s != "")
                str = str + g.ToString() + "\n";
        });

PLINQ

        string str = "";
        list.AsParallel().AsOrdered().ForAll( g => {

            string s = g.ToString();
            if(s != "")
                str = str + g.ToString() + "\n";
        });

How can I improve the performance and keep the original order? Thanks

gerstla
  • 587
  • 1
  • 5
  • 20
  • `AsOrdered()` only perserves the order with regard to the original ordering. So if the order is different after the ForEach then you need to sort it afterwards. – Zache Jan 23 '14 at 09:22
  • 3
    I don't know how to keep the order (or even if it is possible), but you should use a StringBuilder instead of concatenating strings. – krimog Jan 23 '14 at 09:24
  • changed to use Stringbuilder with no parallel and it solved the problem. thanks all – gerstla Jan 23 '14 at 09:58

1 Answers1

0

I think that trying to use parallelism is not the right solution here, using a better algorithm is.

Currently, your code is O(n2), because each concatenation creates a completely new string and so it has to copy the whole previous string into the new one.

But you can do this om O(n), if you use the mutable StringBuilder instead of the immutable string. That way, you can just append at the end of the existing StringBuilder and you don't have to copy anything.

And you can also achieve the same performance with less code using a special method just for joining strings together: string.Join().


But there might be an even better solution: use a StreamWriter. For example, if you wanted to write the resulting string to a file, that would be a better solution, because the whole string doesn't need to be in memory at all.


Parallelism isn't something that will magically solve all your performance problems. You need to think about what the threads will be doing, especially if you have some shared data (like str in your code). In your case, you could try to use parallelism, but it wouldn't be that simple and I'm not sure it would actually improve the performance.

It would work like this: each thread would get a range of indexes in the list and concatenate those. At the end, the main thread would then concatenate the results from all threads together. But there is no built-in method for that, so you would need to write all that code by yourself. (This would work, because string concatenation is associative.)

Community
  • 1
  • 1
svick
  • 236,525
  • 50
  • 385
  • 514