1

I'm trying to populate a list with a million independently generated items. But after the loop I'm getting less items: 999960, 998534 etc.

As a bypass I've wrapped this code in another that checks generated amount and generates missing items.

I've also tried to sleep after the loop, but it doesn't give a result closer to desired.

var someList = new List<ListItem>();
Parallel.For(0, 1000000, _ =>
{
    var item = new ListItem();

    //some logic here

    someList.Add(item);
});

System.Console.WriteLine(someList.Count()); // returns less than 1000000

What is the reason of such behaviour and what is the proper way to solve this task?

Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
Egor Shoba
  • 95
  • 1
  • 8
  • 3
    `List` is *not* thread-safe. You can try some collection from `System.Collections.Concurrent` or use *Parallel Linq*, e.g. `var someList = Enumerable.Range(1, 1000000).AsParallel().Select(_ => new ListItem()).ToList();` – Dmitry Bychenko Dec 05 '22 at 09:00
  • The term you are looking for is `race condition` and is solved with synchronization, e.g. locking. – CSharpie Dec 05 '22 at 09:00
  • 1
    Use system.collections.concurrent for thread safe collections, see also https://stackoverflow.com/questions/5605422/is-this-use-of-parallel-foreach-thread-safe – Eterm Dec 05 '22 at 09:01
  • Does this answer your question? [Is this use of Parallel.ForEach() thread safe?](https://stackoverflow.com/questions/5605422/is-this-use-of-parallel-foreach-thread-safe) – Eterm Dec 05 '22 at 09:01
  • 2
    What are you trying to do? What does `//some logic here` contain? That's the most important part. `List` isn't thread-safe and `Parallel.For` isn't meant for asynchronous code, it's meant for data parallelism, ie processing large amounts of in-memory data using all available cores. Adding 1M list items one by one is *very* inefficient and slow, and ends up allocating twice the RAM due to temporary buffers. – Panagiotis Kanavos Dec 05 '22 at 09:05
  • Without information about `//some logic here`, I will advice droping any kind of parrallelism for now. Or you may be shooting your self in the foot, down the road, by a factor of x2, x4, or even a deadlock, thread starvation. And will have to debug a code without the ability to step-in. A code that run way faster, but sometimes doesn't come back. Or come back without finishing. – Drag and Drop Dec 05 '22 at 09:41
  • @PanagiotisKanavos *"`Parallel.For` isn't meant for asynchronous code"* -- AFAICS the OP is not using the `Parallel.For` with asynchronous lambda. – Theodor Zoulias Dec 05 '22 at 11:48
  • 1
    @TheodorZoulias check the tags and question ? – Panagiotis Kanavos Dec 05 '22 at 11:50
  • @PanagiotisKanavos I removed the `asynchronous` tag. It doesn't seem relevant with the title and body of the question. – Theodor Zoulias Dec 06 '22 at 02:31

2 Answers2

4

If you know how many items you will generate, just use an array:

var someList = new ListItem[1000000];
Parallel.For(0, someList.Length, i =>
{
    var item = new ListItem();

    //some logic here

    someList[i] = item;
});
JonasH
  • 28,608
  • 2
  • 10
  • 23
  • +1. Alternatively you could use the `Parallel.ForEach` with the `Partitioner.Create` as source, as shown [here](https://stackoverflow.com/questions/70839214/adding-two-arrays-in-parallel/70840104#70840104). It should be slightly more efficient. – Theodor Zoulias Dec 05 '22 at 13:38
2

The List<T> is not thread-safe. You can use a ConcurrentBag<T>, although that is an ICollection<T>, not an IList<T>. If you need the latter then you will have to synchronise the multiple threads, probably using a lock statement.

jmcilhinney
  • 50,448
  • 5
  • 26
  • 46
  • 1
    @EgorShoba you haven't described the problem yet. These are workaround and `lock` is actually a hack when talking about parallelism. Whatever the actual problem is there are better options. Eg if you care about order, `ConcurrentQueue` is a better option. `ConcurrentBag` is a specialized type that uses thread-local storage, offering fast local access but slower cross-thread access. For actual async operations, `Parallel.ForEachAsync` is the correct option, not `Parallel.ForEach` – Panagiotis Kanavos Dec 05 '22 at 11:55
  • I didn't downvote and don't think this should be downvoted even if it's not the best answer. – Panagiotis Kanavos Dec 05 '22 at 11:56
  • @EgorShoba unless you explain what the problem is, people will downvote this answer because it's not clear why it answers the question. – Panagiotis Kanavos Dec 05 '22 at 11:57