-1

edit : i strongly disagree with the closing, especially now that i found the answer. The question is very precise, the bonus questions are just bonus... The answer simply is :"you can't use List<T> without locking because it is using versioning internally and is enforcing data-consistency". (i posted the answer but i have to wait 1 day before accepting it)


Context :

  • I'm using NET Core 3.1 and C# in a WPF application.
  • I have a lot of threads using either explicitly started Task and Parallel.ForEach, all reading and potentially modifying a static List.
  • it is guaranteed that multiple different elements are modified by multiple thread at the same time BUT a single element will be only modified by single thread. (manual task do not modify the list, only Parallel.ForEach does, and i manage them carefully)
  • i'm using a very minimalist amount of lock : only when i add/remove an element
  • The method where the exception is raised is called by this code : CompositionTarget.Rendering = New EventHandler(CompositionTargetRendering)

The problem :

  1. The main UI thread is doing absolutely nothing except reading the static list in a (non-parallel) foreach loop to visualise the list without locking it.
  2. I have an exception System.InvalidOperationException : Collection was modified; enumeration operation may not execute; saying that the Collection was modified while the main WPF thread was looping over it to visualise it. (which is true, i am doing it)
  3. Even if i lock the list while add/removing element (which make sense and it's ok since it doesn't happen too often), it still crash with an exception.
  4. It require to add a lock on the main UI thread, but then my list would be more or less permanently locked by the UI thread and scalability will drop to nil.

Question : How to constantly read(-only) this list without raising an exception and without locking it ?

My temporary workaround : It is ok if the visualization isn't "perfectly accurate", that why i can afford to modify it while visualizing it.

So i'm visualizing a copy using List<T>.ToArray. It works for now, but as the objects stored in the list will grow (a lot) in the future, memory usage will be a problem (many gigabyte). I can't afford to double the memory usage just for this.


Bonus question :

  • i honestly don't understand why it only raise an exception here. I'm reading and writing the list all over the place without any lock and without problem, except in the WPF thread ?

  • And, from the workaround, isn't creating a copy an access to this list anyway ? Why doesn't it raise an exception every single time i read it concurrently ? Is it a WPF thing ?

  • If the EventHandler is the problem, and you know another way to manually refresh my XAML <Image>, i'm all ears. It's the only way i know.

ker2x
  • 440
  • 3
  • 11
  • The bonus questions are "broad". But it's just bonus question. i tried to have my main question as contextualized and precise as possible. Feel free to ignore them. – ker2x May 20 '20 at 21:08
  • [ConcurrentBag](https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentbag-1?view=netcore-3.1)? – Guru Stron May 20 '20 at 21:08
  • @GuruStron That's just having another class do the locking for you. The locking is still happening. – Servy May 20 '20 at 21:11
  • @Servy I thought that there was also some high level lock-free magic there =) – Guru Stron May 20 '20 at 21:12
  • ConcurrentBag will just hide the lock from me, doesn't it ? It would make the scalability problem even worse as it will lock even when i don't need it. Or am i misunderstanding it ? – ker2x May 20 '20 at 21:13
  • @GuruStron It minimizes the synchronization that happens in certain cases, that doesn't mean there isn't *any* synchronization (a `List` is what you get when there isn't any synchronization at all). – Servy May 20 '20 at 21:14
  • @ker2x It will lock in some cases, obviously, but not sure that you can get out without synchronization in your scenario. – Guru Stron May 20 '20 at 21:15
  • @Servy, yes, I understand that. But the situation seems like that synchronization will be needed. – Guru Stron May 20 '20 at 21:16
  • @GuruStron *Some* type of synchronization is *probably* needed, but it's hard to say how much is or is not needed with so little information about the specifics of the situation. – Servy May 20 '20 at 21:19
  • @Servy agreed. But that's why it was suggestion in comments, not an answer. OP can try it and test performance. – Guru Stron May 20 '20 at 21:21
  • 1
    @ker2x Also you can look at [ReaderWriterLockSlim](https://learn.microsoft.com/en-us/dotnet/api/system.threading.readerwriterlockslim?view=netcore-3.1) ( if you have not already), with UI obviously being a one of readers, and adders/removers - writers. Again, there would be locking, but you need to test if performance is suitable. – Guru Stron May 20 '20 at 21:25
  • @GuruStron That wouldn't change anything here. The advantage of a read/write lock is that multiple readers can hold the lock at the same time, while only writes are exclusive. But here there is only ever exactly one reader, and lots of writers, so there would never be any shared locks, so you'd end up with just a poorer performing regular lock. – Servy May 20 '20 at 21:27
  • 1
    Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/214289/discussion-between-guru-stron-and-servy). – Guru Stron May 20 '20 at 21:27
  • 1
    @Servy : that's a dilemna for me. Every time i wrote very large topic it got downvoted or locked because the quesiton was lost in all the information. So i tried to be concise. If i was doing it in C/C++ i know for sure that the one and only lock that would be needed would be when removing an element (to avoid looping out of bound). Even adding wouldn't be a problem, it would just mean that the element would be eventually ignored on the cycle it was created. Which is fine. It would also cause visualization inaccuracy which is also fine. – ker2x May 20 '20 at 21:33
  • 1
    @ker2x Stating that you don't care about inaccurate or inconsistent data is rather relevant here. Generally having something produce stale our out of date data, or an inconsistent view of a collection (i.e. a state it was never actually in at any point in time) is a problem, and solving it complicates things a lot. You should *certainly* mention that in the question in an edit. – Servy May 20 '20 at 21:38
  • The moere I look at it, the more I notice the issues with this question: Way to wide problem description. No code. And propably a XY Problem with wanting to solve the "GUI takes to long to update" (X) with Inmutable Lists (Y). – Christopher May 20 '20 at 21:46
  • Nope, sorry, the UI is very fast andits speed is irrelevant. it would be ok even if it was (very) slow. – ker2x May 20 '20 at 21:53
  • @GuruStron : thank you for the ReaderWriterLockSlim suggestion (i have finally read the documentation about it). It will be very useful :) – ker2x May 21 '20 at 11:40

3 Answers3

0

So, i checked the Net Core source code : all Lists have a "version".

If the version change while its being enumerated (and it can change very easily) : it throw an exception.

it's an hardcoded behavior in Collection.Generic.List. Using List will never work unless i use extremely unsafe & ugly code.

if (_version != _sortedList.version) 
    throw new InvalidOperationException(SR.InvalidOperation_EnumFailedVersion);

.NET Framwork have a similar code :

if (version != _version && BinaryCompatibility.TargetsAtLeast_Desktop_V4_5)
     ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumFailedVersion);

So the only reason it crashed only in the UI thread is because it's read so often. I was just being mislead by this behavior but it's a global problem.

ker2x
  • 440
  • 3
  • 11
-2

I have an exception System.InvalidOperationException : Collection was modified; enumeration operation may not execute;

You are using a foreach loop. foreach only work on Enumerators, as that is what they were added to support. Every Collection can be turned into a enumerator, but there is one important limitation:

If the collection changes, all Enumerators should become invalid.

The solution is to stop using a foreach. However, Lists are also inherently not useable for concurrent access. None of the indexed Collections is. You should be changing to something like a ConcurrentDictionary. But even that might not be enough.

My biggest confusion is that you are not just locking access to the collection from all endpoints, including the GUI. The Code should at tops run 30 times per second. And if that much updating disrupts operation, it needs to be even less then that.

You may be able to get some better performance out of the GUI, if you avoid doing string-concatenation on the GUI Elements .Text Property. Working directly on the GUI classes has considerable overhead. I wrote a example to demonstrate that:

using System;
using System.Windows.Forms;

namespace UIWriteOverhead
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        int[] getNumbers(int upperLimit)
        {
            int[] ReturnValue = new int[upperLimit];

            for (int i = 0; i < ReturnValue.Length; i++)
                ReturnValue[i] = i;

            return ReturnValue;
        }

        void printWithBuffer(int[] Values)
        {
            textBox1.Text = "";
            string buffer = "";

            foreach (int Number in Values)
                buffer += Number.ToString() + Environment.NewLine;
            textBox1.Text = buffer;
        }

        void printDirectly(int[] Values){
            textBox1.Text = "";

            foreach (int Number in Values)
                textBox1.Text += Number.ToString() + Environment.NewLine;
        }

        private void btnPrintBuffer_Click(object sender, EventArgs e)
        {
            MessageBox.Show("Generating Numbers");
            int[] temp = getNumbers(10000);
            MessageBox.Show("Printing with buffer");
            printWithBuffer(temp);
            MessageBox.Show("Printing done");
        }

        private void btnPrintDirect_Click(object sender, EventArgs e)
        {
            MessageBox.Show("Generating Numbers");
            int[] temp = getNumbers(1000);
            MessageBox.Show("Printing directly");
            printDirectly(temp);
            MessageBox.Show("Printing done");
        }
    }
}

But at the end, you will have to drop the GUI Updates until it stops being an issue for the process.

Christopher
  • 9,634
  • 2
  • 17
  • 31
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/214298/discussion-on-answer-by-christopher-read-only-acces-to-static-listt-in-massive). – Samuel Liew May 21 '20 at 02:32
-2

Use only immutable lists

Do not update the existing list. Instead, create a new list with the revisions you need, then update the reference to point to the new list.

List<Foo> _masterList = new List<Foo>; //This is the list your UI thread will access


Task WorkerThread()
{
    lock (lockObject)
    {
        //This is how you add to the list without mutating it
        _masterList = _masterList.Concat( newItem ).ToList();

        //Here's how you remove an item with Id = x without mutating it
        _masterList = _masterList.Where( item => item.Id != x ).ToList(); 
    }
}

In no case are you mutating a list, so the exception "collection was modified" will never occur.

You still need to lock when adding or remove (otherwise some adds or removes may get lost) but you no longer need to lock anything to access the list for read, since the reference itself is atomic and the thing it points to never changes.

John Wu
  • 50,556
  • 8
  • 44
  • 80
  • 1
    If you're going to use immutable lists, then use an actual immutable data structure, rather than a `List` that you re-create from scratch entirely every time you make a change. Worse still, in your case you'd need to lock around every operation that re-creates the list from scratch, which would just *kill* the scalability here. It'll be worse than just locking around mutations to a mutable data structure. And if you didn't do that, you'd drop changes on the floor. – Servy May 20 '20 at 21:21
  • 2
    This will introduce another problem, which OP stated needed to be avoided - "but as the objects stored in the list will grow (a lot) in the future, memory usage will be a problem (many gigabyte). I can't afford to double the memory usage just for this." – Guru Stron May 20 '20 at 21:22
  • Yep, creating one copy is already a problem, so creating multiple copy (from multiple thread) would be even worse. – ker2x May 20 '20 at 21:22
  • I understand your objections and can't say I disagree with all of them. But I would like to clarify a couple things. (1) The size of the objects has absolutely nothing to do with the size of the list, and (2) ToArray() will actually be less efficient than ToList() (see [this answer](https://stackoverflow.com/a/16323412/2791540).) – John Wu May 20 '20 at 21:49
  • I just took the first thing that came to mind as it is supposed to be a temporary solution. i'll switch to ToList. The size of the object have nothing to do with the list size, true. but it have something to do with the amount of memory it use, and i'm memory constrained here. I'd like to avoid copying my data if able. – ker2x May 20 '20 at 21:52
  • @JohnWu The fact that `ToArray` is sometimes slower than `ToList` doesn't change the fact that your code does *even more copying of data* because you create way more new lists than the OP does, and, more importantly, your solution would require holding onto a lock while doing that even more expensive copying, inhibiting performance *drastically* as you end up with lots of thread waiting instead of doing productive work. – Servy May 20 '20 at 21:53
  • @Servy (1) `ToArray()` is not only slower but it requires more memory allocations. (2) If you review the [reference source for `List`](https://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs,3d46113cc199059a) you'll see an awful lot of `Array.Copy` calls happening under the hood. These don't happen when you create a list from scratch because there is no need to move things around. I wouldn't necessarily conclude my approach results in a lot more copying (3) The O/S won't schedule a thread that is waiting on a lock, so productive work continues on other threads.. – John Wu May 20 '20 at 22:01
  • @JohnWu The problem is your solution is turning every single call to List.Add or List.Remove into reconstructing a new list from the old list, but with one change. Every single change to the list in the OP's code *doesn't require re-creating the list from scratch*, but your solution does. And since that copy is *inside a lock*, that's a big problem. Now sure, if the OP wants to continue using their "copy it to another structure and then move on" they should use `ToList` and not `ToArray`, because it's slightly better, but your changes offeset that *and a lot more* in the other direction. – Servy May 20 '20 at 22:03
  • 1
    Why not ImmutableList then? – pinkfloydx33 May 20 '20 at 22:04
  • @pinkfloydx33 That is a great point. `ImmutableList` operates more or less the same way as I am describing and is available on .NET Core 3.1, so that is actually a better solution.. – John Wu May 20 '20 at 22:14
  • 2
    @JohnWu It doesn't operate like this at all. The whole point of having the immutable collections in .NET is so that they can avoid copying all of the data any time you make any change. – Servy May 20 '20 at 22:18