0

If we are trying to initialize a List with capacity and default value, just like how we initialize a vector in C++ with size and default value, say I'm creating a list of Dictionary to build a graph with weight, there are generally two ways of doing that:

1) Using for loop to build the list manually:

var graph= new List<Dictionary<int, int>>();
for (var i = 0; i < n; i++)
{
    graph.Add(new Dictionary<int, int>());
}

2) Using Enumberable.Repeat() to initialize the list:

var graph= Enumerable.Repeat(new Dictionary<int, int>(), n).ToList();

Say now we have a list of node-pair(edge) with weight and want to build a graph on top of that, we would write:

foreach (var list in lists) // List = [[0, 1, 100],[1,2,100],[0,2,500]]
{
    int a = list[0], b = list[1], weight= list[2];
    if (!graph[a].ContainsKey(b))
        graph[a].Add(b, weight);
}

While both approach should theoretically work, the result are quite different. When I try to print the graph with following code:

for (var i = 0; i < graph.Count; i++)
{
    var node = graph[i];
    foreach (var pair in node)
    {
        Console.WriteLine($"{i} -> {pair.Key} with price {pair.Value}");
    }
}

The result from 1) looks like:

// Correct output
0 -> 1 with price 100
0 -> 2 with price 500
1 -> 2 with price 100

But result from 2) looks like:

// Incorrect result
0 -> 1 with price 100
0 -> 2 with price 100
1 -> 1 with price 100
1 -> 2 with price 100
2 -> 1 with price 100
2 -> 2 with price 100

The behavior of Repeat method seems very odd in this case and on the Doc, there is no mention of such behavior.

My take on that behavior is the Repeat method create ONLY ONE value in the memory and point all list entry to the same memory location. In this case there is only one dictionary created in the memory instead of a new dictionary created for each list entry. And all operations on the dictionary actually happens at the same location. But this won't explain the weird output in the second result.

Any thought? Is it a bug in this Enumerable.Repeat() method or I'm doing it wrong? What is the better way to initialize a list with default value?

Aven
  • 501
  • 1
  • 5
  • 9
  • I don't think it does. – Aven Jan 26 '20 at 19:58
  • `Dictionary` is reference type, you repeat the value, that stored in the same location in memory. [sources](https://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,89605215aab7080f) shows, that `RepeatIterator` returns the same value in a loop, without copying – Pavel Anikhouski Jan 26 '20 at 19:59
  • Also relevant: https://stackoverflow.com/questions/4347902/when-is-a-c-sharp-value-object-copied-and-when-is-its-reference-copied – Progman Jan 26 '20 at 19:59
  • @PavelAnikhouski But how does that explain the output? – Aven Jan 26 '20 at 20:02
  • 1
    @Aven you are modifying the same instance – Pavel Anikhouski Jan 26 '20 at 20:03
  • So this method is pretty much useless when you try to initialize with a non-primitive value? – Aven Jan 26 '20 at 20:12
  • 2
    `var graph = Enumerable.Range(0,n).Select(i => new Dictionary()).ToList();` I'ld prefer the loop. – Jimi Jan 26 '20 at 20:15
  • 1
    @Aven this method doesn't create a new copy of value, it just repeats the value. In case of reference types every value points to the same location in memory. Therefore you'll get an incorrect output, because modifying the same instance and can't add `[0,2,500]` to dictionary using the second way (key with value `2` already exists in this case) – Pavel Anikhouski Jan 26 '20 at 20:17

1 Answers1

0

Enumerable.Repeat simply duplicates the passed value n times. You create one dictionary, pass it to Repeat and it produces a list of n references to that dictionary.

If you want to generate a sequence based on a function, there's the MoreLinq Generate extension.

V0ldek
  • 9,623
  • 1
  • 26
  • 57