Let's say I have a class with a string property

public class Something
{
    public int SomeIntProperty { get; set; }
    public string SomeStringProperty { get; set; }  
}

and let's say that the SomeStringProperty values can be very long and I want to create a dictionary

this.dic = somethings
    .GroupBy(s => s.SomeStringProperty)
    .ToDictionary(g => g.Key);

that I hold in memory for as long as my application is running. My question is whether, due to the way that strings act like value types, this will end up duplicating the strings in order to hold them in the dictionary. If so, what is a workaround so that I can instead hold references to the strings, or compress/hash/etc. them?

Sergey Kalinichenko
Ms. Corlib

3 Answers

My question is whether, due to the way that strings act like value types, that will end up duplicating the strings to hold in the dictionary?

Strings in C# are not value types, and they most certainly do not act like them.

C# strings are immutable, which makes them suitable for use as keys in associative containers. However, using strings as keys, or in any other capacity for that matter, does not result in cloning of their content.

You can verify that no cloning is going on by checking the dictionary keys for reference equality against the SomeStringProperty values in your source array. Each key in the dictionary will be the very same object as one of the strings in the source array:

// Requires: using System; using System.Linq; and the Something class from the question.
var data = new[] {
    new Something {SomeIntProperty=1, SomeStringProperty="A"}
,   new Something {SomeIntProperty=2, SomeStringProperty="A"}
,   new Something {SomeIntProperty=3, SomeStringProperty="A"}
,   new Something {SomeIntProperty=4, SomeStringProperty="A"}
,   new Something {SomeIntProperty=5, SomeStringProperty="A"}
,   new Something {SomeIntProperty=6, SomeStringProperty="B"}
,   new Something {SomeIntProperty=7, SomeStringProperty="B"}
,   new Something {SomeIntProperty=8, SomeStringProperty="C"}
,   new Something {SomeIntProperty=9, SomeStringProperty="D"}
};
// Group by the string and use each group's key as the dictionary key.
var dict = data.GroupBy(s => s.SomeStringProperty)
               .ToDictionary(g => g.Key);
// For every key, look for a source item whose string is the same object (not merely equal).
foreach (var key in dict.Keys) {
    if (data.Any(s => ReferenceEquals(s.SomeStringProperty, key))) {
        Console.WriteLine("Key '{0}' is present.", key);
    } else {
        Console.WriteLine("Key '{0}' is not present.", key);
    }
}

The above code prints

Key 'A' is present.
Key 'B' is present.
Key 'C' is present.
Key 'D' is present.
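
One caveat about the check above: identical string literals in the same program are interned, so every "A" literal already refers to one object. The following is a minimal sketch, not part of the original answer, that builds the strings at run time so they are genuinely separate objects. The class name SharedKeyDemo and its Run method are hypothetical names chosen for illustration.

```csharp
using System;
using System.Linq;

public class Something
{
    public int SomeIntProperty { get; set; }
    public string SomeStringProperty { get; set; }
}

// Hypothetical demo class; the name is an assumption made for this sketch.
public static class SharedKeyDemo
{
    public static (bool EqualContent, bool SameObject, bool KeyIsSourceObject) Run()
    {
        // Built at run time, so these are two distinct objects with equal content.
        var first = new string('x', 1000);
        var second = new string('x', 1000);

        var data = new[]
        {
            new Something { SomeIntProperty = 1, SomeStringProperty = first },
            new Something { SomeIntProperty = 2, SomeStringProperty = second },
        };

        // Both items fall into one group because grouping compares content.
        var dict = data.GroupBy(s => s.SomeStringProperty)
                       .ToDictionary(g => g.Key);
        var key = dict.Keys.Single();

        return (first == second,
                object.ReferenceEquals(first, second),
                data.Any(s => object.ReferenceEquals(s.SomeStringProperty, key)));
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result.EqualContent);       // True
        Console.WriteLine(result.SameObject);         // False
        Console.WriteLine(result.KeyIsSourceObject);  // True
    }
}
```

The dictionary key is one of the string objects that already existed in the source data, so building the dictionary adds a reference rather than another 1000-character copy.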


Sergey Kalinichenko

due to the way that strings act like value types

Strings are not value types; they are immutable reference types.

that will end up duplicating the strings to hold in the dictionary

Wrong. You only end up creating a new string if you try to modify one; in that case a new string with the new content is created.

M.kazem Akhgary

The documentation describes String variables as being passed by value, which is misleading in this case: what gets passed by value is the reference, not the characters. Although the string is immutable, the runtime keeps a reference to the original object until we produce a changed string.

So, even though the LINQ ToDictionary() method passes the string as an argument to Dictionary.Add() under the hood, both SomeStringProperty and the dictionary key point to the same location in memory.

However, if we were to change the string in the key selector:

.ToDictionary(g => g.Key + "changed!"); 

...then the runtime will allocate a new string, copying the original characters into it, to create the new key.

We can verify that the reference is the same:

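// Each dictionary value is a group (IGrouping<string, Something>), so take an item from it.
var first = this.dic.First();
Console.WriteLine(object.ReferenceEquals(first.Key, first.Value.First().SomeStringProperty));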
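
To make the difference between passing a string along and changing it concrete, here is a small self-contained sketch. The class StringReferenceSketch and the helper PassThrough are hypothetical names introduced only for this illustration.

```csharp
using System;

// Hypothetical demo class; names are assumptions made for this sketch.
public static class StringReferenceSketch
{
    // Returns the very reference it was handed; no characters are copied.
    public static string PassThrough(string s) => s;

    public static void Main()
    {
        // Built at run time so that literal interning plays no part.
        var original = new string('a', 5);

        var passed = PassThrough(original);       // same object
        var changed = original + "changed!";      // concatenation allocates a new string

        Console.WriteLine(object.ReferenceEquals(original, passed));   // True
        Console.WriteLine(object.ReferenceEquals(original, changed));  // False
        Console.WriteLine(original);                                   // aaaaa
    }
}
```

Passing the string to a method hands over the reference, while the concatenation produces a separate object and leaves the original untouched.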

This article does a great job describing the nuances of String objects in C#.

Cy Rossignol
  • Great concise explanation. See also: [this answer](https://stackoverflow.com/questions/10792603/how-are-strings-passed-in-net) – mr_carrera Oct 13 '17 at 20:26