I have written a piece of optimized code that contains special cases for null and empty strings. I am now trying to write a unit test for this code. In order to do that, I need two empty (zero-length) string objects that are different objects. Like this:

string s1, s2; // each should hold a distinct zero-length string instance
Assert.IsTrue(s1.Length == 0 && s2.Length == 0);
Assert.IsTrue(!ReferenceEquals(s1, s2));

It turns out that most .NET Framework string APIs check for an empty result and return string.Empty in those cases. For example, "x".Remove(0, 1) returns string.Empty.

How can I create fresh zero-length string objects?

Pavel Anikhouski
boot4life
  • Strings that are not created at runtime are interned, so strings with the same value reference the same object in the common string pool. Why do you need this? – Vlad DX Feb 13 '20 at 17:24
  • If you check the source of `String.Remove()`, you'll see that it does return `String.Empty`. https://github.com/microsoft/referencesource/blob/master/mscorlib/system/string.cs#L2926 – Vlad DX Feb 13 '20 at 17:26
  • @dbc Same for `new string('X', 0);` See [Object.ReferenceEquals prints true for two different objects](https://stackoverflow.com/q/37068824/150605). – Lance U. Matthews Feb 13 '20 at 18:29

3 Answers


There is no true 100% supported way of manufacturing a fresh zero-length string in .NET. As an implementation detail, existing string APIs may try to normalize zero-length return values to the instance string.Empty, but whether or not they do this consistently isn't something a developer should be relying on.

In particular, the other two answers have problems:

  • The string.Copy solution even includes the caveat that the method is obsolete in .NET Core 3.0. The method is likely to be removed entirely from a future version of .NET Core, so any solution which relies on calling string.Copy is going to break when the application eventually moves on to the new version of the runtime.

  • The FastAllocateString solution takes a dependency on an undocumented, internal API within the runtime. Internal APIs aren't guaranteed to stick around between versions. In fact, we're planning major changes in the way strings behave in the next version of .NET, and that work will almost certainly affect this internal API.

So, to your particular question as to whether there's a reliable way to manufacture a fresh zero-length string instance, the answer is no.

If you want to special-case zero-length strings in your code, the best solution would be to use the pattern if (myString.Length == 0) { /* ... */ }. The patterns if (myString == string.Empty) { /* ... */ } and if (myString == "") { /* ... */ } will also work, but their codegen won't be as optimized as the first proposal.

If you want to special-case null or empty strings, the best solution would be to use the existing string.IsNullOrEmpty API. The implementation of this method changes from version to version to take advantage of whatever JIT optimizations are available at the time.
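
As a minimal sketch of the two checks described above: the class and method names here are hypothetical and exist only for illustration. Neither check depends on which string instance is passed in, so it makes no difference whether an empty string happens to be string.Empty.

```csharp
using System;

public static class EmptyStringChecks
{
    // Hypothetical helper: classifies a string by value, never by reference identity.
    public static string Classify(string s)
    {
        if (s is null) return "null";
        if (s.Length == 0) return "empty"; // length check, works for any zero-length instance
        return "non-empty";
    }

    // Hypothetical helper: true for both null and zero-length input.
    public static bool IsMissing(string s) => string.IsNullOrEmpty(s);
}
```

A unit test for code written this way only needs to pass null, "", and a non-empty value; it does not need a second, distinct empty-string instance.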

Source: I am one of the primary developers on the System.String class.

Levi
  • I have written a custom string interning solution to save memory. That code optimizes all empty strings to `string.Empty`, and I'd like to test that. Given how hard it is to produce a fresh empty string, I could also simply delete this optimization. – boot4life Feb 13 '20 at 20:16
  • If memory is a concern for your application you should definitely check out the [string deduplication](https://github.com/dotnet/runtime/blob/98d47d5fc0f1b232869a9abd27729235fd7355b0/docs/design/features/StringDeduplication.md) feature that's being prototyped in .NET 5. The tl;dr of this feature is that during GC, equivalent strings will be collapsed to the same instance wherever possible, freeing up memory. – Levi Feb 13 '20 at 20:43

You can use the obsolete String.Copy method:

string s1 = "";
string s2 = String.Copy("");
Assert.IsTrue(s1.Length == 0 && s2.Length == 0);
Assert.IsTrue(!ReferenceEquals(s1, s2));
jbtule
  • I'm surprised this works since some of the `string` methods are special-cased to return `string.Empty` when they can. The documentation even says "because of changes in string interning in .NET Core 3.0, in some cases the `Copy` method will not create a new string but will simply return a reference to an existing interned string", yet this works for me on that version (perhaps because `string.IsInterned(string.Empty)` returns `false`). – Lance U. Matthews Feb 13 '20 at 18:42
  • This still exists in .NET Core 3.0, although there was talk on GitHub of removing it or making it just return its input. I tested this on .NET Framework 4.8. – boot4life Feb 13 '20 at 18:42

You can use the FastAllocateString method for that (it is used internally by the String and StringBuilder classes). Since it is declared internal static, you have to invoke it through reflection. Each call returns a different empty string object in memory:

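using System.Linq;
using System.Reflection;

// FastAllocateString is an internal static method of System.String, so look it up via reflection.
var fastAllocate = typeof(string).GetMethods(BindingFlags.NonPublic | BindingFlags.Static)
    .First(x => x.Name == "FastAllocateString");

// Allocate two zero-length strings, then check their length and reference identity.
string s1 = (string)fastAllocate.Invoke(null, new object[] { 0 });
string s2 = (string)fastAllocate.Invoke(null, new object[] { 0 });
var zeroLength = s1.Length == 0 && s2.Length == 0;
var notEqual = !ReferenceEquals(s1, s2);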

Both checks return true here.

Pavel Anikhouski