In JavaScript, single and double quotes are largely interchangeable, and choosing between them is mostly a matter of style (one of the answers to "When to use double or single quotes in JavaScript?" has a good discussion of why this isn't strictly the case). How are chars and strings handled in C#?

For example:

string test = "hello world";
string test2 = 'hello world'; // Too many characters in character literal
char test3 = 'a';
char test4 = "a"; // Cannot implicitly convert type string to char

It looks like strings and chars are handled as separate, non-interchangeable types, and that the choice of single or double quotes marks the distinction?

What is the relationship between chars and strings in typed languages? Specifically, would it be correct to say that a string is an array of chars?

Zach Smith
  • definitely, string is array of chars. – A.T. Jan 10 '17 at 08:39
  • Even "untyped" languages may handle strings and chars differently, especially when dealing with Unicode. You can't very well ask for the Unicode range of an entire string. Furthermore, each language handles strings differently from any other. – Panagiotis Kanavos Jan 10 '17 at 08:39
  • Strings aren't just arrays of chars. They are immutable, which means that you can't modify them as if they were just an array of chars. This allows the runtime to *intern* them and use a single instance of a string instead of creating copies. – Panagiotis Kanavos Jan 10 '17 at 08:40
  • Related: [Single quotes vs. double quotes in C or C++](http://stackoverflow.com/q/3683602/1324033) (Previous suggested duplicate was provided by a sub 3k user) – Sayse Jan 10 '17 at 08:49

2 Answers

would it be correct to say that a string is an array of chars

In .NET, a string is a single object holding a contiguous block of UTF-16 code units. A char is a separate (primitive) value type that holds exactly one UTF-16 code unit, with no object overhead. A code point outside the Basic Multilingual Plane therefore needs two chars (a surrogate pair).
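To make the difference between code units and code points concrete, here is a minimal sketch. The class and method names (`CodeUnitDemo`, `CodeUnits`, `CodePoints`) are made up for illustration; only `string.Length`, the string indexer, and the `char.IsHighSurrogate` / `char.IsLowSurrogate` / `char.IsSurrogate` helpers come from the framework.

```csharp
using System;

public static class CodeUnitDemo
{
    // Number of UTF-16 code units, which is what string.Length reports.
    public static int CodeUnits(string s) => s.Length;

    // Number of Unicode code points, counting each surrogate pair once.
    public static int CodePoints(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            bool isPair = char.IsHighSurrogate(s[i])
                          && i + 1 < s.Length
                          && char.IsLowSurrogate(s[i + 1]);
            if (isPair)
            {
                i++; // skip the low surrogate; the pair is one code point
            }
            count++;
        }
        return count;
    }

    public static void Main()
    {
        string ascii = "a";
        string emoji = "\U0001F600"; // U+1F600, outside the Basic Multilingual Plane

        Console.WriteLine(CodeUnits(ascii));           // 1
        Console.WriteLine(CodePoints(ascii));          // 1
        Console.WriteLine(CodeUnits(emoji));           // 2
        Console.WriteLine(CodePoints(emoji));          // 1
        Console.WriteLine(char.IsSurrogate(emoji[0])); // True
    }
}
```

So indexing a string always yields a code unit, which is only the same thing as a "character" for text inside the Basic Multilingual Plane.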

From an interesting blog post by Jon Skeet, where he compares the .NET and Java implementations:

A long string consists of a single large object in memory. Compare this with Java, where a String is a “normal” type in terms of memory consumption, containing an offset and length into a char array – so a long string consists of a small object referring to a large char array.

Patrick Hofman
  • It's worth noting that you can _treat_ a string like a char[] even in .NET, it has an [indexer](https://msdn.microsoft.com/en-us/library/system.string.chars(v=vs.110).aspx) which you can use to access a char at a given index. – Tim Schmelter Jan 10 '17 at 09:18
  • Indeed. It mimics that very well, and that is why people often think it is an array of chars. – Patrick Hofman Jan 10 '17 at 09:20
  • @TimSchmelter it is, hence my answer below. – Keith Jan 10 '17 at 09:22
  • @Keith: your answer is still a bit misleading because there is no collection of chars in the string class, the [indexer](https://referencesource.microsoft.com/#mscorlib/system/string.cs,8307d03426b56fe1,references) is implemented externally. As Patrick has said, it mimics a `char[]` (or collection of chars). [`ToCharArray`](https://referencesource.microsoft.com/#mscorlib/system/string.cs,81c2d980f5d0ee35,references) also always creates a new one. – Tim Schmelter Jan 10 '17 at 09:24
  • @TimSchmelter Yeah, that's why `ToCharArray` is a method rather than a cast or property. I've clarified that point. – Keith Jan 10 '17 at 09:59
  • @Keith: well, `string.ToString` is also a method but returns just `this`. A method is no guarantee that it doesn't use and return fields (or even the instance itself). – Tim Schmelter Jan 10 '17 at 10:03
  • @TimSchmelter Yeah, but that's inherited from `object`, and not every `whatever.ToString()` can do that (some are going to involve much more work) - because it's from `object` it's kind of stuck with the lowest common denominator. – Keith Jan 10 '17 at 10:09
  • @TimSchmelter as a very rough (there are exceptions) rule in .NET (in the MS code standards stuff anyway) a method may or may not have to do lots of work or create new things each time it's called, but properties should be quick and idempotent. A property called `String.CharArray` would reasonably allow some assumptions, a method called `String.ToCharArray` does not (so assume the worst). – Keith Jan 10 '17 at 10:13

C# uses the quotes to indicate the type - ' is always a char, " is always a string.

In .NET a string behaves like a read-only array of char, so:

// Treat string like an array
char c = "This is a string"[3];
int len = "This is a string".Length; // 16

// Now we have the char at index 3
bool isS = c == 's'; // true

What you can't do is edit them:

// This fails to compile: the string indexer is read-only
"This is a string"[3] = 'x';

// This is fine, because we get a new editable char[]
char[] chars = "This is a string".ToCharArray();
chars[3] = 'x';

This is true in any .NET language as they all use the same string implementation. Other strongly typed frameworks have different ways of handling strings.
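As a rough sketch of what "editing" a string looks like in practice: copy it to a `char[]`, change the copy, and build a new string from it. `CopyDemo` and `ReplaceAt` are hypothetical names used only for this example; `ToCharArray` and the `string(char[])` constructor are the framework calls.

```csharp
using System;

public static class CopyDemo
{
    // Returns a NEW string with the char at `index` replaced.
    // The original string is never modified.
    public static string ReplaceAt(string source, int index, char replacement)
    {
        char[] chars = source.ToCharArray(); // fresh, editable copy
        chars[index] = replacement;
        return new string(chars);
    }

    public static void Main()
    {
        string original = "This is a string";
        string edited = ReplaceAt(original, 3, 'x');

        Console.WriteLine(original); // This is a string
        Console.WriteLine(edited);   // Thix is a string
    }
}
```

For many edits in a row, `System.Text.StringBuilder` is usually the better fit, since it avoids allocating a new string per change.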

In .NET a char converts implicitly to an int, while converting an int back to a char requires an explicit cast, so:

char c = (char) 115; // 's'
int i = c; // 115

Finally, char is a value type, while string (a block of UTF-16 code units under the covers) is an immutable reference type. They behave similarly in code but are stored differently in memory. I think this is why C# distinguishes their delimiters: "s" and 's' are different things stored in different ways (entirely unlike a JavaScript string).
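A small sketch of the value type vs. reference type distinction, using a made-up `TypeDemo` class. It relies only on `Type.IsValueType`, the `string(char[])` constructor, and `object.ReferenceEquals`; the point is that two strings can be equal by content while being distinct objects.

```csharp
using System;

public static class TypeDemo
{
    // True when both strings hold the same sequence of chars.
    public static bool SameContent(string a, string b) => a == b;

    // True only when both variables refer to the very same object.
    public static bool SameObject(string a, string b) => object.ReferenceEquals(a, b);

    public static void Main()
    {
        Console.WriteLine(typeof(char).IsValueType);   // True
        Console.WriteLine(typeof(string).IsValueType); // False

        string a = "hello";
        string b = new string(new[] { 'h', 'e', 'l', 'l', 'o' }); // separately allocated

        Console.WriteLine(SameContent(a, b)); // True
        Console.WriteLine(SameObject(a, b));  // False
    }
}
```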

Keith
  • @TimSchmelter it's not a `Collection` or anything like that - the indexer doesn't wrap a nested `Array` and internally it's just bytes. I think the indexer just copies the relevant pair of bytes. – Keith Jan 10 '17 at 09:56