70

Why does adding two chars in C# result in an int?

For example, when I do this:

var pr = 'R' + 'G' + 'B' + 'Y' + 'P';

the pr variable becomes an int. I expected it to be a string with the value "RGBYP".

Why is C# designed like this? Shouldn't adding two chars by default produce a string that concatenates them, rather than an int?

John Isaiah Carmona
  • 38
    The interesting thing is that you're not actually adding `char`s, since C# doesn't define a built-in + operator for the type. However, `char` is implicitly convertible to `int`, so the compiler chooses the `int` version of the + operator when doing overload resolution. And of course the result of that operator is another `int`. (Note that "of course" is a bit funny to say, since `short` + `short` is actually an `int`, rather than the "of course" answer of another `short`!) – dlev May 31 '13 at 07:25
  • 3
    One of the many features inherited from C. `char` is an integral type in C, and so it is in C#. Changing the behaviour of the type without a very good reason (you can create strings from individual `char`s easily using other functionality) would annoy users that had grown accustomed to the C/C++ behaviour. – Gorpik May 31 '13 at 07:29
  • 2
    Isn't char, at its most basic level, a type of int? – user13267 May 31 '13 at 10:05
  • 3
    title should be: char + char = int? [WAT](https://www.destroyallsoftware.com/talks/wat)? – Carlos Campderrós May 31 '13 at 13:22
  • 1
    Same answer as [byte + byte = int... why?](http://stackoverflow.com/q/941584) – Cody Gray - on strike May 31 '13 at 14:21
  • Note that `char` in C# is a 16-bit Unicode character, as opposed to `sbyte` which is the equivalent of C/C++ `char`. – Alvin Wong May 31 '13 at 15:13
  • This is straight backward compatibility with C. The concept of an actual character type would make the heads of the C# designers explode. – Kaz May 31 '13 at 20:55

15 Answers

81

According to the documentation of char, it can be implicitly converted to integer values. The char type doesn't define a custom + operator, so the one for integers is used.
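A minimal sketch of what the compiler does with the expression from the question. The class and method names (`CharAdditionDemo`, `SumOfChars`) are made up for illustration; only the expression itself comes from the question.

```csharp
using System;

static class CharAdditionDemo
{
    // Each char operand is implicitly converted to int, so the
    // int + operator is chosen and `var` is inferred as int.
    public static int SumOfChars()
    {
        var pr = 'R' + 'G' + 'B' + 'Y' + 'P'; // 82 + 71 + 66 + 89 + 80
        return pr;
    }

    static void Main()
    {
        var pr = SumOfChars();
        Console.WriteLine(pr.GetType()); // System.Int32
        Console.WriteLine(pr);           // 388
    }
}
```

Note that `SumOfChars` can declare `int` as its return type without any cast, which is itself evidence that the expression is already an `int`.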

The rationale for there being no implicit conversion to string is explained well in the first comment from Eric Lippert in his blog entry on "Why does char convert implicitly to ushort but not vice versa?":

It was considered in v1.0. The language design notes from June 6th 1999 say "We discussed whether such a conversion should exist, and decided that it would be odd to provide a third way to do this conversion. [The language] already supports both c.ToString() and new String(c)".

(credit to JimmiTh for finding that quote)

dimitar.bogdanov
Dirk
  • 16
    Just to add the reason why it was designed like this - specifically why there's no implicit conversion to string (whether we agree with the rationale or not), according to Eric Lippert: 'It was considered in v1.0. The language design notes from June 6th 1999 say "We discussed whether such a conversion should exist, and decided that it would be odd to provide a third way to do this conversion. [The language] already supports both c.ToString() and new String(c)".' – JimmiTh May 31 '13 at 07:42
  • @JimmiTh Very nice! Do you have a link to that quote by chance? – Dirk May 31 '13 at 07:46
  • 7
    http://blogs.msdn.com/b/ericlippert/archive/2009/10/01/why-does-char-convert-implicitly-to-ushort-but-not-vice-versa.aspx - it's in Lippert's answer to the first comment. – JimmiTh May 31 '13 at 07:47
  • @jadarnel27 Thanks for the edit and for fixing the language. – Dirk May 31 '13 at 13:46
  • No problem =) I don't think there was anything wrong with your wording. I just wanted to make sure the great comment from @JimmiTh stuck around. – Josh Darnell May 31 '13 at 13:53
  • I submitted an edit, but I'll link it here as well - the original blog post link is dead - here is a web archive link: http://web.archive.org/web/20140414050135/http://blogs.msdn.com/b/ericlippert/archive/2009/10/01/why-does-char-convert-implicitly-to-ushort-but-not-vice-versa.aspx – dimitar.bogdanov Apr 11 '21 at 13:27
17

char is a value type that holds a numeric value (its UTF-16 Unicode ordinal). However, it is not treated like an ordinary numeric type (such as int or float), and the + operator is not defined for char itself.

The char type can, however, be implicitly converted to the numeric int type. Because it's implicit, the compiler is allowed to make the conversion for you, according to a set of rules of precedence laid out in the C# spec. int is one of the first things normally tried. That makes the + operator valid, and so that's the operation performed.

To do what you want, start with an empty string:

var pr = "" + 'R' + 'G' + 'B' + 'Y' + 'P';

Unlike the char type, the string type defines an overloaded + operator for Object, which transforms the second term, whatever it is, into a string using ToString() before concatenating it to the first term. That means no implicit casting is performed; your pr variable is now inferred as a string and is the concatenation of all character values.
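The position of the string operand matters, because + is evaluated left to right. The following is a small sketch of that effect; the class and method names are hypothetical and not part of the original answer.

```csharp
using System;

static class ConcatOrderDemo
{
    // The first + already has a string operand, so every + is a
    // string concatenation.
    public static string StringFirst() => "" + 'R' + 'G' + 'B';

    // Here 'R' + 'G' is evaluated first as int addition (82 + 71 = 153);
    // only then does the string operand turn the rest into concatenation.
    public static string StringLater() => 'R' + 'G' + "" + 'B';

    static void Main()
    {
        Console.WriteLine(StringFirst()); // RGB
        Console.WriteLine(StringLater()); // 153B
    }
}
```

So the empty string has to come first (or at least before any two chars are added to each other) for the whole expression to be treated as concatenation.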

KeithS
  • This gets obviously optimized, but if you are adding characters in a for-loop, use StringBuilder or List instead. – Ferazhu Apr 13 '23 at 18:32
8

Because a single char can be converted to a Unicode value and can be easily stored as integer taking up less space than a single character string.

Ted
  • 1
    are there any sources to back that one up? why it's happening like that? – aiapatag May 31 '13 at 07:23
  • 2
    `char` is a unicode code point, not an ASCII character. – Dirk May 31 '13 at 07:24
  • 2
    @Dirk - I hope my EDIT covers you, please read more carefully before downvoting... – Ted May 31 '13 at 07:29
  • @Ted A C# `char` cannot be traced back to an ASCII table, since that table does not represent all Unicode characters. Just replace ASCII with Unicode in your original answer and you are fine. – Gorpik May 31 '13 at 07:32
  • @Ted I didn't downvote. I just pointed out that a `char` is a Unicode code point, encoded as UTF-16. – Dirk May 31 '13 at 07:33
7

From the MSDN:

The value of a Char object is a 16-bit numeric (ordinal) value.

A char is an integral type. It is NOT a character, it is a number!

'a' is just shorthand for a number.

So adding two characters results in a number.

Have a look at this question about adding bytes, it is, although counterintuitive, the same thing.

Community
Emond
6

Another relevant bit of the spec, in section 4.1.5 (Integral Types) having defined char as an integral type:

For the binary + ... operators, the operands are converted to type T, where T is the first of int, uint, long and ulong that can fully represent all possible values of both operands.

So for a char, both are converted to int and then added as ints.
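The promotion rule quoted above can be observed by asking the runtime for the type of each sum. This is only an illustrative sketch; `PromotionDemo` and its helper methods are invented names.

```csharp
using System;

static class PromotionDemo
{
    // Non-constant locals are used so the results reflect the
    // promotion rules rather than constant folding.
    public static Type CharPlusChar()   { char c = 'a';  return (c + c).GetType(); }
    public static Type ShortPlusShort() { short s = 1;   return (s + s).GetType(); }
    public static Type BytePlusByte()   { byte b = 1;    return (b + b).GetType(); }
    public static Type UIntPlusInt()    { uint u = 1; int i = 1; return (u + i).GetType(); }

    static void Main()
    {
        Console.WriteLine(CharPlusChar());   // System.Int32
        Console.WriteLine(ShortPlusShort()); // System.Int32
        Console.WriteLine(BytePlusByte());   // System.Int32
        Console.WriteLine(UIntPlusInt());    // System.Int64
    }
}
```

The last case shows the "first of int, uint, long and ulong" part of the rule: neither int nor uint can represent every value of both operands, so both are converted to long.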

Rawling
  • 1
    [Not really](http://stackoverflow.com/questions/127776/where-can-you-find-the-c-sharp-language-specifications), I think you have to [download the whole thing](http://www.microsoft.com/en-gb/download/details.aspx?id=7029). – Rawling Jun 01 '13 at 10:42
6

The point is that many C# concepts come from C++ and C.

In C, a single character constant (like 'A') is represented by its ASCII value, and despite what one may expect, its type is not char but int (yes, 'A' is an int, the same as writing 65). In C++ the literal has type char, but it is still promoted to int in arithmetic.

Thus, the addition of all these values is like writing a series of ASCII character codes, i.e.

   var pr= 82 + 71 + 66 + ...;

This was a design decision in C / C++ at some point (it goes back to the '70s with C).

The-Dood
5

From MSDN:

Implicit conversions might occur in many situations, including method invoking and assignment statements.

A char can be implicitly converted to ushort, int, uint, long, ulong, float, double, or decimal. Thus that assignment operation implicitly converts char to int.

aiapatag
  • 2
    The assignment is *not* what invokes the implicit conversion. By the time assignment occurs, there is an `int` value waiting from the result of the `int` + operator. – dlev May 31 '13 at 07:27
4

As has been said, it is because a char is implicitly converted to an Int32 containing its Unicode value.

If you want to concatenate chars into a string you can do one of the following:

Pass an array of chars to a new string:

var pr = new string(new char[] { 'R', 'G', 'B', 'Y', 'P' });

Use a StringBuilder:

StringBuilder sb = new StringBuilder();
sb.Append('R');
etc...

Start off with a string:

var pr = string.Empty + 'R' + 'G' + 'B' + 'Y' + 'P';

Convert each to a string with ToString() (converting just the first one works just as well; a char cannot be cast to string directly):

var pr = 'R'.ToString() + 'G'.ToString() + 'B'.ToString() + 'Y'.ToString() + 'P'.ToString();
Ashigore
  • 2
    Its Unicode value, not its ASCII code. C# does not use ASCII. – Gorpik May 31 '13 at 07:33
  • @Gorpik Well I meant in this particular case, since the Unicode values of the ASCII table are the same are they not? – Ashigore May 31 '13 at 07:37
  • Yes, they are, because Unicode was designed to maintain compatibility with ASCII. But why say something that is just coincidental and for a small subset of all possible `char` values when you can say the right thing just as easily? C# uses Unicode, not ASCII. – Gorpik May 31 '13 at 07:42
4

A char or System.Char is an integral type:

An integral type representing unsigned 16-bit integers with values between 0 and 65535. The set of possible values for the type corresponds to the Unicode character set.

This means that in arithmetic it behaves like a ushort (System.UInt16): adding chars with the + operator adds their integral values, because char does not overload the + operator.

To concatenate individual chars into a string use StringBuilder.Append(char) or new String(char[]).
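Both approaches mentioned above can be sketched as follows. The wrapper class and method names are hypothetical; `StringBuilder.Append(char)` and the `string(char[])` constructor are the APIs the answer refers to.

```csharp
using System;
using System.Text;

static class BuildStringDemo
{
    // Appends one char at a time; useful when the chars arrive in a loop.
    public static string WithBuilder(char[] chars)
    {
        var sb = new StringBuilder();
        foreach (char c in chars)
        {
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Builds the string in one step when all chars are already in an array.
    public static string WithConstructor(char[] chars) => new string(chars);

    static void Main()
    {
        char[] letters = { 'R', 'G', 'B', 'Y', 'P' };
        Console.WriteLine(WithBuilder(letters));     // RGBYP
        Console.WriteLine(WithConstructor(letters)); // RGBYP
    }
}
```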

John Willemse
3

It shouldn't, because that would be inefficient. Anyone who wants to concatenate chars like that should use a StringBuilder. Otherwise each addition would allocate a temporary string to hold the partially concatenated result, which in your example would mean 4 temporary allocations.

Lefteris E
  • 1
    Efficiency is irrelevant here, it's the semantics of the language that decide this. It's also inefficient in the same manner to concatenate strings like `"foo" + "bar"`, but that doesn't stop it from being possible, and no one would use a language whose designers decided to not have a string concatenation operator in the name of protecting you from yourself. – Jon May 31 '13 at 09:00
1

A Char is a textual representation of a 16-bit integer value. You are simply adding ints together. If you want to concatenate chars, you'll have to convert them to strings.

Captain Kenpachi
1

1) Definition (MSDN):

The char keyword is used to declare a 16-bit character, used to represent most of the known written languages throughout the world.


2) Why does char behave like a numeric type?

A char can be implicitly converted to a numeric type.

A char is closer to an integer than to a string. A string is only a collection of char objects, whereas an integer can represent a char and vice versa.
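A short sketch of that two-way relationship: going from char to int is implicit, while going back from int to char needs an explicit cast. The class and method names are illustrative only.

```csharp
using System;

static class CharIntConversionDemo
{
    // char -> int is an implicit conversion.
    public static int CodeOf(char c) => c;

    // int -> char requires an explicit cast; writing
    // `char next = c + 1;` would not compile.
    public static char Next(char c) => (char)(c + 1);

    static void Main()
    {
        Console.WriteLine(CodeOf('A')); // 65
        Console.WriteLine(Next('A'));   // B
    }
}
```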


3) Examples

You can simply convert the first of your chars to a string, to outwit your compiler:

var pr = 'R'.ToString() + 'G' + 'B' + 'Y' + 'P';

You could also define a char array and then use the string constructor:

char[] letters = { 'R', 'G', 'B','Y', 'P' };
string alphabet = new string(letters);

If you want to display a single character through an API that only accepts strings (such as MessageBox.Show), convert it to a string to get its text representation:

 var foo1 = 'F';
 MessageBox.Show(foo1.ToString());
Fabian Bigler
0

Why is C# designed like this? Shouldn't adding two chars by default produce a string that concatenates them, rather than an int?

What you expected does not match what you want to accomplish. A string is not a sum of chars; a string is, so to speak, a sum of single-character strings.

So "a"+"b"=>"ab", which is absolutely correct once you take into account that the + operator is overloaded for strings. And since 'a' represents the character code 97, it is entirely consistent that 'a'+'b' is 195.

Thomas Junk
0

Because a char plus another char can exceed the maximum value permitted for a char variable, the result of that operation is converted to an int.

Nahuel Barrios
  • Wrong. short + short can exceed the value of a short, just like any other two numeric types can sum to a value larger than their type can store. In none of these cases is an automatic widening conversion of the number performed; the value simply overflows and the least significant bits are retained. Instead, this is happening because char does not define the + operator, but a type to which it can be implicitly converted (int) does define this operator. – KeithS May 31 '13 at 15:14
0

You are assuming that a char is a string type. The value of a char can be represented by a character value between single quotes, but if it helps, you should consider that to be an abstraction to provide readability, rather than forcing you as the developer to memorize the underlying value. It is, in fact, a numeric value type, so you should not expect any string manipulation functions to be applicable.

As to why char + char = int, I have no idea. Certainly, providing implicit conversion to Int32 would mitigate arithmetic overflows, but then why is short + short not implicitly typed to int?

Sean H