I asked this in "Taking an index out of const char* argument" along with another question, but the discussion only covered the first question, so I split this one into a separate thread.

Question:

Is there any reason anyone would ever want to represent a C string as const char* and not as const unsigned char*?

  • On one hand, I see C strings represented as const char* all the time.
  • On the other hand, using const char* sometimes forces a cast to unsigned, like in the example linked above.

Thanks,

Elad Weiss
  • Having a hard time mincing your post to get the *real* question. Are you asking why isn't `char` always unsigned as a rule of the language, rather than left to the implementation? – WhozCraig Jan 24 '17 at 11:17
  • @WhozCraig Yeah I guess. But I'll learn eventually :) That's the goal isn't it? – Elad Weiss Jan 24 '17 at 11:18
  • Yeah, I only asked because as-posted your question seems more about `char` and `unsigned char` (which `char` may be anyway) and less about const pointers to said same. I'm sure there's a duplicate of this somewhere on the site (would be amazed if there wasn't), but no sense in looking for it unless that really is the root of your question. Edit: [found one](https://stackoverflow.com/questions/914242/why-is-chars-sign-ness-not-defined-in-c). – WhozCraig Jan 24 '17 at 11:22

1 Answer

Yes, of course a general read-only string should be const char *, since char (whose signedness is implementation-defined) is the default type for a character.

In other words, a literal like "foo" consists of char, not unsigned char, elements.

Of course you can interpret the characters as unsigned if you feel like it, but then you might need a cast.

unwind
  • Good answer. But like WhozCraig commented above, perhaps I should have asked why isn't char always signed as a rule of the language. – Elad Weiss Jan 24 '17 at 11:20
  • @EladWeiss Perhaps. This is known as the [XY Problem](http://mywiki.wooledge.org/XyProblem) and it's quite hard to figure out when it's occurring, sometimes. – unwind Jan 24 '17 at 11:43
  • @EladWeiss "why isn't char always signed as a rule" --> because enough early (maybe most) platforms were set up to operate with _unsigned_ `char` and some with _signed_ `char`. The language reflects the compromise that afforded the greatest portability. Note that with modern character sets (Unicode), there are no negative code points, C's own `is...()` functions take values in the `unsigned char` range, and `strcmp()` compares `char` as if they were `unsigned char`. – chux - Reinstate Monica Jan 24 '17 at 15:13
  • @whyoz Best to post your [C#, Obj-C question](https://stackoverflow.com/questions/41826615/c-str-representation-const-char-vs-const-unsigned-char/41826779?noredirect=1#comment85178021_41826779) as its own question rather than ask in a comment here under the C/C++ tags. – chux - Reinstate Monica Mar 03 '18 at 18:26