14

I see that people often write C code such as:

char *ptr = malloc(sizeof(char)*256);

Is that really necessary? The standard says that sizeof(char)==1 by definition, so doesn't it make sense just to write:

char *ptr = malloc(256);
bodacydo
  • 1
    See: http://stackoverflow.com/questions/2215445/are-there-machines-where-sizeofchar-1 – Saul Jul 01 '10 at 22:13
  • 2
    It's also worth noting that the sizeof operator is evaluated at compile time, so there's no runtime performance hit to using it. (The exception is a C99 variable-length array, whose size is computed at runtime, but that's not the example you used.) – poundifdef Jul 01 '10 at 22:24
  • 2
    calloc is slightly slower since it has to clear the allocated memory with zeroes. – YeenFei Jul 02 '10 at 01:35

6 Answers

27

Yes, C defines sizeof(char) to be 1, always (and C++ does as well).

Nonetheless, as a general rule, I'd advise something like:

char *ptr = malloc(256 * sizeof(*ptr));

This way, when your boss says something like: "Oh, BTW we just got an order from China so we need to handle all three Chinese alphabets ASAP", you can change it to:

wchar_t *ptr // ...

and the rest can stay the same. Given that you're going to have about 10 million headaches trying to handle i18n even halfway reasonably, eliminating even a few is worthwhile. That, of course, assumes the usual case that your chars are really intended to hold characters. If it's just a raw buffer of some sort, and you really want 256 bytes of storage, regardless of how many (or few) characters that may be, you should probably stick with malloc(256) and be done with it.
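To make the point concrete, here is a minimal sketch of what the code might look like after the switch to wchar_t. The function name `make_text_buffer` is made up for illustration, and the overflow check on the multiplication is left out for brevity. The only line that had to change is the declaration.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not from the question): allocate an empty,
   zero-terminated text buffer with room for `count` elements.
   Only the declaration of `buf` names the element type. */
wchar_t *make_text_buffer(size_t count)
{
    wchar_t *buf;   /* was: char *buf; */

    if (count == 0)
        return NULL;

    /* sizeof *buf follows the declared type, so this line is the
       same as it was when buf was a char pointer. Overflow of the
       multiplication is not checked in this sketch. */
    buf = malloc(count * sizeof *buf);
    if (buf != NULL)
        buf[0] = 0;   /* start out as an empty string */
    return buf;
}
```

A caller would write something like `wchar_t *ptr = make_text_buffer(256);` and later `free(ptr);`, with no type name at the call site either.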

Jerry Coffin
  • huh that's clever. i didn't know sizeof(*type) worked "as expected" – poundifdef Jul 01 '10 at 22:21
  • 3
    @rascher: There's no `sizeof(*type)` there. `ptr` is not a type. It is a pointer variable. – AnT stands with Russia Jul 01 '10 at 22:23
  • 1
    @rascher: Just to clarify, you can use either `sizeof(type)` or `sizeof(expression)`. In the latter case, the compiler deduces the type of expression, but the expression itself is *not* evaluated, so it need not contain valid values (e.g., code that appears to dereference a null or uninitialized pointer is still fine). – Jerry Coffin Jul 01 '10 at 22:28
  • 7
    @Jerry Coffin: It is actually either `sizeof(type)` or `sizeof expression`, meaning that there's no real need to surround the expression in `()` unless one needs to override the precedence. – AnT stands with Russia Jul 01 '10 at 22:32
  • @AndreyT: +1 and some silly old compilers will even complain if you use `sizeof(expr)` as opposed to `(sizeof expr)`. – stinky472 Jul 02 '10 at 08:52
  • 1
    Changing your string type to support internationalization is really backwards. Anything modern uses UTF-8. – R.. GitHub STOP HELPING ICE Jul 02 '10 at 10:45
  • @R: while UTF-8 works nicely as an *external* representation (i.e., for storing data in files), using it internally is a pain. You generally want to convert to UCS-4 as you read, and back to UTF-8 as you write (except on Windows, which uses UTF-16 natively, so that's what you usually want internally). – Jerry Coffin Jul 02 '10 at 14:11
  • 1
    In C, `wchar_t` is a pain because it's platform-dependent. And more importantly, because most libraries (with the notable exception of the Windows API) still require `char*` strings. – dan04 Mar 18 '11 at 22:14
6

The issue should not even exist. You should adopt a more elegant idiom of writing your malloc calls as

ptr = malloc(N * sizeof *ptr)

i.e. avoid mentioning the type name as much as possible. Type names are for declarations, not for statements.

That way your mallocs will always be type-independent and will look consistent. The fact that the multiplication by 1 is superfluous will be less obvious (since some people find multiplication by sizeof(char) annoying).
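As a rough illustration with a non-char type, the same idiom might look like the sketch below. `struct point` and `alloc_points` are invented names, not anything from the question. Note that sizeof does not evaluate its operand here, so writing `*pts` is fine even though `pts` does not point at anything yet.

```c
#include <stdlib.h>

/* Hypothetical record type, used only for illustration. */
struct point {
    double x;
    double y;
};

/* Allocate an uninitialized array of n points, or NULL on failure. */
struct point *alloc_points(size_t n)
{
    struct point *pts = NULL;

    /* `sizeof *pts` takes an expression operand, so no parentheses
       are needed. The operand is not evaluated, so the null pointer
       is never actually dereferenced. */
    pts = malloc(n * sizeof *pts);
    return pts;
}
```

If `struct point` later grows a third member, or `pts` becomes a pointer to some other type, the malloc line stays as it is.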

AnT stands with Russia
4

They're equivalent, but it's good to remain consistent. It also makes it more explicit, so it's obvious what you mean. If the type ever changes, it's easier to find out what code needs to be updated.

Cogwheel
4

That may be true, but it's only true for that specific case of char.

I personally think it's good practice to write malloc(sizeof(char) * 256), because someone changing the type, or copying the code for a similar purpose with a different type, may miss the subtleties of that case.

John Weldon
  • `sizeof(char) * 256` doesn't help you if someone changes the type; do you mean `sizeof *ptr * 256`? – CB Bailey Jul 01 '10 at 22:16
  • @Charles, Agreed, they'd still have to change the type in the declaration, but at least then it'd be explicit, rather than just a number. – John Weldon Jul 01 '10 at 22:18
2

While there's nothing technically wrong with writing sizeof(char), doing so suggests that the author is not familiar with C and the fact that sizeof(char) is defined as 1. In some projects I've worked on, we actually grep for instances of sizeof(char) as an indication that the code might be low-quality.

On the other hand, ptr = malloc(count * sizeof(*ptr)); is a very useful documentation and bug-avoidance idiom, and makes sense even if sizeof(*ptr) is 1. However, it needs to be preceded by if (count > SIZE_MAX/sizeof(*ptr)) { /* handle overflow */ } or you have a serious bug. This can be especially relevant when allocating arrays of wchar_t or complex structures of the same length as an input string, for example when converting a UTF-8 string to wchar_t string or building a DFA to match the string.
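Put together, the check and the allocation could be wrapped up roughly like this. This is only a sketch: `alloc_wide` is a made-up name, and returning NULL is just one possible way to "handle overflow".

```c
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: allocate room for `count` wide characters.
   Returns NULL if count * sizeof(wchar_t) would not fit in size_t,
   or if malloc itself fails. */
wchar_t *alloc_wide(size_t count)
{
    wchar_t *ptr;

    /* Without this test the multiplication below could wrap around
       and malloc would return a buffer that is far too small. */
    if (count > SIZE_MAX / sizeof *ptr)
        return NULL;   /* handle overflow */

    ptr = malloc(count * sizeof *ptr);
    return ptr;
}
```

The caller then only has to test for NULL once, whether the request was too large to express or simply could not be satisfied.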

R.. GitHub STOP HELPING ICE
  • It's worth noting that even on systems where a char is larger than 8 bits (I've worked with one where it was 16), sizeof(char) is still one; sizeof(int) is also one in that case. I've even heard of systems where sizeof(long) is also one, since a 'char' holds 64 bits. – supercat Aug 13 '10 at 15:46
1

Yes, they are technically equivalent. It's just a matter of style - using sizeof for every allocation makes you less likely to miss it when you really do need it.

Mark Ransom