18

Consider the following small example code:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *i;
    char *c1, *c2;

    i = malloc(4);

    *i = 65535;

    c1 = i;
    c2 = (char *)i;

    printf("%p %p %p\n", i, c1, c2);
    printf("%d %d", *c1, *c2);

    free(i);

    return 0;
}

In the example, I allocate memory to store an integer, which is pointed to by i. Then, I store the value 65535 (binary 1111 1111 1111 1111) in *i. Next, I make two char * pointers point to the same integer, in two different ways: c1 = i; and c2 = (char *)i;. Finally, I print all pointers and all values on the screen. The three pointers point to the same address, and the two values *c1 and *c2 are correct (-1).

However, the compiler generates a warning in this line: c1 = i;. The warning is generated because I did not use the (char *) cast to do the assignment.

What I would like to ask is why the compiler generates this warning, since I do not see any difference between c1 = i; and c2 = (char *)i;. In both cases, the result is the same address with the same size in bytes. And this holds for all casts, whether it is an (int *) cast, a (float *) cast, a (short *) cast, etc. All of them produce the same value, but the compiler only accepts the assignment without a warning if the cast matches the type of the pointer being assigned to.

I really would like to know why the compiler asks for that cast, even if the result is always the same.

gsamaras
felipeek
  • 9
    In C, the type system is a compile-time 'tool' that prevents you from performing invalid operations. Things of different types have different operations available to them, and the compiler is making sure you're only using valid operations. But some operations meant for one type may work just fine for another type, so when you want to use those you can perform a type-cast, which basically tells the compiler: "I know what I'm doing, just pretend that this thing here is of the type that's valid for this operation". – Pieter Witvoet Jan 05 '15 at 18:35
  • Does the compiler not give you a warning on i = malloc(4), as well? – Steven Jan 05 '15 at 18:59
  • 3
    @Steven Actually no, because the malloc returns a void pointer. Take a look at http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc?rq=1 – felipeek Jan 05 '15 at 19:03
  • @globalstatic ahh yes C not c++ my bad – Steven Jan 05 '15 at 19:06
  • That `i` is a pointer is driving my brain crazy. Reminds me of the [IOCCC](http://www.ioccc.org/). – Sergey Orshanskiy Jan 06 '15 at 14:27
  • Just as a note: You do realize that `int *i = malloc(4);` will lead to undefined behaviour if `sizeof(int)>4` (happens on some 64bit platforms)? Use `malloc(sizeof(int))`. – sleske Jan 06 '15 at 15:16
  • @sleske I've read that, in C, an integer is ALWAYS four bytes, without any platform dependency. I would use sizeof(int) in a real application just to be safer, but please someone correct me if I'm wrong – felipeek Jan 06 '15 at 16:45
  • @globalstatic: Yes, on common desktop platforms (Windows, Linux, Mac OS X) an `int` is four bytes. However, the C standard does not guarantee this (it only guarantees "16 bit or more"), and there are other platforms where it has a different size. The same goes for the other standard types (`short`, `long` etc.), so one should avoid assuming anything about their size other than the minimum size from the C standard. – sleske Jan 08 '15 at 08:44
  • related http://stackoverflow.com/questions/34826036/confused-about-pointer-dereferencing?noredirect=1&lq=1 – Suraj Jain Jan 12 '17 at 10:38

6 Answers

33

When you use:

c1 = i;

the compiler warns you about the assignment of type int* to char*. It could potentially be an inadvertent error. The compiler is warning you with the hope that, if it is indeed an inadvertent error, you have the chance to fix it.

When you use:

c2 = (char *)i;

you are telling the compiler that you, as the programmer, know what you are doing.
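
A minimal sketch of the "I know what I'm doing" case, assuming 8-bit bytes. The helper names `first_byte` and `print_bytes` are made up for illustration and are not from the question. The cast documents the intent to look at an `int` one byte at a time, and `unsigned char` avoids the sign surprise that made the question's program print -1.

```c
#include <stdio.h>

/* Hypothetical helper: returns the byte stored at the lowest address of *p. */
unsigned char first_byte(const int *p)
{
    /* Explicit cast: int * -> unsigned char * is deliberate, so no warning. */
    const unsigned char *bytes = (const unsigned char *)p;

    /* Writing `bytes = p;` instead would draw the diagnostic discussed above. */
    return bytes[0];
}

/* Prints every byte of value in memory order; the order depends on endianness. */
void print_bytes(int value)
{
    const unsigned char *bytes = (const unsigned char *)&value;

    for (size_t k = 0; k < sizeof value; k++)
        printf("%02x ", (unsigned)bytes[k]);
    printf("\n");
}
```

Calling `print_bytes(65535)` from `main` should show `ff ff 00 00` on a little-endian machine with a 4-byte `int`; other platforms may order or count the bytes differently.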

R Sahu
  • 6
    About the last sentence, that's the theory; in practice, it's seen more often as a means used by incompetent programmers to shut up errors they don't understand (most often with function pointers). Related: [If you have to cast, you can't afford it](http://blogs.msdn.com/b/oldnewthing/archive/2009/10/23/9911891.aspx). – Matteo Italia Jan 05 '15 at 23:32
  • 3
    @MatteoItalia: I think that "used by incompetent programmers to shut up errors they don't understand" might be a proper subset of "telling the compiler that you, as the programmer, know what you are doing." (Just because the incompetent programmers *say* they know what they're doing, it doesn't mean . . .) – ruakh Jan 06 '15 at 06:24
  • @MatteoItalia Well, IMO, it is also the practice, if you *really* know what you are doing. C is not an easy to use language at its full extent. I wouldn't recommend it as a first language for anyone (and maybe neither as the second, unless you really need it). The problem is that sometimes *it is* taught as a first language, but usually omitting the things that make it powerful and their drawbacks (undefined behavior, sequence points, casting, type punning, pointer arithmetic, etc.). People (and some teachers too) still think C and C++ are the same language, except that C++ has more features! – LorenzoDonati4Ukraine-OnStrike Jan 06 '15 at 12:38
  • @LorenzoDonati, I know lots and lots of people who learned C/C++ (or Pascal) as the first language, and they seem to be doing very well. During my time tutoring programming, I met people who learned Java or Python as the first language, and they often seem to not understand basic things about e.g. pointers and memory organization. At least, for anyone who wants to become a professional programmer, so far I always recommend C++ as a first language. Otherwise I recommend to not learn C++ at all. – Sergey Orshanskiy Jan 06 '15 at 14:18
  • @osa Well, a programmer does not always need to know the innards of the machine that well to do their job, otherwise functional programming or web design would already be dead. If by programmer you mean "system programmer", or any programmer who needs to cope with performance issues, I agree. But not every programmer has those needs nowadays. Programming has become an activity that encompasses so many different levels and fields of expertise that knowing what a stack frame is is no longer so pressing, IMO. ... – LorenzoDonati4Ukraine-OnStrike Jan 06 '15 at 15:24
  • @osa ... This is not to say that using Java as your first programming language is *guaranteed* to make you a good programmer :-) There are bad programmers everywhere, despite the language they use! :-D – LorenzoDonati4Ukraine-OnStrike Jan 06 '15 at 15:25
  • Sir, months back I wrote a question (http://stackoverflow.com/questions/34826036/confused-about-pointer-dereferencing?noredirect=1&lq=1). It was not edited well, and maybe that is why it got many downvotes. I edited it properly about half a year ago, but it is still not gaining attention, and due to the downvotes I am unable to ask any questions. Please help me – Suraj Jain Jan 12 '17 at 10:40
17

Your program is invalid; assignment between pointer types is not permitted when the pointed-to types are incompatible, unless a cast is used.

6.5.4 Cast operators

3 - Conversions that involve pointers, other than where permitted by the constraints of 6.5.16.1, shall be specified by means of an explicit cast.

The compiler is accepting your code as an extension, but another compiler might reject your code and would be correct to do so. This is why the compiler is issuing a warning when you omit the cast.

ecatmur
  • The compiler is automatically doing a (char *) cast in my program? – felipeek Jan 05 '15 at 18:46
  • @globalstatic yes, but it is not required to. It *is* required to issue a "diagnostic" (i.e. a warning or error). – ecatmur Jan 05 '15 at 18:48
  • @ecatmur Got it. But is this cast strictly necessary only because it is written in the C standard, or because the cast really does something useful? I'm still a bit confused, because the (int *) pointer that I created in the code is exactly the same as the two (char *) pointers that I created afterwards. And both of them were converted correctly to (char *), following your words. I am starting to think that the pointer cast is necessary only because of the C standard, and that when the assignment is made, the compiler just copies the original address to the new pointer, without any changes. Am I wrong? – felipeek Jan 05 '15 at 19:12
  • 1
    @globalstatic: The C language can be compiled for some very weird CPU types. It is completely legal to have a CPU in which characters and integers use completely different memory spaces. In which case accessing via the wrong pointer would be very expensive and require extra machine code. – Zan Lynx Jan 05 '15 at 19:51
  • @globalstatic: On common CPU types like ARM and x86 the different types in C only change the width of the memory access used in the machine code. The pointer values are the same. – Zan Lynx Jan 05 '15 at 19:52
  • @ZanLynx Oh, I guess I got it. So, depending on the machine, the address of the pointer can be changed after a casting? – felipeek Jan 05 '15 at 20:16
  • @globalstatic the *address* doesn't change (it's still the same location in virtual memory), but the *representation* of the address might change. Think of the difference between int and long, which have the same representation on some architectures and different representations on others. – ecatmur Jan 05 '15 at 20:21
  • @ecatmur What is the reason to change the representation of the address, on the same machine, when casting from one type to another? This is what is still not clear to me. – felipeek Jan 05 '15 at 20:27
  • 2
    @ecatmur: I don't have an example of a machine, but I can imagine one and reasons for one, that stores values of different sizes in different memory blocks. That hypothetical machine would have to write different machine code to read a value through a cast. And actually, some DSPs can't store any value smaller than 16 or 32 bits, and on those a char access has to read the whole value and then shift and mask. So yes, sometimes the address DOES change on a cast. – Zan Lynx Jan 05 '15 at 21:58
  • 1
    @ecatmur: And in C++ (not C, but close) a cast can definitely return a different address. A cast to the second base class pointer when there are multiple base classes is not the same address as the original pointer. – Zan Lynx Jan 05 '15 at 22:02
  • could someone see this http://stackoverflow.com/questions/34826036/confused-about-pointer-dereferencing?noredirect=1&lq=1 – Suraj Jain Jan 12 '17 at 10:38
9

Pointers to objects of different types may differ in their respective size and alignment:

§6.2.5/26 Types

A pointer to void shall have the same representation and alignment requirements as a pointer to a character type. Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.

In particular, sizeof(char *) may have a different value than sizeof(int *). Furthermore, char * may have different alignment requirements than int *. Hence, the implicit conversion yields a warning. That's why the C standard contains the following constraint:

§6.5.4 / 3

Conversions that involve pointers, other than where permitted by the constraints of 6.5.16.1, shall be specified by means of an explicit cast.
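
To see what a given implementation actually does, a small sketch like the following can help. The function names are made up for illustration. Only the `void *` versus `char *` equality is backed by the quoted paragraph; the other sizes are whatever the platform chooses, and on common desktop systems they will most likely all be equal.

```c
#include <stdio.h>

/* Returns 1 if void * and char * have the same size, as 6.2.5 requires. */
int void_and_char_pointers_match(void)
{
    return sizeof(void *) == sizeof(char *);
}

/* Prints the sizes of a few pointer types on the current implementation. */
void print_pointer_sizes(void)
{
    printf("sizeof(void *)   = %zu\n", sizeof(void *));
    printf("sizeof(char *)   = %zu\n", sizeof(char *));
    printf("sizeof(int *)    = %zu\n", sizeof(int *));
    printf("sizeof(double *) = %zu\n", sizeof(double *));
}
```

Equal numbers on one machine do not make the implicit conversion valid; they only show that this particular implementation uses one representation for all of these pointer types.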

Philip
  • ALL pointers ALWAYS have the SAME size (4 bytes in 32-bit system, 8 bytes in 64-bit system). The pointed type may differ in size, but pointers don't. – cruelcore1 Jan 05 '15 at 18:32
  • 8
    @cruelcore1: Not true. According to the C standard, pointers to different types may differ in representation (read: size) and alignment. – Philip Jan 05 '15 at 18:38
  • @Philip: IMHO, the Standard should define a standard macro to recognize categories of implementations that support pointer semantic features that would be practical and useful on many, but not all, implementations. Not all platforms can support a "pointer to any pointer" type, but on platforms that support such a type (often by treating `void**` in such fashion), code using it can sometimes be cleaner and more efficient than would otherwise be possible. – supercat Jul 12 '18 at 21:03
6

The compiler implicitly converts only between the generic pointer type void * and other object pointer types, in either direction. For any other pair of pointer types, it expects a cast before the assignment; the cast makes the pointer types compatible and tells the compiler that the assignment is done knowingly.
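
A minimal sketch of the implicit `void *` conversions, valid as C (C++ would require a cast when converting back from `void *`). The helper name `roundtrip_through_void` is hypothetical.

```c
/* Hypothetical helper: reads value back through a void * round trip. */
int roundtrip_through_void(int value)
{
    int *ip = &value;
    void *generic = ip;      /* int * -> void *: implicit, no cast needed */
    int *back = generic;     /* void * -> int *: implicit, no cast needed */

    /* char *cp = ip;           int * -> char *: needs (char *) to avoid the diagnostic */
    return *back;
}
```

This is also why `i = malloc(4);` in the question compiles cleanly: `malloc` returns `void *`.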

haccks
  • Sir, I asked a [question](http://stackoverflow.com/questions/34826036/confused-about-pointer-dereferencing/41349954) months back. I was new to the site then, so I did not edit it well, and on this site beginners are not encouraged to ask questions; I got many downvotes and was banned from asking questions for a long time, maybe 6 months. I edited the question months ago, but it did not receive much attention. If you would just look at it and point out any problem with the question and the answer below, it would be of great help – Suraj Jain Dec 29 '16 at 11:36
  • @SurajJain; Hi. I have visited the question. That your question was heavily downvoted doesn't mean the community does not encourage beginners to ask questions on SO. Downvotes depend on the quality of the question asked. I read the answers there, and John Bode's answer explains the question in detail. I would suggest reading his answer, and if you find any difficulties, you can ask him about them in the comment section. – haccks Dec 29 '16 at 11:48
  • I then edited my question, but no visitors came. This seriously discourages a person from asking more questions, and what I asked was a genuine doubt many beginners would have. In John Bode's answer he says something about printf; could you confirm whether it is right or wrong? Is using %d to print a char defined or undefined behaviour? Also, below John Bode's answer there is my answer, which I wrote long back; can you check if it contains any errors? I do not want to mislead anyone. – Suraj Jain Dec 29 '16 at 11:52
  • @SurajJain; John's answer is correct. You can't use `%d` to print any data type other than `int`. His answer is correct. – haccks Dec 29 '16 at 11:57
  • But chars are promoted to int by the default argument promotions, so isn't %d defined? I have asked others too; ouah says it is correct according to him, but I would have to post another question to ask this, which I am afraid to do. – Suraj Jain Dec 29 '16 at 11:59
  • See This http://stackoverflow.com/questions/35496014/type-casting-char-pointer-to-integer-pointer/35496066#35496066 – Suraj Jain Dec 29 '16 at 12:00
3

In more complex programs, there may be problems if you don't do that. It doesn't cause problems in this case, but whenever the compiler has to access the pointed-to memory according to the pointer's type, a pointer of the wrong type makes it access that memory wrongly. Take note that char is 1 byte long, while int is usually 4 bytes long and may be 2 or 8 on some platforms.
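
One place where the pointed-to type visibly matters is pointer arithmetic. The sketch below uses made-up function names and measures how far `p + 1` moves for an `int *` and for a `char *`.

```c
#include <stddef.h>

/* Distance in bytes between p and p + 1 when p is an int *. */
size_t int_pointer_step(void)
{
    int a[2] = {0, 0};
    return (size_t)((char *)(a + 1) - (char *)a);
}

/* Distance in bytes between p and p + 1 when p is a char *. */
size_t char_pointer_step(void)
{
    char a[2] = {0, 0};
    return (size_t)((a + 1) - a);
}
```

The first function returns `sizeof(int)` and the second returns 1, so the same address stepped or dereferenced through the two pointer types touches different amounts of memory.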

cruelcore1
3

Unfortunately the best answer is in the comments to the question!

See @PieterWitvoet's comment:

In C, the type system is a compile-time 'tool' that prevents you from performing invalid operations. Things of different types have different operations available to them, and the compiler is making sure you're only using valid operations. But some operations meant for one type may work just fine for another type, so when you want to use those you can perform a type-cast, which basically tells the compiler: "I know what I'm doing, just pretend that this thing here is of the type that's valid for this operation".

In case you still do not get the point: C is a statically typed language. This means that the programmer must declare the type of each identifier before using it. Thus, one purpose of a C compiler is to ensure that every identifier has been given a type. Another is to ensure that the values assigned are of the same type as the identifier they are assigned to.

Internally, pointers often all look alike; for example, they are 32 bits wide on typical 32-bit systems. At the language level, however, each pointer points to a certain type of data, so it makes sense to ensure that a pointer is only assigned the address of data of the expected type. I vaguely remember that such a check didn't originally exist and was added in some later release of the C standard. However, I could be wrong.

Without such a check, one could end up storing the address of the wrong data, and this would surface as a run-time error. For example, assigning the address of a struct to a pointer to int, char, or any other type, and then dereferencing that pointer under the wrong assumption about what it points to, could lead to unexpected problems at run time. It is not that such assignments are invalid as such, which is why the compiler only warns you; if you cast the value, the compiler stops worrying about it because it assumes that you know what you are doing.

RcnRcf