
I was reading about pointers when it occurred to me that if a pointer is nothing but a variable that stores the memory address of another variable, then any integer should work as a pointer. So I wrote a small program; it gave a warning, but it somehow worked.

int main()
{
    int i,j;
    i=3;
    j=&i;
    printf("%d\n%d\n%d",i,j,&i);
    return 0;
}

Output was

3
1606416600
1606416600

So, why to put an extra * if normal int does the work?

Another question is about the output to following program

int main()
{
    int a[] = {1,2,3,4,5,6,7};
    int *i,*j;
    i=&a[1];
    j=&a[5];
    printf("%d\n%d\n%d",j,i,j-i);
    return 0;
}

Output :

1606416580
1606416564
4

Why is j-i = 4 and not 16?

Kartik Sharma
  • You might want to learn about [BCPL](http://en.wikipedia.org/wiki/BCPL) which was a heavy influence on C. It permits you to use an int (or "machine word" as it calls them) in just the way you describe. C without type safety! – luser droog Sep 22 '13 at 10:04

7 Answers

16

Why do we need to put * for pointer

Because the language specification says so.

So, why to put an extra * if normal int does the work?

Because "normal" int does not do the work. Nor does an "abnormal" int.

Pointers are a separate type. No wonder the human brain can easily imagine them as indices into a huuuuuge array of bytes called "the memory", but that's not necessarily what computers and compilers do. The C standard says that conversion between pointers and int is an implementation-defined operation.

You can store a pointer without loss of data if you use the types intptr_t or uintptr_t from <stdint.h>, though. Neither of those is guaranteed to be an int (or an unsigned int, for that matter), and the standard makes both of them optional.


As to your second question: because that's how pointer arithmetic is defined. And it's defined like so because that's how it's logical and intuitive. If p2 = p1 + 4, then p2 - p1 is 4 and not 16.

See this question for more information about pointer arithmetic.


Oh, and technically, your first program has undefined behavior because printing pointers is done using the %p conversion specifier, but you used %d, which is for int. The printf call would be correct like this:

printf("%d\n%d\n%p", i, j, (void *)&i);

(also notice the cast to void * -- this is one of the few cases where a cast to void * is required, else you have UB again.)

Community
7

It comes down to type safety, i.e. preventing one kind of thing from being used where a different kind of thing is expected.

See http://en.wikipedia.org/wiki/Type_safety

Ed Heal
3

(Adding to the already good answers of @H2CO3 and @EdHeal.)

At Assembly level you could treat an address as an int and do any sort of dirty tricks with it, but C is a much higher level language than Assembly. What does "high level" mean in the context of programming languages? It is short for "high level of abstraction", which means that it is a language that is closer to how humans write and think.

In a sense, it's all about "abstractions". Think of a car, for example. You don't need to know all the gory engineering details just to drive it safely. You view a car as a "much higher level abstraction" compared to what a mechanical engineer has to. Why is this useful? Because your brain has the freedom to concentrate on driving you home without being involved in a car accident, instead of being forced to think of, say, how many revolutions per minute every cogwheel in the engine has to do.

The metaphor is valid also for programming languages: abstractions are useful because they spare you the effort of thinking of every tiny detail of the underlying implementation. A pointer is an abstraction (although not a very high level one, compared to what you find in more modern languages): it is the archetypal model of an indirect reference to something. Under the hood it may be implemented as an address, as a handle or as a whole different thing, but its semantics is described (and mandated) by the standard. Thus you are saved from many problems that are the nightmare of Assembly programmers, especially when switching platform or architecture: pointers also help you make portable programs.

1

Pointers are not always simply integers. They are integers on the vast majority of current implementations, but they can be more complex. An example is the implementations that were done for 8086 processors. A simple 16-bit integer pointer would have been limited to accessing a 64k address space. To cope with this, C compilers implemented different memory models. The tiny memory model used simple integers for pointers and was limited to a maximum of 64k for the program code, data and stack combined. The small memory model also used simple integers, but put the code into one segment and the data and stack into another, which allowed for 128k programs. Other memory models used pointers that consisted of a segment:offset pair of integers, which allowed for larger program sizes. The bottom line is that a pointer abstracts the concept of a memory location from its implementation.

Mike Stotts
0

Pointers are indeed commonly implemented as memory addresses, and they can in that case be thought of as integers. As you experienced, it is even possible to convert between the two, although you must be careful that the integer type is at least as big as a memory address (the pointer size).

The reason that a * is used has to do with type safety. Something of type int* is ‘the address of an integer’, whereas something of type float* is ‘the address of a float’. If you were to treat those in the same way, you would lose information about the type of the value at the address.

As for your second question, this is called pointer arithmetic. Address differences are reported in multiples of the element size, and not in actual bytes. Because sizeof(int) is 4 in your case, and there is a difference of 16 bytes between the addresses, the result of the operation is 16/4 = 4. The result is the difference in number of elements, which is 5 - 1 = 4.

Edit: although H2CO3’s answer is technically correct, I think this explanation is more intuitive.

Ruud
  • 2
    "Pointers are indeed just memory addresses...", this is an oversimplification, often causing people thinking pointers and ints are interchangeable. The standard (C99 draft N1256) says: *A pointer type may be derived from a function type, an object type, or an incomplete type, called the referenced type. **A pointer type describes an object whose value provides a reference to an entity of the referenced type**. A pointer type derived from the referenced type T is sometimes called "pointer to T". The construction of a pointer type from a referenced type is called "pointer type derivation".* – LorenzoDonati4Ukraine-OnStrike Sep 22 '13 at 10:02
  • “an object whose value provides a reference to an entity of the referenced type” — In other words, an address. For all practical purposes, in all common implementations, these will be memory addresses that can be thought of as integers. – Ruud Sep 22 '13 at 10:06
  • 1
    A reference is not an address, in general. "Reference" is an abstract term, a memory address (either virtual or physical) is a very concrete concept. An address may be used to implement pointers (references), but pointers can be implemented also in different ways (pointers are not always addresses). – LorenzoDonati4Ukraine-OnStrike Sep 22 '13 at 10:11
  • 2
    To be clear, I wasn't arguing your answer is incorrect, but that stating the identity `pointers==addresses` you could mislead the OP, since in this case the question was about "why" you need pointers when you could use integers instead. The fact is, among other reasons, that you cannot *always* use integers. – LorenzoDonati4Ukraine-OnStrike Sep 22 '13 at 10:15
  • I apologise for the terminology here. I meant the more abstract meaning of address, not necessarily a memory address. (Reference can be ambiguous as well in the context of C++.) For example, a postal address is generally not an integer. – Ruud Sep 22 '13 at 10:23
  • 2
    @RuudvA: It is not sufficient that something be true in all common applications for it to be true for practical purposes or good enough to teach people. The uncommon uses are still sufficiently frequent that they are significant. And it is bad for teaching because it gives people wrong ideas and leads to incorrect uses of pointers in code. Even if the underlying machine has a flat address space, the optimizer works in the C model, so it may make transformations valid in that model (pointers to different arrays have no relation to each other) that are not valid in the machine address space. – Eric Postpischil Sep 22 '13 at 10:24
  • 1
    I dug out this [SO thread](http://stackoverflow.com/questions/15151377/what-exactly-is-a-c-pointer-if-not-a-memory-address) that clarifies my point. – LorenzoDonati4Ukraine-OnStrike Sep 22 '13 at 10:24
  • @RuudvA: Your statement that you meant “address” abstractly conflicts with the first statement of this answer, which says addresses may be thought of as integers. – Eric Postpischil Sep 22 '13 at 10:26
  • I rephrased the answer to clarify that it is only in the common case that pointers are represented by memory addresses which can (for common address spaces) be thought of as integers. – Ruud Sep 22 '13 at 10:33
  • 1
    @RuudvA: That is false. In the most common case of C implementations, one is not programming for the machine. One is programming for the C computation model. Even if addresses are integers in the targeted machine, they are not in the C computation model. There are expressions that work if addresses are integers that cannot be relied on in C, because they have undefined behavior and the optimizer may transform them as it wishes. When you are programming in C, addresses are not integers. – Eric Postpischil Sep 22 '13 at 11:30
0

Pointers and integers have different types because they are two different things, even though pointers are implemented as integers on many architectures. But consider, for example, the x86_64 architecture: there are implementations where integers are 64 bits wide and pointers are 32 bits wide.

Étienne
0

Apart from the address representation and type "safety" issue, the specific pointer type (as opposed to a single general pointer type) is required for pointer arithmetic and assignment. (Those are not used in your example.)

Pointer arithmetic:

int intArr[2] = {1, 2};
int* pInt0 = &intArr[0];     // points to intArr[0]
int* pInt1 = pInt0 + 1;      // points to intArr[1]

char* pChar0 = (char*)pInt0; // points to the first  byte of intArr[0] (explicit cast required)
char* pChar1 = pChar0 + 1;  // points to the second byte of intArr[0]

(see 6.3.2.3/7)

Assignment through a pointer:

int obj = 42;
unsigned char buf[sizeof(obj)];
for(unsigned i = 0; i < sizeof(obj); ++i) {  // like memcpy
    unsigned char* source = i + (unsigned char*)&obj;
    unsigned char* dest = i + buf;
    *dest = *source;    // copies one byte
}

int obj2 = 0;
int* pObj2 = &obj2;

*pObj2 = obj;           // copies sizeof(int) bytes

(see 6.2.6.1/4)

dyp
  • Essentially, `sizeof(*pInt0)` and `sizeof(*source)` also demonstrate this issue. – dyp Sep 22 '13 at 12:17