16

I think I understand the semantics of pointer arithmetic fairly well, but I only ever see examples when dealing with arrays. Does it have any other uses that can't be achieved by less opaque means? I'm sure you could find a way with clever casting to use it to access members of a struct, but I'm not sure why you'd bother. I'm mostly interested in C, but I'll tag with C++ because the answer probably applies there too.

Edit, based on answers received so far: I know pointers can be used in many non-array contexts. I'm specifically wondering about arithmetic on pointers, e.g. incrementing, taking a difference, etc.

Shea Levy
  • This is a good site; see the part on structs. I find that part interesting: http://oopweb.com/CPP/Documents/ObjectsFirst/Volume/pointer_arith.html –  Sep 28 '11 at 00:16
  • See my answer for what's essentially the only well-defined non-array use. (But even it uses arrays as the underlying mechanism.) – R.. GitHub STOP HELPING ICE Sep 28 '11 at 00:30

7 Answers

12

By definition, pointer arithmetic in C is only defined within arrays. However, every object has a representation that can be viewed as an overlaid array of type unsigned char [sizeof object], so it's also valid to perform pointer arithmetic on this representation. For example:

struct foo {
    int a, b, c;
} bar;

/* Equivalent to: bar.c = 1; */
*(int *)((unsigned char *)&bar + offsetof(struct foo, c)) = 1;

Actually char * would work just as well.
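
As a slightly fuller sketch of the same idea, the snippet below wraps the byte-offset trick in two small helpers. The names `foo_int_member` and `foo_fill` are hypothetical and only serve the illustration; the technique is the one above: convert to `unsigned char *`, add an offset obtained from `offsetof`, and convert back to the member's type.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

struct foo {
    int a, b, c;
};

/* Hypothetical helper: returns a pointer to the int member located
   `offset` bytes into *obj. The offset is assumed to come from
   offsetof(struct foo, <some int member>). */
static int *foo_int_member(struct foo *obj, size_t offset)
{
    assert(offset <= sizeof *obj - sizeof(int));
    /* Step through the object's byte representation, then convert back. */
    return (int *)((unsigned char *)obj + offset);
}

/* Hypothetical helper: assigns value, value + 1, value + 2 to a, b, c
   by walking a table of member offsets instead of naming the members. */
static void foo_fill(struct foo *obj, int value)
{
    static const size_t offsets[] = {
        offsetof(struct foo, a),
        offsetof(struct foo, b),
        offsetof(struct foo, c),
    };
    for (size_t i = 0; i < sizeof offsets / sizeof offsets[0]; i++)
        *foo_int_member(obj, offsets[i]) = value + (int)i;
}
```

With `struct foo bar;`, calling `*foo_int_member(&bar, offsetof(struct foo, c)) = 1;` has the same effect as `bar.c = 1;`. A table of offsets like the one in `foo_fill` is one situation where this form can be handier than naming each member directly.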

R.. GitHub STOP HELPING ICE
  • Aaah, offsetof was the operator I was missing for a reliable way to access struct members. So this would be useless for accessing members of an opaque struct type, right? – Shea Levy Sep 28 '11 at 00:33
  • 3
    Well even with an opaque structure type, the library could provide a function that returns the offset of a particular member, and you would then be justified in accessing it with this method using the returned offset. Note that this would allow the layout of the structure to change without breaking existing binaries using the (I'm assuming it to be shared) library. – R.. GitHub STOP HELPING ICE Sep 28 '11 at 00:37
  • Good point, though I'm not sure why you'd do that instead of just a standard library function that returns that particular member when given a structure... Maybe to reduce the number of function calls if you need to access it multiple times? – Shea Levy Sep 28 '11 at 00:46
  • @SheaLevy - Usually you wouldn't use this to access the members of an opaque struct, because it's unsafe (because of breakage with new versions of the library) and breaks your implicit contract with the library, but if you felt that you needed to, this would allow you to do it. (You can even use `offsetof(struct { int a, b, c; void *d; }, d)` if you know the `struct`'s internal layout but don't have the actual definition to "fake" the correct offset, or just an integer for maximum hackiness) – Chris Lutz Sep 28 '11 at 01:56
  • @R..GitHubSTOPHELPINGICE Do you know which part of the standard states that "every object has a representation consisting of an overlaid unsigned char [sizeof object]"? I've only found this quote in the C11 draft N1570 (6.3.2.3 ¶7): "When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object." It only talks about increments and doesn't really guarantee that there's an actual array. – cubuspl42 Oct 11 '20 at 12:47
  • @cubuspl42: 6.2.6, 6.5 ¶7. – R.. GitHub STOP HELPING ICE Oct 11 '20 at 13:45
11

If you follow the language standard to the letter, then pointer arithmetic is only defined when pointing to an array, and not in any other case.

A pointer may point to any element of an array, or one step past the end of the array.
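
A minimal sketch of why the one-past-the-end pointer is useful: it lets a half-open range [first, last) describe any stretch of an array, including an empty one, with a single loop condition. The helper names `sum_range` and `range_length` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>   /* ptrdiff_t */

/* Hypothetical helper: sums the ints in the half-open range [first, last).
   `last` may be the one-past-the-end pointer; it is only compared,
   never dereferenced. */
static long sum_range(const int *first, const int *last)
{
    assert(first <= last);   /* both must point into the same array */
    long total = 0;
    for (const int *p = first; p != last; p++)
        total += *p;
    return total;
}

/* Hypothetical helper: number of elements in [first, last),
   computed as a pointer difference. */
static ptrdiff_t range_length(const int *first, const int *last)
{
    return last - first;
}
```

Given `int arr[4]`, `sum_range(arr, arr + 4)` covers the whole array and `sum_range(arr, arr)` is an empty range that needs no special case. Because a single object counts as an array of length one for this purpose, `sum_range(&x, &x + 1)` is also valid for a plain `int x`.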

Greg Hewgill
  • 1
    What's the purpose of the "one step past the end of the array" scenario? – Shea Levy Sep 28 '11 at 00:10
  • 1
    Also, do you by any chance know which section of which C standard says this? – Shea Levy Sep 28 '11 at 00:13
  • 5
    It's so that you can have a useful terminating condition when iterating through an array. If this were not possible, you'd have to have special-case code to handle the case of an empty array everywhere (since it wouldn't be valid to point *anywhere*). Allowing "one step past the end" lets you point to the end of an empty array. – Greg Hewgill Sep 28 '11 at 00:13
  • 3
    Section 6.5.6, paragraph 8 of the [C99 standard](http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf): "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined." – Greg Hewgill Sep 28 '11 at 00:17
  • 1
    @SheaLevy - To allow `for(ptr = arr; ptr != &arr[sizeof arr / sizeof arr[0]]; ptr++)` to be legal. It's legal to _create a pointer to_ one past the end, but not to dereference it (the standard specifies that `&arr[0]` and `&*ptr` are not dereferences). – Chris Lutz Sep 28 '11 at 00:18
  • @GregHewgill: Is there a way to test if a given pointer points one past the end of an array? – Shea Levy Sep 28 '11 at 00:43
  • The only way to test is if you already know both the array starting location, and the length of the array, beforehand. – Greg Hewgill Sep 28 '11 at 01:15
  • "the standard specifies that &arr[0] and &*ptr are not dereferences" - this is tagged both C and C++. I may be wrong, but I vaguely recall that this is specified in C99 but not in C++, or at least not explicitly in C++. – Steve Jessop Sep 28 '11 at 01:50
9

Off the top of my head, I know it's used in XOR linked lists (very nifty), and I've seen it used in very hacky recursions.

On the other hand, it's very hard to find uses, since according to the standard pointer arithmetic is only defined within the bounds of an array.
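
For readers who haven't met the XOR linked list, here is a rough sketch under a few assumptions: the type `struct xnode` and the helpers `xaddr`, `xlist_link` and `xlist_walk` are invented for this example, and the code relies on `uintptr_t`, which the standard makes optional. Strictly speaking the XOR is done on integers converted from pointers rather than on the pointers themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>   /* uintptr_t: optional in the standard, widely available */

struct xnode {
    int value;
    uintptr_t link;   /* (address of previous node) XOR (address of next node) */
};

/* Hypothetical helper: a node's address as an integer. */
static uintptr_t xaddr(struct xnode *n)
{
    return (uintptr_t)(void *)n;
}

/* Hypothetical helper: links n nodes in array order. The array is only a
   convenient place to keep them; the nodes could live anywhere. */
static void xlist_link(struct xnode *nodes, size_t n)
{
    assert(nodes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        struct xnode *prev = (i > 0) ? &nodes[i - 1] : NULL;
        struct xnode *next = (i + 1 < n) ? &nodes[i + 1] : NULL;
        nodes[i].link = xaddr(prev) ^ xaddr(next);
    }
}

/* Hypothetical helper: walks the list from either end, storing up to
   `cap` values in `out`. Returns the number of values stored. */
static size_t xlist_walk(struct xnode *start, int *out, size_t cap)
{
    struct xnode *prev = NULL;
    struct xnode *cur = start;
    size_t count = 0;
    while (cur != NULL && count < cap) {
        out[count++] = cur->value;
        /* XOR-ing the link with the node we came from leaves the
           address of the neighbour on the other side. */
        struct xnode *next = (struct xnode *)(void *)(cur->link ^ xaddr(prev));
        prev = cur;
        cur = next;
    }
    return count;
}
```

The same `xlist_walk` traverses the list forwards when started at the first node and backwards when started at the last one, which is the property that makes XOR attractive here.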

orlp
  • 1
    That answer is one of the greatest hacks I've ever seen. – Chris Lutz Sep 28 '11 at 00:16
  • Yeah xored linked lists are a genius trick. Though I'm not sure if this is "pointer arithmetic" by definition - after all you need a normal integer to do that stuff. If we do include that stuff I can think of one large group: Use the lower bits of a pointer as a tag group. Assume that the pointer is always 2^n aligned and you can use n bits for personal storage - the hotspot JVM uses that extensively for example. – Voo Sep 28 '11 at 01:39
  • Wow, this is indeed one of the greatest hacks I've ever seen. I can imagine using it in a `malloc` implementation to reduce the minimum size of chunks in the freelists to two pointer-size fields (size and XOR pointer). By the way, I'm not sure why XOR is so special here; addition or subtraction would work just as well. – R.. GitHub STOP HELPING ICE Sep 28 '11 at 02:15
  • @R..: Using XOR, you can reuse the same code for traversing the list in either direction. With subtraction you need different code for going left/right. – hammar Sep 28 '11 at 02:58
5

a[n] is "just" syntactic sugar for *(a + n). For lulz, try the following

int a[2];
0[a] = 10;
1[a] = 20;

So one could argue that indexing and pointer arithmetic are merely interchangeable syntax.
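
To make the equivalence concrete, here is a small sketch with made-up helper names. Each of the first three functions fetches the same element using a different spelling, and the last one shows the reverse direction: subtracting two pointers into the same array recovers an index.

```c
#include <assert.h>
#include <stddef.h>   /* size_t, ptrdiff_t */

/* Hypothetical helpers: three spellings of the same element access. */
static int nth_by_index(const int *a, size_t n)   { return a[n]; }
static int nth_by_pointer(const int *a, size_t n) { return *(a + n); }
static int nth_reversed(const int *a, size_t n)   { return n[a]; }

/* Hypothetical helper: recovers an element's index from its address.
   Both pointers are assumed to point into the same array. */
static ptrdiff_t index_of(const int *base, const int *elem)
{
    assert(elem >= base);
    return elem - base;
}
```

All three accessors return the same value for the same arguments, and `index_of(a, &a[1])` yields 1.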

mpartel
  • Sure, all array indexing is pointer arithmetic. My question is, is all (useful) pointer arithmetic array indexing? – Shea Levy Sep 28 '11 at 00:09
  • 3
    -1. This is completely irrelevant; the question is about uses of pointer arithmetic OTHER than with arrays. – orlp Sep 28 '11 at 00:12
  • IMHO this argues that from a certain point of view, there is no other pointer arithmetic since pointer arithmetic is essentially the same as treating something as an array and indexing it. – mpartel Sep 28 '11 at 00:20
  • @mpartel: Well, you're wrong (see my answer), and it's the other way around: indexing an array is essentially the same as pointer arithmetic. – orlp Sep 28 '11 at 06:50
  • @nightcracker: doesn't "essentially the same as" go both ways? :) – mpartel Sep 28 '11 at 08:47
  • 1
    `a[n]` is syntactic sugar for `*(a + n)`, not `a + n`. – Karu Feb 15 '14 at 23:08
1

Pointer arithmetic is only defined on arrays. Adding an integer to a pointer that does not point to an array element produces undefined behavior.

markgz
  • 2
    Technically, pointer arithmetic is not defined in terms of arrays, but in terms of consecutive locations. Adding 3 to a pointer will advance the pointer by 3 locations, which are not necessarily associated with an array. – Thomas Matthews Sep 28 '11 at 00:15
  • 1
    Per the standard they **are** associated with an array. Note that, as each object is an array of length one of objects of its own type, it's always valid to add one to the address of an object and obtain the pointer "one element past the end of the array". This is important for example if you pass `&c, 1` (where `c` has type `char`) to a function expecting `char *, size_t`, and the function uses the "one past" rule as part of its loop logic. – R.. GitHub STOP HELPING ICE Sep 28 '11 at 00:39
1

In embedded systems, pointers are used to represent addresses or locations. There may not be an array defined. (Although one could say that all of memory is one huge array.)

For example, a stack (holding variables and addresses) is manipulated by adding values to or subtracting values from the stack pointer. (In this case, the stack could be said to be an array-based stack.)
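
As a toy model of that idea, the sketch below keeps a stack in an ordinary array and moves a stack pointer through it. Everything here is an assumption made for illustration: the names (`word_stack`, `stack_push` and so on), the capacity, and the choice to grow upward, whereas many hardware stacks grow toward lower addresses.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 8   /* hypothetical capacity */

struct word_stack {
    int storage[STACK_WORDS];
    int *sp;   /* stack pointer: next free slot; this model grows upward */
};

static void stack_init(struct word_stack *s)
{
    s->sp = s->storage;
}

/* Depth is the distance between the stack pointer and the base. */
static size_t stack_depth(const struct word_stack *s)
{
    return (size_t)(s->sp - s->storage);
}

/* Returns 1 on success, 0 if the stack is full. */
static int stack_push(struct word_stack *s, int value)
{
    if (s->sp == s->storage + STACK_WORDS)   /* one past the end: full */
        return 0;
    *s->sp++ = value;   /* store, then advance the stack pointer */
    return 1;
}

static int stack_pop(struct word_stack *s)
{
    assert(s->sp != s->storage);   /* popping an empty stack is a bug */
    return *--s->sp;               /* step back, then load */
}
```

Pushing and popping are nothing more than incrementing or decrementing `sp`, and the "full" check compares against the one-past-the-end pointer of the backing array.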

Thomas Matthews
1

Here's a case for pointer arithmetic outside of (strictly defined) arrays:

double d = 0.5;
unsigned char *bytes = (void *)&d;
for(size_t i = 0; i < sizeof d; i++)
    printf("Byte %zu of d is %hhu\n", i, bytes[i]);

Why would you do this? I don't know. But if you want to look at the byte-wise representation of an object (useful for things like memcpy and memcmp), you'll need to cast its address to unsigned char * (or signed char * if you like) and work with it byte by byte. (If the task allows it, you can even write the code to work word by word, as most memcpy implementations do. It's the same principle with char replaced by int32_t, though in portable user code that kind of access runs into alignment and strict-aliasing rules that character types are exempt from.)

Note that, per the standard, the exact values printed (and how many of them) are implementation-defined, but this will always work as a way to access an object's internal byte-wise representation. (It is not required to work for larger integer types, but almost always will; no processor I know of has had trap representations for integers in quite some time.)
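
To show where this leads, here is a simplified sketch of memcpy-like and memcmp-like routines written purely with `unsigned char` pointers. The names `copy_bytes` and `same_bytes` are hypothetical, and real library implementations are considerably more elaborate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies n bytes from src to dst, like a simplified
   memcpy. The two regions are assumed not to overlap. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    assert(n == 0 || (d != NULL && s != NULL));
    while (n-- > 0)
        *d++ = *s++;   /* advance both pointers one byte at a time */
    return dst;
}

/* Hypothetical helper: returns 1 when the two object representations
   match byte for byte, 0 otherwise. */
static int same_bytes(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a;
    const unsigned char *q = b;
    for (size_t i = 0; i < n; i++)
        if (p[i] != q[i])
            return 0;
    return 1;
}
```

For example, with `double d = 0.5, e;`, calling `copy_bytes(&e, &d, sizeof d)` leaves `e` holding the same value as `d`, whatever the implementation's representation of 0.5 happens to be.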

Chris Lutz
  • If you mean this is undefined because it violates strict aliasing - it doesn't, since char pointers are allowed. – Voo Sep 28 '11 at 01:41
  • @Voo - I wasn't totally sure what the rules were, and I didn't feel like checking the standard. I assumed it would be fairly defined as to what happened at the machine level (a `char` can't have a trap representation, so all the individual _bytes_ ought to be processed correctly), but that the value of the individual bytes would be undefined (or at least implementation defined). – Chris Lutz Sep 28 '11 at 01:45
  • Behavior is defined (to print the bytes of `d`, whatever they may be), but `sizeof d` and the storage representation of `0.5` are both implementation-defined. I don't think anything is UB or unspecified. `unsigned char*` is a legal alias for every type for exactly this reason. – Steve Jessop Sep 28 '11 at 01:47