
Is there any advantage to using pointer notation over array notation? I realize that there may be some special cases where pointer notation is better, but it seems to me that array notation is clearer. My professor told us that he prefers pointer notation "because it's C", but it's not something he will be marking. And I know that there are differences with declaring strings as character arrays vs declaring a pointer as a string - I'm just talking about in general looping through an array.

  • Possible duplicate of [Char array vs Char Pointer in C](http://stackoverflow.com/questions/10186765/char-array-vs-char-pointer-in-c) – DYZ Jan 28 '17 at 23:17
  • Simply take some time and read about pointers. At first, you won't see any difference, just more confusing stuff, but later on you will be amazed at how many things they can do. Simplest example that I can think of: you can reallocate the memory (and thus change the size) of what a pointer to dynamically allocated memory points to. On the other hand, once you create a simple array, its size cannot be modified – Fureeish Jan 28 '17 at 23:20
  • @DYZ that's about char arrays vs. char pointers. I know there is a difference for that, I just want to know if there's any practical reason for using pointer notation for things like looping through an int array. – Troy Nechanicky Jan 28 '17 at 23:21
  • Don't worry about it. Use whichever you think makes your code cleaner. The notations are pretty much equivalent. With the pointer notation, you can have fewer variables. It can compile to code that's very slightly slower or faster than what you'd get with the index notation. This varies between processors, but the difference is usually very, very small. – Petr Skocik Jan 28 '17 at 23:24
  • @Fureeish In my class we were told to first create arrays using array notation, then use pointers to navigate them. But I think the relevance of your point to me is that for right now array notation is fine (beginner C course). – Troy Nechanicky Jan 28 '17 at 23:25
  • @TroyNechanicky, array notation is a great beginner practice. It teaches you the basics of iteration, storage, and dealing with problems regarding minimal/maximal size. Pointers are just more advanced. They can be used as arrays but they can do a lot more. Just study and research whatever seems unclear. If you just need an array whose size is constant and you only want to iterate through it, there is no need for it to be an allocated pointer – Fureeish Jan 28 '17 at 23:29
  • If you can solve your problem using array notation, and you are more comfortable with this, do so. When your code is working, you might then try to write it using pointer notation. Once you get used to pointers you may find that some problems seem more natural in array notation, while others seem more natural in pointer notation. – ad absurdum Jan 28 '17 at 23:52
  • @Fureeish You cannot change the size, but you can calculate the size. With pointers you have to save it somewhere. – cpatricio Jan 28 '17 at 23:56

2 Answers


If you write a straightforward loop, both array and pointer forms typically compile to the same machine code.
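As a minimal sketch of what "both forms" means here, the two functions below sum an int array, one with index notation and one with pointer notation. The names sum_index and sum_pointer are made up for this illustration; whether a given compiler really emits identical code for them depends on the compiler and optimization level.

```c
#include <stddef.h>

/* Sum n ints using array (index) notation. */
long sum_index(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* The same loop using pointer notation: p walks from a to one past the end. */
long sum_pointer(const int *a, size_t n)
{
    long total = 0;
    for (const int *p = a, *end = a + n; p < end; p++)
        total += *p;
    return total;
}
```

Note that the pointer version needs no separate index variable, at the cost of an explicit end pointer.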

The two forms can differ, especially when the loop exit condition is not a constant, but that only matters if you are trying to optimize the loop for a specific compiler and architecture.

So, how about we consider a real world example that relies on both?

These types implement a double-precision floating-point matrix of dynamically determined size, with separate reference-counted data storage:

struct owner {
    long          refcount;
    size_t        size;
    double        data[];    /* C99 flexible array member */
};

struct matrix {
    long          rows;
    long          cols;
    long          rowstep;
    long          colstep;
    double       *origin;
    struct owner *owner;
};

The idea is that when you need a matrix, you describe it using a local variable of type struct matrix. All data referred to is stored in dynamically allocated struct owner structures, in the C99 flexible array member. When you no longer need the matrix, you must explicitly "drop" it. This allows multiple matrices to refer to the same data: you can even have separate row, column, or diagonal vectors, with any change to one immediately reflected in all the others (because they refer to the same data values).

When a matrix is associated with data, either by creating an empty matrix or by referring to existing data referred to by another matrix, the owner structure refcount is incremented. Whenever a matrix is dropped, the refcount of the owner structure it refers to is decremented. The owner structure is freed when the refcount drops to zero. This means you only need to remember to "drop" each matrix you used, and the data referred to will be correctly managed: released as soon as it is no longer needed, but never too early.
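The create/share/drop life cycle described above could be sketched roughly as follows. The function names matrix_create, matrix_share and matrix_drop, the return-value convention, and the row-major initial layout are all assumptions made for this illustration, not part of the original types; the multiplication overflow check is left out for brevity.

```c
#include <stdlib.h>

struct owner {
    long          refcount;
    size_t        size;
    double        data[];    /* C99 flexible array member */
};

struct matrix {
    long          rows, cols;
    long          rowstep, colstep;
    double       *origin;
    struct owner *owner;
};

/* Hypothetical: create a zero-filled rows x cols matrix.
   Returns 0 on success, -1 on failure. */
int matrix_create(struct matrix *m, long rows, long cols)
{
    if (!m || rows < 1 || cols < 1)
        return -1;

    size_t count = (size_t)rows * (size_t)cols;  /* overflow check omitted */
    struct owner *o = malloc(sizeof *o + count * sizeof o->data[0]);
    if (!o)
        return -1;

    o->refcount = 1;            /* this matrix is the first user */
    o->size = count;
    for (size_t i = 0; i < count; i++)
        o->data[i] = 0.0;

    m->rows = rows;
    m->cols = cols;
    m->rowstep = cols;          /* assumed row-major layout */
    m->colstep = 1;
    m->origin = o->data;
    m->owner = o;
    return 0;
}

/* Hypothetical: make dst refer to the same data as src. */
void matrix_share(struct matrix *dst, const struct matrix *src)
{
    *dst = *src;
    dst->owner->refcount++;
}

/* Hypothetical: drop a matrix; the data is freed with the last user. */
void matrix_drop(struct matrix *m)
{
    if (m->owner && --m->owner->refcount == 0)
        free(m->owner);
    m->origin = NULL;
    m->owner = NULL;
}
```

The single malloc() call covers both the owner header and the data, which is what the flexible array member makes possible.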

This all assumes a single-threaded process; multithreaded handling is quite a bit more complicated.

To access the element at row r, column c of a struct matrix m, assuming 0 <= r < m.rows and 0 <= c < m.cols, you use m.origin[r*m.rowstep + c*m.colstep].

If you want to transpose a matrix, you simply swap m.rows and m.cols, and m.rowstep and m.colstep. All that changes is the order in which the data (stored in the owner structure) is read.

(Note that origin points to the double which appears at row 0, column 0, in the matrix; and that rowstep and colstep can be negative. This allows all kinds of weird "views" to the otherwise dull regular data, like mirrors and diagonals and so on.)
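To make the indexing, the transpose, and a negative step concrete, here is a small sketch. The helpers matrix_at, matrix_transposed and matrix_mirrored_cols are hypothetical names introduced only for this example; struct owner is left incomplete because the views never touch it.

```c
#include <stddef.h>

struct owner;   /* storage details are not needed for views */

struct matrix {
    long          rows, cols;
    long          rowstep, colstep;
    double       *origin;
    struct owner *owner;
};

/* Hypothetical accessor: pointer to the element at row r, column c.
   The caller must ensure 0 <= r < rows and 0 <= c < cols. */
double *matrix_at(const struct matrix *m, long r, long c)
{
    return m->origin + (r * m->rowstep + c * m->colstep);
}

/* Transposed view: swap the dimensions and the steps; no data moves. */
struct matrix matrix_transposed(struct matrix m)
{
    struct matrix t = m;
    t.rows = m.cols;
    t.cols = m.rows;
    t.rowstep = m.colstep;
    t.colstep = m.rowstep;
    return t;
}

/* Left-right mirrored view: origin moves to the last column,
   and the column step becomes negative. */
struct matrix matrix_mirrored_cols(struct matrix m)
{
    struct matrix v = m;
    v.origin = m.origin + (m.cols - 1) * m.colstep;
    v.colstep = -m.colstep;
    return v;
}
```

A real implementation would also have to increment the owner refcount for each new view; that bookkeeping is omitted here to keep the focus on the index arithmetic.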

If we did not have the C99 flexible array member (say, we only had pointers, and no array notation at all), the owner structure data member would have to be a pointer. It would mean an additional indirection at the hardware level (slowing down the data accesses a bit). We would either need to allocate the memory pointed to by data separately, or use tricks to point to an address following the owner structure itself, but suitably aligned for a double.

Multidimensional arrays do have their uses (basically, when the sizes of all dimensions, or all but one dimension, are known), and it's nice for the compiler to take care of the indexing, but that does not mean they are always easier than methods using pointers. For example, in the above matrix structure case, we can always define some helper preprocessor macros, like

#define MATRIXELEM(m, r, c)  ((m).origin[(r)*(m).rowstep + (c)*(m).colstep])

which admittedly has the downside that it evaluates the first parameter, m, three times. (It means that a first argument with side effects, say MATRIXELEM(*p++, 0, 0) with p a pointer to struct matrix, would try to increment p three times.) In this particular case, m is normally a local variable of struct matrix type, which should minimize surprises. One could have e.g.

struct matrix m1, m2;

/* Stuff that initializes m1 and m2, and makes sure they point
   to valid matrix data */

MATRIXELEM(m1, 0, 0) = MATRIXELEM(m2, 0, 0);

The "extra" parentheses in such macros ensure that if you use a calculation, for example i + 4*j as row, the index calculation is correct ((i + 4*j)*m.rowstep and not i + 4*j*m.rowstep). In preprocessor macros, those parentheses are not really "extra" at all. In addition to ensuring the correct calculation, the "extra" parentheses also tell other programmers that the macro writer has been careful to avoid such arithmetic-related bugs. (I for one consider it "good form" to put the parentheses there, even in cases where they are not needed to make the syntax unambiguous, if it conveys that "assurance" to other developers reading the code.)
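The effect of leaving the parentheses out can be shown with a pair of deliberately simplified macros. OFFSET_BAD and OFFSET_GOOD (and the wrapper functions) are hypothetical and exist only for this illustration; they compute just the row part of the index.

```c
/* Hypothetical macros, for illustration only. */
#define OFFSET_BAD(r, step)   (r * step)        /* parameters not parenthesized */
#define OFFSET_GOOD(r, step)  ((r) * (step))    /* each parameter parenthesized */

/* Expands to (i + 4*j * rowstep): only the 4*j term is multiplied by rowstep. */
long offset_bad(long i, long j, long rowstep)
{
    return OFFSET_BAD(i + 4*j, rowstep);
}

/* Expands to ((i + 4*j) * (rowstep)): the whole row expression is scaled. */
long offset_good(long i, long j, long rowstep)
{
    return OFFSET_GOOD(i + 4*j, rowstep);
}
```

With i = 1, j = 2 and rowstep = 10, the first version yields 1 + 4*2*10 = 81, while the second yields (1 + 8)*10 = 90, which is the intended offset.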

And this, after all this text, leads to my most important point: some things are more easily expressed and understood by us human programmers using array notation than pointer notation, and vice versa. "Foo"[1] is pretty obviously equal to 'o', whereas *("Foo"+1) is not nearly as obvious. (Then again, neither is 1["Foo"], but you can blame the C standardization folks for that.)
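All three spellings are the same operation, because C defines a[i] as *(a + i) and the addition commutes. A tiny check, using a made-up helper name subscript_forms_agree, might look like this:

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the three spellings of "element i of s" agree.
   s[i] is defined as *(s + i); since addition commutes, i[s] is also valid. */
bool subscript_forms_agree(const char *s, size_t i)
{
    return s[i] == *(s + i)
        && s[i] == i[s]
        && &s[i] == s + i;
}
```

For any valid index this returns true; the notations differ only in how readable they are.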

Based on the examples above, I consider the two notations complementary. They overlap a lot, especially in simple loops, where it is fine to just pick one. But being able to use both, and to choose between them based not on which one you are more proficient in but on which makes the most sense for readability and maintainability, is an important skill for any C programmer, in my not very humble opinion.

Nominal Animal

Actually, if you pass an array argument to a function in C, you really pass a pointer to its first element. This does not pass an array in the usual sense: first, because passing an array would include passing its actual length; second, because passing an array by value would imply copying it. In other words, you pass an iterator pointing to the beginning of the array (like std::vector::begin() in C++) while pretending to pass the array itself, which is quite confusing. Pointer notation shows what is really happening much more clearly, so it should definitely be preferred.
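A short sketch of this point: the two functions below are declared with array and pointer notation respectively, yet they have the same type, modify the caller's array in place (nothing is copied), and need the length passed separately. The names fill, fill_ptr, fill_fn and fill_variants are hypothetical, chosen only for this example.

```c
#include <stddef.h>

/* Declared with array notation, but the parameter type is adjusted to int *.
   The length must be passed separately. */
void fill(int a[], size_t n, int value)
{
    for (size_t i = 0; i < n; i++)
        a[i] = value;
}

/* The same function written with pointer notation. */
void fill_ptr(int *a, size_t n, int value)
{
    for (int *end = a + n; a < end; a++)
        *a = value;
}

/* Both functions have the same type, so one function pointer type fits both. */
typedef void (*fill_fn)(int *, size_t, int);
fill_fn fill_variants[] = { fill, fill_ptr };
```

Calling either variant on an int array changes the caller's elements, which would not happen if the array were really passed by value.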

Array notation may have some advantages too, but I don't think they outweigh the drawbacks mentioned. First, array notation emphasizes the difference between a pointer to a single value and a pointer to a contiguous block. Second, you may specify the expected size of the passed array for your own reference. However, that size is not actually passed to the function or checked in any way, which is itself very confusing.

AndreyS Scherbakov