7

While working with arrays in any language, I have always wondered why does the base address or index number of an array start with zero.

int x[5]={21,34,55,314,45};

now if I want to access any the first value of an array I would have to use x[0], but why 0 is there any logic behind this?

Yu Hao
  • 119,891
  • 44
  • 235
  • 294
Amrit
  • 135
  • 1
  • 5
  • This article explains it fairly well: https://skillcrush.com/2013/01/17/why-programmers-start-counting-at-zero/ – Edwin Finch Jan 08 '17 at 06:30
  • http://softwareengineering.stackexchange.com/questions/70612/why-are-structs-arrays-zero-based – Amir Afghani Jan 08 '17 at 06:32
  • 3
    Not **any** language, e.g, Lua arrays start with index `1`. – Yu Hao Jan 08 '17 at 06:34
  • Possible duplicate of [Why does the indexing start with zero in 'C'?](http://stackoverflow.com/questions/7320686/why-does-the-indexing-start-with-zero-in-c) – ares Jan 08 '17 at 06:36
  • 1
    "...my perfectly reasonable compromise of .5-indexed arrays has so far not taken hold..." – Jon Kiparsky Jan 08 '17 at 06:37
  • See http://stackoverflow.com/questions/1499749/list-of-1-indexed-programming-languages, and/or https://en.wikipedia.org/wiki/Comparison_of_programming_languages_%28array%29#Array_system_cross-reference_list. –  Jan 08 '17 at 07:32

5 Answers5

4

In C, the name of an array is essentially a pointer, a reference to a memory location, and so the expression array[n] refers to a memory location n-elements away from the starting element. This means that the index is used as an offset. The first element of the array is exactly contained in the memory location that array refers (0 elements away), so it should be denoted as array[0]. Most programming languages have been designed this way, so indexing from 0 is pretty much inherent to the language as most of the languages (not all) follow C standards. You can refer this link for more details.

DnA
  • 727
  • 3
  • 10
  • 28
1

In general, the logic of working with zero-indexed arrays is considered to be simpler, once you think of the indices as offsets from the origin rather than as "places in a list": you start at the start of the list, and the first item is where you are when you start, ie, when you've gone zero steps.

In a sense, though, it's fundamentally arbitrary and probably convention is the best answer. If we consider that most modern languages were designed by people who learned C very early on in their careers, or more recently who based their designs on those C-inspired languages, the choice of zero-indexed arrays would have been a pretty big change which would have required a lot of justification, which nobody seems to have found, or possibly even looked for. However, if there were ever a real reason to use 1-indexed arrays, there wouldn't be any real reason not to use them.

Of course, as we get into more modern language design the idea of caring about the index of an array element is receding from relevance. For example, python programmers loop over an index like this:

for element in lst:
  do_stuff_to(element)

and therefore don't have a reason to care about the index. Ruby has even hipper ways of dealing with this problem, which I won't illustrate here but which you should look into.

Jon Kiparsky
  • 7,499
  • 2
  • 23
  • 38
1
  1. When working with sub-sequences of natural numbers, the difference between the upper bound and the lower bound should be the length of the sub-sequence. The indices of an array can be thought of as a special kind of such a sub-sequence.

  2. The lower bound should be inclusive, the upper bound should be exclusive. In other words, the lower bound should be the first index of the array. Otherwise, we risk having to have a lower bound in the unnatural numbers for some sub-sequences.

  3. If we want to maintain conditions (1) and (2), then we effectively have two choices for upper and lower bounds: 1 <= i < N+1 or 0 <= i < N. Clearly, putting N+1 in the range is ugly, so we should prefer indexing starting from 0.

Dreamer
  • 112
  • 1
  • 14
1

Just to build upon the other answers, the index is an offset. When you create an array, the program lays out enough consecutive memory locations to "fit" an array of that size. In C and related languages, the variable is storing a pointer to the first item.

Assume that each location is 32 bits (which may or may not be true) and a first address of, say, 200. (Yes, I do realize that it's probably terrible to do this in decimal, but bear with me). The first item is at address (200 + (32 * 0)) = 200. The second item is at address 200 + (32 * 1) = 232. The third item is at address 200 + (32 * 2) = 264. Etc.

The main reason you can access arbitrary items in the array in constant time is that you can just do pointer arithmetic to find any arbitrary item in the array, which can be done in constant time.

1

Computing often - but not always - involves modular arithmetic since we are doing operations to data items held in contiguous blocks of memory that is itself laid out physically in arrays of set width and/or height or in sectors or cylinders. After reaching the memory element at the extreme column we want our incrementor to return to the first element on the next row.

Yes, we can do this via:

Increment

i :=  (i % n)  + 1

Decrement

i :=  (n +  i - 2) % n  + 1

But it is clearer and more efficient to do it with indices starting at zero:

Increment

i :=   (i + 1) % n  

Decrement

i :=   (n +  i - 1) % n 
Trunk
  • 742
  • 9
  • 24