-1

Since there's no such thing as an array in the C language, is the following all stored in one memory location, or is each element's value stored in an "array" of memory locations?

int array[] = {11, 13, 17, 19};
  • Scenario 1

    {11, 13, 17, 19} --> location A
    
  • Scenario 2

    {
        11 --> location A
        13 --> location B
        17 --> location C
        19 --> location D
    }
    

Which one is the valid memory layout?

Sourav Ghosh
dbconfession
  • 5
    `Since there's no such thing as an array in the C language`..who told you? – Sourav Ghosh Nov 01 '16 at 16:33
  • 1
    Where did you get the information that there is no array in C? – Ken White Nov 01 '16 at 16:35
  • 2
    What do you mean by "one location"? Obviously the individual `int`s can't all be sharing the same set of bits. The memory locations are contiguous, if that's what you're trying to ask. – Kyle Strand Nov 01 '16 at 16:36
  • lynda.com tutorial: "Up and Running C" Section 5.5: Accessing arrays with pointers (00:10s) – dbconfession Nov 01 '16 at 16:36
  • "*Accessing arrays with pointers*" If one can "access" something, this thing obviously needs to exist. – alk Nov 01 '16 at 16:38
  • @KyleStrand This answers what I assumed. Thank you for clarifying. Answer and I will mark as such – dbconfession Nov 01 '16 at 16:38
  • 3
    The tutorial states "All arrays are simply shorthand for pointers". This is actually a common *misconception* (and it's disheartening to see it stated in the tutorial), but note that the tutorial itself then goes on to refer to "arrays" as though they do in fact exist--which they do. – Kyle Strand Nov 01 '16 at 16:39
  • @alk not true from my understanding. Just as "strings" don't actually exist in C, but are merely an array of chars. – dbconfession Nov 01 '16 at 16:39
  • Thanks, but I think UrielEli and Arun have it covered! – Kyle Strand Nov 01 '16 at 16:39
  • Beware, there are people calling an N-dimensional jagged array just an array, although it's multiple arrays ... – alk Nov 01 '16 at 16:40
  • 1
    A string is not "merely an array of `char`s," but a null-terminated array of `char`s. – ad absurdum Nov 01 '16 at 16:42
  • A C "string" is a concept (which indeed exists): it is a `char` array with at least one element carrying a `'\0'`, which indeed exists. – alk Nov 01 '16 at 16:42
  • Array's are not "first class citizens" of the C type system, since they cannot be passed as arguments to a function. (Off the top of my head, I actually can't think of any way other than `sizeof` to actually do anything with an *array* as such, as opposed to working with pointers.) This may help you understand: http://stackoverflow.com/a/1642423/1858225 (The top-voted answer may also be helpful, since it explains how arrays *decay* into pointers, which is the source of the above-mentioned misconception.) – Kyle Strand Nov 01 '16 at 16:48
  • Augh I can't believe I wrote "array's" as the plural of "array" above. Bleeeeeargh – Kyle Strand Nov 01 '16 at 21:06

5 Answers

7

C explicitly defines "array" as a type.

Quoting C11, chapter §6.2.5, Types (emphasis mine)

An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. The element type shall be complete whenever the array type is specified. Array types are characterized by their element type and by the number of elements in the array. An array type is said to be derived from its element type, and if its element type is T, the array type is sometimes called ‘‘array of T’’. The construction of an array type from an element type is called ‘‘array type derivation’’.

In a nutshell, the answer is, array elements are stored in separate but contiguous locations.

Let's suppose we have declared an array of 5 int:

int arr[5];

Then, on a platform where the size of an integer is 2 bytes (sizeof(int) == 2), the array will have its elements organized like this:

[image: arr laid out as five consecutive 2-byte elements]

On a different platform, where sizeof(int) == 4, it could be:

[image: arr laid out as five consecutive 4-byte elements]

So the representation

{
    11 --> location A
    13 --> location B
    17 --> location C
    19 --> location D
}

is valid, considering B == A + 1, C == B + 1 and so on.

Note that pointer arithmetic takes the data type into account, so A + 1 does not yield an address one byte higher; it advances by one element. In other words, the addresses of two consecutive elements differ by the size of the element type (sizeof (datatype)).

Toby Speight
Sourav Ghosh
  • This answer is a little misleading because it gives the impression that there are no gaps between elements, which is both true and false. Also we need to be careful when we write `C == B + sizeof (datatype)` without specifying the type of `B`. Also it would be useful to quote the definition of `sizeof`. – user3528438 Nov 01 '16 at 16:57
  • 2
    @user3528438 I didn't write this answer, but I don't understand your objections. "there are no gaps between elements, which is both true and false"--what do you even mean by this? It's not correct, because the *standard specifies* the allocation is "contiguous", so indeed there are no gaps, unless by "gaps" you include *padding*. But padding is *part of* each element of the array, not *between* the elements. – Kyle Strand Nov 01 '16 at 17:07
  • 1
    `A`, `B`, and `C` are "locations" according to the question, so obviously the "type" is some kind of pointer. The math using `sizeof` is therefore neither confusing nor incorrect. Also, I'm not sure that a quoted definition of `sizeof` would be relevant here; if someone doesn't know what `sizeof` is, they can just look it up. – Kyle Strand Nov 01 '16 at 17:08
  • @user3528438 1) which gaps? 2) is not it like array stores same type of element? so why the specific type of `B` makes a special difference? – Sourav Ghosh Nov 01 '16 at 17:14
  • 1
    just to reiterate with what @KyleStrand mentioned, `sizeof` is a pretty self-describing name, IMHO. Description of `sizeof` can be found in any textbook/ tutorial and would be a little off-topic here. – Sourav Ghosh Nov 01 '16 at 17:16
  • 1) `sizeof` is not self-describing at all, just take a look at how many people are estimating the range of short/int/long.... using sizeof*CHAR_BIT; 2) Yes by gaps I mean padding, so if the listener considers padding as part of the object, then there are no gaps, otherwise (he/she considers) there are. 3) if you say type of `A` `B` `C` is some kind of pointer then the type of that pointer does affect the result of that expression, otherwise you still need to explicitly specify they are some kind of integer type and how to convert to them from original pointers. – user3528438 Nov 01 '16 at 17:23
  • @user3528438 Well, I don't oppose you, but, 1) in that case, they did not bother to read the available documentation for `sizeof`, at all. That is the problem of the one who's using that, not the naming. :) 2) Padding is supposed to be part of the object (element), isn't it? 3) Yes, in case of different arrays, the type does matter, but for the element of a single array, it should not matter (change). _"Array types are characterized by their element type and by the number of elements in the array"_, right? – Sourav Ghosh Nov 01 '16 at 17:29
  • You are making the same mistake as Uriel Eli in his answer below. 1) `B == A + sizeof (datatype)` is an expression, the result of which depends on the type of `A` and `B`, 2) when `B` and `A` are `int*` and `datatype` is `int` and `sizeof(int)` is 4, you have: `A + sizeof (datatype)` points to `array[4]` but `B` points to `array[1]`; 3) the expression evaluates to `true` if `B` and `A` are both `char*` or an integer type. – user3528438 Nov 01 '16 at 18:16
  • 2
    @user3528438 Ahh, I see your point now, thanks for your efforts. Changing it now. – Sourav Ghosh Nov 01 '16 at 18:18
  • @user3528438 Any better now? – Sourav Ghosh Nov 01 '16 at 18:22
  • I'll settle here but when I saw you mentioned "difference between the address" I'm thinking about `ptrdiff_t`, like `B-A` is 1. But anyway, I've had my share of fun here. Thanks for your detailed answer. – user3528438 Nov 01 '16 at 18:36
3

The elements would be in contiguous memory locations.

Let array[0] be at location B, and let the size of each element of the array, i.e. sizeof(int), be S. Then we have:

array[0] at B
array[1] at B + S
array[2] at B + 2S
..
array[n] at B + n*S
Arun
1

The compiler allocates the array in specific, contiguous locations.

You can check this with the following code:

#include <stdio.h>

int main(void)
{
    int array[] = {11, 13, 17, 19};

    /* %p expects a void * argument, hence the cast */
    for (int i = 0; i < 4; i++)
        printf("%p ", (void *)&array[i]);
    printf("\n");
    return 0;
}

That prints the element addresses in hexadecimal, for example (the exact values and format vary by platform):

0x14fee0 0x14fee4 0x14fee8 0x14feec

with a stride of 4 bytes per element, the size of int on this platform.


In general, you can take a pointer to one element of the array, say at index m, add a number of elements n to it, and get a pointer to the element at index m+n.

*(array + n) == array[n]
Uriel
  • 3
    That printf should be `printf("%p ", (void *)&array[i]);`. – Ian Abbott Nov 01 '16 at 16:40
  • Thanks. Does one memory location hold 1 bit? And if so, wouldn't the array in my OP, hold 32 memory locations (1x8x4)? I understand something is off here. I'd just like clarification before I move on. – dbconfession Nov 01 '16 at 16:44
  • 2
    @dbconfession memory addresses indicate 1 *byte*, which is 8 *bits*. – Uriel Nov 01 '16 at 16:46
  • 1
    @UrielEli: The C-Standard leaves open the number of bits per `char`. It just refers to `CHAR_BIT`, which for most current systems is `8`, yes. – alk Nov 01 '16 at 16:47
  • @UrielEli It's not about the leading zeros, it's about making the printf format specifiers compatible with the corresponding arguments. If you insist on using `%x` (which will truncate the printed address on x86-64 architecture, for example), you should at least cast the corresponding argument to `unsigned int`. – Ian Abbott Nov 01 '16 at 16:49
  • @dbconfession yes – Uriel Nov 01 '16 at 16:52
  • @IanAbbott: Better cast to `uintptr_t` from `<stdint.h>`. – alk Nov 01 '16 at 16:55
  • 1
    @alk: Yes, but you still need to make the printf format specifier match the argument type. (I'm sure you know that, but I don't want to confuse the OP.) – Ian Abbott Nov 01 '16 at 16:58
  • @UrielEli This may be getting off topic, but physically what is an address (1 bit)? Is one `0` or one `1` a representative of one open or closed transistor state? e.g. does `10110010` represent 8 open/closed transistor logic gates? – dbconfession Nov 01 '16 at 17:04
  • 1
    @dbconfession An address can be thought of as a number that indexes a location in memory. That location will contain at least 8 bits (at least from C's point of view). On your typical desktop machine, it will hold exactly 8 bits. On more exotic hardware, it may hold more than 8 bits. – Ian Abbott Nov 01 '16 at 17:15
  • 2
    @dbconfession Per UrielEli's first comment above, an address refers to 8 bits, not 1 bit (some rare systems have other numbers of bits per byte, but it's never 1-bit per byte). In general, yes, `10110010` physically means 8 open/closed logic gates, though there's quite a few levels of abstraction between C-language pointers and physical logic gates, so it's not necessarily perfectly true that a C-pointer refers to an actual physical memory location with 8 bits. – Kyle Strand Nov 01 '16 at 17:16
  • @UrielEli The `0x` in your `0x%p ` printf format string may produce confusing output. On my machine (GNU/Linux) the `%p` produces `0x` by itself, so the printed address starts `0x0x....`. Also, the `%p` printf format specifier expects a matching `void *` argument. You are passing an `int *` argument instead. You generally get away with it, but technically, this results in _undefined behavior_. – Ian Abbott Nov 01 '16 at 17:32
  • @KyleStrand Ok so 1 mem. loc. is a series of 8 open/closed gates (1 byte), correct? – dbconfession Nov 01 '16 at 17:34
  • 1
    @dbconfession 1 mem.loc. is a series of _at least_ 8 open/closed gates (bits). By definition in C, a memory location holds one byte, but the number of bits in a byte is implementation defined to some value >= 8. (On a typical desktop system, there will be exactly 8 bits in a C byte.) This definition of a byte in C differs from the typical (modern) everyday usage of "byte" as a group of 8 bits. (Because of the ambiguity in usage of the term "byte", Internet standards use the unambiguous term _octet_ to refer to a group of 8 bits.) – Ian Abbott Nov 01 '16 at 17:52
  • 1
    @dbconfession Yes, each memory location is a unique byte, but per my comment on a different answer, a pointer of a specific multi-byte type (e.g. `int*`, a pointer to an int) is actually pointing at *multiple* bytes, and adding `1` to the *pointer* will add `sizeof(int)` to the underlying pointer value. – Kyle Strand Nov 01 '16 at 18:38
1

C does have an array type. Just because you can access arrays via pointers doesn't mean they don't exist.

Array elements are stored in contiguous memory locations starting from the address "array" (i.e. the base address of array which is also the address of the first element of the array) and each element of the array is addressable separately.

Assuming 4 byte ints, the array int array[] = {11, 13, 17, 19}; would look like:

+-----+-----+-----+-----+
|  11 |  13 |  17 |  19 |
+-----+-----+-----+-----+
   ^     ^     ^     ^
 0x100 0x104 0x108 0x10C

You can probably understand better with a simple program:

#include <stdio.h>

int main(void)
{
    int array[] = {11, 13, 17, 19};

    /* all three print the same address */
    printf("Base address of array: %p, %p, %p\n",
           (void*)array, (void*)&array[0], (void*)&array);

    /* i is a size_t, so it is printed with %zu */
    for (size_t i = 0; i < sizeof array / sizeof array[0]; i++) {
        printf("address of array[%zu]: %p\n", i, (void*)&array[i]);
    }

    return 0;
}

One important detail is that though the addresses &array[0] and &array are the same value, their types are different. &array[0] is of type int* (pointer to an int) whereas &array is of type int(*)[4] (pointer to an array of 4 ints).

P.P
  • In the output of your code, how many bits does the address `0x7fff50b7ea20` (on my console) hold? – dbconfession Nov 01 '16 at 16:58
  • @dbconfession The representation of the address printed by printf (for `%p`) is *implementation-defined*. That means, there's no standard number of bits/representation. It may vary on different systems. – P.P Nov 01 '16 at 17:02
  • `&array[0]` and `&array` can not compare, hence can not compare equal, so how do you know if they are the same value? And also how do you define "value" of them? – user3528438 Nov 01 '16 at 17:02
  • @user3528438 What do you mean by "can not compare, hence can not compare equal"? "so how do you know if they are the same value?" - Because the C standard defines as such. By *value* I mean the addresses are the same but how they are interpreted by C's type is different (as explained at the bottom of my answer). – P.P Nov 01 '16 at 17:07
  • @dbconfession Addresses point to a specific *byte*, which (on practically all modern platforms) consists of 8 *bits*. But the object that *starts* at that address (in this case an `int`) may be larger than a single byte, so if an `int` is at location `0x7fff50b7ea20`, then assuming a 32-bit `int` type, there are 32 bits being "pointed at" by a pointer with that address value. – Kyle Strand Nov 01 '16 at 17:12
  • @P.P. Still, you are saying they are the same value but not specifying a certain way to compare them. Also, you don't seem to agree with me on the definition of "same value" (my opinion: two objects have same value if they compare equal), but you didn't give your scientific definition of "same value" in your answer/comments. – user3528438 Nov 01 '16 at 17:16
  • @user3528438 You can compare them with: `if ((void*)&array[0] == (void*)&array) puts("Equal");`. With this, I hope, you'd be able to understand my "scientific" definition. – P.P Nov 01 '16 at 17:22
  • @P.P. I was actually hoping you to use `void*` as the medium of comparison, because I know it's UB. `char*` and `(u)intptr_t` is better. – user3528438 Nov 01 '16 at 17:27
  • @P.P. so since mem. locations are physically right next to each other, is it correct to say that to move where a pointer is pointing, I would advance by the desired element indexes and then manipulate how I choose? e.g. `int array[] = {1, 2, 7, 4};` `int *ptr;` `ptr = array;` (this points to index 0) `ptr = ptr+2;` `*ptr = 3;` effectively changing the 3rd element's value from `7` to `3`? – dbconfession Nov 01 '16 at 17:30
  • @user3528438 You are going to provide some evidence for your UB claim. `char*` and `void*` are required to have exactly the same representation and alignment. So, I don't know how it's "better". – P.P Nov 01 '16 at 17:32
  • 1
    @dbconfession Yes but **only if** the memory locations are *contiguous* for the object in question. It's true for arrays but not universally true. – P.P Nov 01 '16 at 17:34
  • @P.P. N1570, 6.5.9.2. `char*` is better because it fits into 6.5.9.2(2) and conversion to `char*` is well defined in 6.3.2.3.7. – user3528438 Nov 01 '16 at 18:06
  • @user3528438 Are you suggesting conversion to `void*` is *not* well defined then? If that's the case, `if (NULL == NULL) {..}` would be UB since `((void*)0)` is a valid definition of NULL. – P.P Nov 01 '16 at 18:20
  • No. I'm saying 6.3.2.3.7 establishes such relationship between the two `char*`s so that 6.5.9.6 is true. And the standard explicitly listed NULL as special case in 6.5.9.2 and 6.5.9.6. – user3528438 Nov 01 '16 at 18:28
0

Since there's no such thing as an array in the C language

There is totally such a thing as an array in the C language. All of your examples are C arrays.

The difference you are describing is the difference between a list and an array.

Arrays in C, as indeed in most languages, are like your Scenario 1.

You could certainly accomplish your Scenario 2 with an array of pointers to values. For example:

int array1[] = {11, 13, 17, 19};
// vs 
int* array2[] = {
    &array1[0],
    &array1[1],
    &array1[2],
    &array1[3]
};

A list however is quite different in organization.

#include <stdio.h>

int array1[] = {11, 13, 17, 19};  /* the array from the question */

struct list_node {
    int value;
    struct list_node *next;
};

struct int_list {
    int length;
    struct list_node *first;
};

int main(void) {
    int i;
    struct list_node nodes[4];
    struct int_list list1 = {.length = 4, .first = &nodes[0]};

    /* link each node to the next one; the last node ends the list */
    for (i = 0; i < 4; i++) {
        nodes[i].value = array1[i];
        if (i != 3) {
            nodes[i].next = &nodes[i + 1];
        } else {
            nodes[i].next = NULL;
        }
    }

    /* traverse the list */
    struct list_node *n = list1.first;
    while (n != NULL) {
        printf("%d\n", n->value);
        n = n->next;
    }
    return 0;
}
Mobius
  • Thanks. Please pardon my ignorance, but does each memory location store 1 bit? And if so, how can multiple int values be stored into one memory location? – dbconfession Nov 01 '16 at 16:42
  • @dbconfession: There are different sizes of memory locations. If you have an array of `char`, each element holds one `char`. If you have an array of `int`, each element holds one `int`. For most practical purposes (_pace bit fields_) there isn't a way to access storage locations for single bits; the smallest addressable unit is a `char`. So no, memory locations always store multiple bits. Until you're dealing with structures, the size is normally a multiple of 8 bits (CHAR_BIT; can be larger), and the multiple is itself a power of 2 (so 1 x 8 bits; 2 x 8 bits; 4 x 8 bits; 8 x 8 bits; etc). – Jonathan Leffler Nov 01 '16 at 16:47
  • That makes sense. I'm getting various answers above. Does my example take up 32 bits of memory since it's an array of ints and not chars? – dbconfession Nov 01 '16 at 16:55
  • It will depend on the size of an `int` on your system. I believe an int is 4 bytes, or 32 bits, so the size of the total array would be 16 bytes. You can always use `sizeof(thing)` to find out the size (in bytes) of `thing` – Mobius Nov 01 '16 at 17:02
  • 1
    @dbconfession On modern operating systems (e.g. an x86 machine running Windows 95, Mac OS-X, or Linux) a `char` is 8 bits wide, an `int` is 32 bits wide and `sizeof(int)` is 4. The C standard requires an `int` to support values in the range -32767 to +32767, so it needs to be at least 16 bits wide. – Ian Abbott Nov 01 '16 at 17:07