-1

I want to create an ArrayList, or Vector in C. Perhaps I can learn if I'm on the right path or completely off base

So if I have an ArrayList Struct which contains an array, initially set to 10, and a counter to keep track of how many elements have been filled in the arraylist, like such

typedef struct ArrayList
{
     int counter;
     int arr[10];
}

Could the array arr, be replaced by another array that is twice the size of the original array? If so, how would I do that?

I have the following piece of code in an add() Function

if ( arrList->counter == (sizeof(arrList->arr)/sizeof(int))  )
{
     int tempArray[((arrList->counter + 1) * 2)];
     for (int i = 0; i < arrList->counter; i++)
     {
          tempArray[i] = arrList->arr[i];
     }
     strcpy( arrList->arr, tempArray );
}

Am I on the right path or is there a better way to create a growable array?

Joshua
  • 19
  • 1
  • 3
  • 4
    Replace the array with a pointer, and dynamically allocate/grow/delete the array with `malloc`, `realloc`, and `free`. – Tom Karzes Dec 17 '18 at 20:44
  • Or consider using a flexible array member: `typedef struct ArrayList { size_t capacity; size_t counter; int arr[]; } ArrayList;` which allows you to keep a record of how much space is allocated in `capacity`, how much is in use in `counter` (as before), and you can reference `vector->arr[i]` as before. This is almost equivalent to use `int *arr;` in the structure. There are pros and cons to both. (You can use `realloc()` to expand or shrink the FAM; some care is required.) – Jonathan Leffler Dec 17 '18 at 20:49
  • 4
    `strcpy` does not properly work for int arrays. Use `memcpy` instead. – Stephan Lechner Dec 17 '18 at 20:50
  • @JonathanLeffler You can't grow a flexible array member, you have to realloc the structure itself. – Barmar Dec 17 '18 at 21:20
  • @Barmar: apologies for sloppy talk — that was what I meant (in part by "some care"), but what I wrote can far too easily be misunderstood. – Jonathan Leffler Dec 17 '18 at 21:20
  • There are indeed both pros and cons to both potential representations of the vector elements, but I think those of an FAM are going to be hard to appreciate for an inexperienced C programmer such as the OP seems to be. I'd certainly recommend against the FAM for anyone who does not understand very well all the tradeoffs involved in that choice. – John Bollinger Dec 17 '18 at 21:38

1 Answers1

1

Your approach is not that good as it involves the structure being copied to a temporary primitive array.

A very nice solution to the question you have is something called flexible array. Google it for more info. It essentially looks like this.

struct my_fam_array_t {
    int arr_size;
    int arr[];
};

To use it you always declare it as a pointer as follows

struct  my_fam_array_t *arr5;

And then you initialise it as follows:

arr5 = malloc(sizeof(struct my_fam_array_t) + fam_size));

Where fam_size should be the size you require for the array.

Two more things, don't forget to set the size of the array in the structure and also check the memory allocation status and handle any errors.

Don't forget to free it up after you use it.

You can now use the array normally such as

arr5->arr[3] = some_value;

You can now create a function to resize the array. I'm not going to write it but what is shall do is:

malloc a new struct with the new size

Copy the old array into the new struct

Don't forget to free up the old array

This may be new for you considering you are a beginner, however flexible arrays is the tool you need.

By the way, you can use realloc as well, have a read about it, it may be useful, however be very careful to handle the out of memory failure correctly. Google about realloc and SEI CERT C Standards

Edit:

Make sure you use C99 or above. Contrary, you'll have to use the structure with the array size of 1 and you shall subtract 1 from the malloc size. You may use the array with size 0 if your compiler supports this extension.

John
  • 1,012
  • 14
  • 22
  • [`int arr[0]` is not valid in C](https://port70.net/~nsz/c/c11/n1570.html#6.7.6.2p1); this is an old `struct` hack used to achieve a [flexible array member](https://port70.net/~nsz/c/c11/n1570.html#6.7.2.1p18). FAMs were formally added in C99. – ad absurdum Dec 18 '18 at 14:55
  • @David Bowling That is true, however my belief is that the author uses C99 or above already. Should that not be the case, the array could be modified to arr[1]. This will be compatible with all C versions. Also make sure to subtract 1 from the malloc size to account for the 1 element. – John Dec 18 '18 at 14:59
  • No. `int arr[0]` is not legal C, even in C99. The way to declare a FAM in C99 and above is with `int arr[]`. The old `struct` hack `int arr[0]` was used before C99 to achieve the same thing, but this was not supported by the Standard and relied on implementation details. – ad absurdum Dec 18 '18 at 15:01
  • @DavidBowling my bad, you're right again. Array of size 0 is an extension and not part of the standard. C99 FAM should have no size specified to become an incomplete type. – John Dec 18 '18 at 15:05
  • Suggestions: you might change `LENGTH` to something like `fam_sz` to make more clear that this is a size and not a length. Consider `arr5 = malloc(sizeof *arr5 + fam_sz);`: no need to cast the result of `malloc()` in C, and using an expression rather than an explicit type with `sizeof` is less error-prone and easier to maintain. This is a more idiomatic way to use `malloc()` in C. Also, don't forget that you need to check for successful allocation before using `arr5`. – ad absurdum Dec 18 '18 at 15:16
  • @DavidBowling To be honest, I believe the opposite is true regarding the sizeof remark. I have gone to the lengths of banning my team from using a sizeof an expression and rather use a type. You may get in trouble by specifying an expression, what if it overflows? What if by mistake a pointer is passed? Maybe I might be missing something but at least the SEI CERT AC Standards and the Japanese C coding standards agree with me. – John Dec 18 '18 at 15:33
  • it can't overflow in a `sizeof` expression, since `sizeof` does not (usually) evaluate its operand (only the type of the expression is used). Overflow can happen in the rest of the expression evaluation, but this has nothing to do with `sizeof *arr5` vs. `sizeof some_type_I_have_to_get_right`. Don't know what you mean by "a pointer is passed." Where does CERT agree with you (I'd be interested in a link)? – ad absurdum Dec 18 '18 at 15:40
  • @DavidBowling Here is the SEI Cert standard link: https://wiki.sei.cmu.edu/confluence/display/c/MEM02-C.+Immediately+cast+the+result+of+a+memory+allocation+function+call+into+a+pointer+to+the+allocated+type . There go to section "Compliant Solution (Hand Coded)" and read until "Exceptions" (or at least the first half of that selection). From the Japanese coding standards, refer to Rule R3.6.3 (here is the PDF https://www.ipa.go.jp/files/000040508.pdf). See next comment – John Dec 18 '18 at 16:19
  • @DavidBowling Essentially, my points where that you can mistakenly use sizeof(some_type *) instead of sizeof(some_type). Also, the sizeof(some_big_value + some_big_value) may overflow which in turn may render an incorrect type due to the conversion rank change. I do understand the benefits of having sizeof(*arr5) as you don't need to re-specify the type when arr5 changes, however, it is dangerous and in this case Reliability precedes Maintainability. – John Dec 18 '18 at 16:19
  • `sizeof(some_big_value + some_big_value)` can _never_ overflow: [`sizeof` does not evaluate its operand, except in the case that the operand has VLA type.](https://port70.net/~nsz/c/c11/n1570.html#6.5.3.4p2) The example `sizeof(some_type *)` is using a type, not an expression, and is exactly what the idiomatic approach avoids. Instead, `int *arr = malloc(sizeof *arr);` uses the _type_ only of the expression `*arr`, which is `int`. Later changes to `long *arr = malloc(sizeof *arr);` require the type change in only one location: easier to maintain and less error-prone to start. – ad absurdum Dec 18 '18 at 16:27
  • This seems both more reliable and easier to maintain to me, and to many others, but thanks for the links; I'll check them out. I note that your first link seems to be about casting the result of `malloc()`, not using an expression in the manner I described. [There is some legitimate disagreement about this casting among programmers.](https://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc) But this is only a CERT recommendation, not a rule, and one of low importance at that. – ad absurdum Dec 18 '18 at 16:39
  • @DavidBowling I thought I'd add an example of what I meant by sizeof overflow: #include int main(void) { unsigned char a = 200; unsigned char b = 200; printf("a + b -> %u\n", sizeof(a + b)); printf("a -> %u\n", sizeof(a)); printf("b -> %u\n", sizeof(b)); printf("unsigned char -> %u\n", sizeof(unsigned char)); return 0; } sizeof(a + b) is greater than 1 which is (possibly) not the intention. This kind of overflows cause trouble and are quite difficult to spot. Primarily due to this I am discouraged to use sizeof(expression). – John Dec 18 '18 at 16:54
  • There is no overflow in that code (it does have UB since `size_t` values must be printed with `%zu`, not `%u`). `sizeof(a+b)` correctly reports the size of `int` on the target system, which is the type of the expression `a + b` due to integer promotion rules. Of course, it is a bad idea to do `sizeof(a+b)` in such a case, and why would you want to? But, this is not at all related to the `sizeof *arr` example. Integer promotion can bite you in many different circumstances, but that is no reason to avoid clean code. – ad absurdum Dec 18 '18 at 17:07
  • @DavidBowling I'm with you on the point of having clean code (and using sizeof(*arr) is pretty neat). But from a QA perspective (at least from my experience of writing standards) it avoids having that integer promotion circumstance. Again, from experience of reviewing codebases, there are developers out there, sadly, who encapsulate big expressions, with side effects, in a sizeof, which cause sudden memory over-allocation which further cause (at least embedded) devices to fail. By the way, some of them say they did it to make clean code (good intention, however, unsuccessful). – John Dec 18 '18 at 17:22