1

If I need to write a function that returns an array (as an int*), which way is better?

  1. int* f(..data..)

  2. void f(..data.., int** arr)

where we call the second one like this: int* x; f(&x);

Maybe they are equivalent, but I am not sure. If I also need to return an ErrorCode (it's an enum), then in the first version f takes an extra ErrorCode* parameter, and in the second version f returns the ErrorCode.

  • This is nearly assuredly not long for this world due to the highly opinionated nature of the question, but... Depends on whether you care whether your function succeeded or not. Returning a pointer means you have to build that status into the result (NULL, not-NULL). An in/out param frees up the return value for such status reporting. – WhozCraig Jul 22 '14 at 18:23
  • Functions cannot return arrays. You're speaking of returning a pointer. – chris Jul 22 '14 at 18:40

7 Answers

5

Returning an array is just returning a variable amount of data.
That's a really old problem, and C programmers developed many answers for it:

  1. Caller passes in buffer.
    1. The necessary size is documented and not passed; too-short buffers are Undefined Behavior: strcpy()
    2. The necessary size is documented and passed; errors are signaled by the return value: strcpy_s()
    3. The buffer and its size are passed by pointer, and the called function reallocates with the documented allocator as needed: POSIX getline()
    4. The necessary size is unknown, but can be queried by calling the function with buffer-length 0: snprintf()
    5. The necessary size is unknown and cannot be queried; as much as fits in a buffer of the passed size is returned. If necessary, additional calls must be made to get the rest: fread()
    6. The necessary size is unknown, cannot be queried, and passing too small a buffer is Undefined Behavior. This is a design defect, therefore the function is deprecated / removed in newer versions, and just mentioned here for completeness: gets().
  2. Caller passes a callback:
    1. The callback-function gets a context-parameter: qsort_s()
    2. The callback-function gets no context-parameter. Getting the context requires magic: qsort()
  3. Caller passes an allocator: Not found in the C standard library. All allocator-aware C++ containers support that though.
  4. Callee contract specifies the deallocator. Calling the wrong one is Undefined Behavior: fopen()->fclose(), strdup()->free()
  5. Callee returns an object which contains the deallocator: COM-Objects
  6. Callee uses an internal shared buffer: asctime()
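
To illustrate option 1.4 from the list above, here is a minimal sketch of the query-then-allocate pattern with snprintf(). The helper name format_id and the "id=%d" format are made up for this example; only the snprintf(NULL, 0, ...) size query is standard behavior.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats "id=<n>" into a freshly malloc'ed string.
   The caller must free() the result. Returns NULL on failure. */
char *format_id(int n)
{
    /* First call: no buffer, size 0. snprintf only reports how many
       characters would be written (not counting the terminating '\0'). */
    int needed = snprintf(NULL, 0, "id=%d", n);
    if (needed < 0)
        return NULL;

    char *buf = malloc((size_t)needed + 1);
    if (buf == NULL)
        return NULL;

    /* Second call: the buffer is now known to be large enough. */
    snprintf(buf, (size_t)needed + 1, "id=%d", n);
    return buf;
}
```

The caller never has to guess a size, at the cost of formatting twice.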

Be aware that the caller must be able to learn the length somehow: either the returned array contains a sentinel object or other marker, or you return the length separately, or you return a struct containing a pointer to the data and the length.
Pass-by-reference (a pointer to the size or similar) helps there.
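
A minimal sketch of the struct variant, assuming the callee allocates with malloc() and the caller releases with free(). The names IntArray and make_squares are invented for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical result type: pointer and length travel together. */
typedef struct {
    int   *data;  /* malloc'ed by the callee; release with free() */
    size_t len;   /* number of valid elements in data */
} IntArray;

/* Returns the first n squares. If n is 0 or allocation fails,
   data is NULL and len is 0. */
IntArray make_squares(size_t n)
{
    IntArray result = { NULL, 0 };
    if (n == 0)
        return result;

    result.data = malloc(n * sizeof *result.data);
    if (result.data == NULL)
        return result;

    for (size_t i = 0; i < n; ++i)
        result.data[i] = (int)(i * i);
    result.len = n;
    return result;
}
```

Note that this sketch cannot distinguish "empty result" from "allocation failed"; if that matters, a separate status value is still needed.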

In general, whenever the user has to guess the size or look it up in the manual, he will sometimes get it wrong. Even if he gets it right, a later revision might invalidate his careful work, so it doesn't matter that he was once right. Anyway, this way lies madness (UB).

For the rest, choose the most comfortable and efficient one you can.

Regarding an error code: Remember there's errno.
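
A rough sketch of what the errno route could look like for a pointer-returning function. The function make_range is hypothetical, and EINVAL and ENOMEM are POSIX error numbers rather than ISO C ones, so this assumes a platform that defines them.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical function: returns a malloc'ed array {0, 1, ..., n-1}.
   On failure it returns NULL and stores the reason in errno. */
int *make_range(size_t n)
{
    if (n == 0) {
        errno = EINVAL;   /* assumption: POSIX error number */
        return NULL;
    }

    int *arr = malloc(n * sizeof *arr);
    if (arr == NULL) {
        errno = ENOMEM;   /* assumption: POSIX error number */
        return NULL;
    }

    for (size_t i = 0; i < n; ++i)
        arr[i] = (int)i;
    return arr;
}
```

The caller checks for NULL first and only then inspects errno, since errno is only meaningful after a reported failure.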

Deduplicator
  • This is a lovely answer to the question in the headline but the text of the question indicates that the OP meant something else ... to return a pointer to an array, presumably malloced within the function. Perhaps you could address that as well. Also, since there's no C++ tag, I don't think that std::shared_ptr should be mentioned. – Jim Balter Jul 23 '14 at 00:29
  • @JimBalter: Well, the OP is searching for the best way to return his stuff, and he's trying to settle on the best interface. The ideas he had are mentioned too... Anyway, I cannot decide for him... – Deduplicator Jul 23 '14 at 00:34

1

Usually it's more convenient, and reads more naturally, to return the array:

int* f(..data..)

If you ever need complex error handling (e.g., returning error values), you should return the error as an int and hand the array back through an output parameter.

Antzi
1

There is no "better" here: you decide which approach fits the needs of the callers better.

Note that both functions are bound to give a user an array that they allocate internally, so deallocating the resultant array becomes a responsibility of the caller. In other words, somewhere inside f() you would have a malloc, and the user who receives the data must call free() on it.

You have another option here: let the caller pass the array in, and return the number of items you stored in it:

size_t f(int *buffer, size_t max_length)

This approach lets the caller pass a buffer in static or automatic memory, which improves flexibility.
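
A minimal sketch of that shape, assuming a made-up task (collecting even numbers below a limit). The name fill_evens and the extra limit parameter are only for illustration.

```c
#include <stddef.h>

/* Hypothetical example: stores the even numbers 0, 2, 4, ... that are
   below `limit` into buffer, never writing more than max_length items.
   Returns how many items were actually stored. */
size_t fill_evens(int *buffer, size_t max_length, int limit)
{
    size_t count = 0;
    for (int v = 0; v < limit && count < max_length; v += 2)
        buffer[count++] = v;
    return count;
}
```

A caller could declare int buf[3]; and call fill_evens(buf, 3, 100) with no heap allocation at all. The function silently stops when the buffer is full, so callers who need every item have to size the buffer accordingly.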

Sergey Kalinichenko
1

The classic model is (assuming you need to return an error code too):

int f(...., int **arr)

even though it doesn't flow as nicely as a function returning the array.

Note this is why the lovely Go language supports multiple return values.

It's also one of the reasons for exceptions: they get the error indicators out of the function's I/O space.
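
A sketch of that classic model using the ErrorCode enum from the question. The enumerator names and the function make_sequence are assumptions, since the question does not show them.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed enumerators; the question only says ErrorCode is an enum. */
typedef enum { ERR_OK = 0, ERR_INVALID_ARG, ERR_NO_MEMORY } ErrorCode;

/* Hypothetical function: on success stores a malloc'ed array
   {1, 2, ..., n} in *arr, which the caller must free().
   On failure *arr is set to NULL (when arr itself is usable). */
ErrorCode make_sequence(size_t n, int **arr)
{
    if (arr == NULL)
        return ERR_INVALID_ARG;
    *arr = NULL;                 /* defined value on every error path */

    if (n == 0)
        return ERR_INVALID_ARG;

    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return ERR_NO_MEMORY;

    for (size_t i = 0; i < n; ++i)
        tmp[i] = (int)(i + 1);

    *arr = tmp;
    return ERR_OK;
}
```

The call site then looks like the one in the question: int *x; if (make_sequence(3, &x) == ERR_OK) { ... free(x); }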

pm100
1

The first one is better if the function does not need to deal with an existing pointer. The second one is used when you already have a pointer to an allocated container (for example a list) and the function may change the value of that pointer.
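
One possible illustration of the second case is a function that grows an existing array with realloc(), which may move the storage and therefore has to update the caller's pointer. The name append_value and its return convention are made up for this sketch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: appends value to the array *arr of length *len.
   realloc() may move the block, so the caller's pointer is updated
   through arr. Returns 0 on success, -1 on allocation failure
   (in which case *arr and *len are left untouched). */
int append_value(int **arr, size_t *len, int value)
{
    int *grown = realloc(*arr, (*len + 1) * sizeof **arr);
    if (grown == NULL)
        return -1;

    grown[*len] = value;
    *arr = grown;
    *len += 1;
    return 0;
}
```

Starting from int *a = NULL; size_t n = 0; works too, because realloc(NULL, size) behaves like malloc(size).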

Vlad from Moscow
0

If you must call f like int* x; f(&x);, you do not have much of a choice. You must use the second form, i.e., void f(..data..,int** arr), because your calling code does not use the return value anyway.

milaniez
0

The approach depends on a specific task and perhaps on your personal taste or a coding convention adopted in your project.

In general, I prefer to pass pointers as "output" parameters instead of returning an array, for a number of reasons.

  1. You likely want to return a number of elements in the array together with the array itself. But if you do this:

    int f(const void* data, int** out_array);
    

    When you see this signature for the first time, you can't tell whether the function returns the number of elements or an error code, so I prefer this:

    void f(const void* data, int** out_array, int* out_array_nelements);
    

    Or even better:

    void f(const void* data, int** out_array, size_t* out_array_nelements);
    

    The function signature must be self-explanatory, and the parameter names help to achieve that.

  2. The output array needs to be stored somewhere, so you need to allocate memory for it. If you return a pointer to the array instead of receiving the array as an argument, you can't allocate that memory on the stack. I mean, you cannot do this:

    int *f(const void *data) {
        int array[10];
        return array;  /* wrong: array's lifetime ends when f returns, so the pointer dangles */
    }
    

    Instead, you have to use static int array[10] (which is not thread-safe) or int *array = malloc(...), which makes the caller responsible for calling free() and so invites memory leaks.

    So I suggest passing a pointer to an array that is already allocated before the function call, like this:

    void f(const void *data, int* out_array, size_t* out_nelements, size_t max_nelements);
    

    The benefit is you are free to choose where to allocate the array:

    On the stack:

    int array[10] = { 0 };
    size_t max_nelements = sizeof(array)/sizeof(array[0]);
    size_t nelements = 0;
    
    f(data, array, &nelements, max_nelements);
    

    Or on the heap:

    size_t nelements = 0;
    size_t max_nelements = 10;
    int *array = malloc(max_nelements * sizeof *array);
    if (array != NULL) {
        f(data, array, &nelements, max_nelements);
        /* ... use array ... */
        free(array);
    }
    

    See, with this approach you are free to choose how to allocate the memory.
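
For completeness, here is one way the body of such an f could look. The answer does not say what data contains, so this sketch assumes it points to a NUL-terminated byte string and simply copies each byte value into the output array; that interpretation is purely an assumption for the example.

```c
#include <stddef.h>

/* Sketch only: treats data as a NUL-terminated byte string (an
   assumption) and stores each byte as an int in out_array, writing
   at most max_nelements items. The count goes to *out_nelements. */
void f(const void *data, int *out_array, size_t *out_nelements,
       size_t max_nelements)
{
    const unsigned char *bytes = data;
    size_t n = 0;

    while (n < max_nelements && bytes[n] != '\0') {
        out_array[n] = bytes[n];
        ++n;
    }
    *out_nelements = n;
}
```

Because the function never allocates, it works unchanged with both the stack and the heap call sites shown above.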

Dmitry