5

In the K&R ANSI C book I have stumbled upon a piece of code where a pointer to a function is used. I think I understand the idea behind pointers to functions, but the example presented in the book does not make sense to me.

The function is to sort lines of text using quicksort sorting algorithm. The declaration of quicksort function looks like this:

void qsort(void *lineptr[], int left, int right, int (*comp)(void *, void *);

This so far makes perfect sense to me, we are declaring a function where one of the arguments is a pointer to a function (comp) with two void pointer arguments. I assume we are doing it this way to save memory, and not make copies of actual function.

Later on the function is used in main in this way:

qsort((void **) lineptr, 0, nlines-1, (int (*)(void*, void*))(numeric ? numcmp : strcmp));

Here is where I have some questions:

  1. why is lineptr cast to (void **)
  2. what is the use of using (numeric ? numcmp : strcmp), does it mean that pointers to functions can also be assigned like this: int ( * )(void*, void*) numcmp

EDIT: The lineptr is defined like this: char *lineptr[MAXLINES]; I assume we are casting void** to change it from char to type void pointer

Lover of Structure
  • 1,561
  • 3
  • 11
  • 27
Damian Kowalski
  • 362
  • 2
  • 8
  • 1
    given `qsort` function is used for generic sorting and meaning of `(numeric ? numcmp : strcmp));` is depends on value of `numeric`, either `numcmp` (number comparision) or `strcmp` (string comparision) will be called – IrAM Dec 23 '20 at 13:56
  • Why isn't it done this way then: qsort((void **) lineptr, 0, nlines-1, (int (numeric ? &numcmp : &strcmp)(void*, void*))); – Damian Kowalski Dec 23 '20 at 13:59
  • No, it should be same as what is used in definition _pointer to a function which takes two `void*` arguments and returns `int`_ , check [here](https://cdecl.org/?q=int+%28*comp%29%28void+*%2C+void+*%29) – IrAM Dec 23 '20 at 14:04
  • Also, the [`qsort()` prototype](https://port70.net/~nsz/c/c11/n1570.html#7.22.5.2) is actually `void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));` NB the `const void *` types instead of just `void *`. That can make a huge difference, especially if you're trying to use `qsort()` from within C++ code. – Andrew Henle Dec 23 '20 at 14:05
  • When you refer to something in a book, include a complete citation including page number. Do not make people expend time hunting for it. – Eric Postpischil Dec 23 '20 at 15:31
  • A closely related (but not identical) question which I asked without awareness of your (this) question: https://stackoverflow.com/q/76241319/2057969 – Lover of Structure Jul 10 '23 at 17:18

4 Answers4

6

The K&R book predates standardized C, so some constructs which were valid at the time are no longer valid. Specifically, it seems to be assuming that conversion between any two pointer types is valid.

lineptr is an array of char *, and this array will decay into a pointer to its first element when used in an expression, i.e. char **. So the cast to void ** is needed to match the argument type, as only a conversion to/from void * may be performed without a cast.

The expression (numeric ? numcmp : strcmp) is choosing a function to pass as the comp parameter to the function, so either numcmp or strcmp depending on the value of numeric. The cast to (int (*)(void*, void*)) is needed because strcmp (and presumably numcmp) has type int (*)(const char *, const char *).

Casting to a different function pointer type and subsequently calling the function using the casted type is undefined behavior in standard C, however it was probably allowed in K&R C.

dbush
  • 205,898
  • 23
  • 218
  • 273
6

That particular piece of code from K&R (Brian W Kernighan and Dennis M Ritchie — The C Programming Language, 2nd Edn 1988) does not conform to standard C and isn't the way you should write code nowadays.

That's a serious charge — but the C11 standard says (§6.3 Conversions, and specifically in §6.3.2.3 Pointers ¶8:

A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined.

In the book, the qsort() function has the signature:

void qsort(void *v[], int left, int right, int (*comp)(void *, void *));

This is different from the interface to the standard C qsort(), and is treading on thin ice — you should avoid redefining standard C functions, especially if the redefinition has a different interface. Ideally, the function would be renamed, maybe to quicksort().

The function pointers passed as comp should, therefore, match the signature

int (*comp)(void *, void *);

But the two functions have the signatures:

int numcmp(char *s1, char *s2);    /* Treating strings as numbers */
int strcmp(char *s1, char *s2);

The code inside qsort() treats the function pointers as int (*comp)(void *, void *) and the code therefore has undefined behaviour according to §6.3.2.3 ¶8. The cast to the common type is ugly but necessary to meet the requirements of the call to qsort(), and could be undone by a cast back to the original type in the called function (qsort()), but it does not cast the function pointer.

To meet the requirements of modern C, the numcmp() function would need to be rewritten:

int numcmp(void *v1, void *v2)
{
    double v1 = atof((char *)v1);
    double v2 = atof((char *)v2);
    if (v1 < v2)
        return -1;
    else if (v1 > v2)
        return +1;
    else
        return 0;
}

And a revised strcmp() would be rewritten similarly — and renamed to avoid conflicts with the Standard C implementation of strcmp(). The call to qsort() then does not need the casts on the comparator argument. The cast on the array pointer argument is still needed because char ** does not convert automatically to void ** (but does convert to void *).

Now, this is the theoretical viewpoint — backed up by the C Standard. However, in practice, you will often get away with such abuse (but the compiler will likely generate warnings about the abuse).

In Standard C, the qsort() function has the prototype:

void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));

This is different in several ways, notably that the comparator is passed const pointers, and the array is void * not void ** as in the book.

The numcmp() function above would have to be rewritten. If you are sorting an array of strings (char **) which is what the book is doing, then the arguments are of type char ** cast to void *.

Here is some code which sorts an array of lines, using POSIX getline() to read the lines. This code is available in my SOQ (Stack Overflow Questions) repository on GitHub as file sortlines2.c in the src/sortfile sub-directory; there is also a sortlines3.c which is more memory efficient. This code has getline() allocate new memory for each line, which it does quite happily, but it allocates a minimum size which is usually far larger than the lines you're reading.

/*
** Originally a demonstration that a comparator in an SO answer was incorrect.
** Now a simple demonstration of reading a file and sorting.
*/

#include "posixver.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Correct comparator */
static int compare(const void *p_lhs, const void *p_rhs)
{
    const char *lhs = *(char **)p_lhs;
    const char *rhs = *(char **)p_rhs;
    return strcmp(lhs, rhs);
}

/* Lines in array are terminated by a newline */
static void dump_array(const char *tag, size_t size, char *array[size])
{
    printf("%s:\n", tag);
    for (size_t i = 0; i < size; i++)
        printf("%.2zu: %s", i, array[i]);
}

int main(void)
{
    char **ptrs = 0;
    size_t numptrs = 0;
    char  *buffer = 0;
    size_t buflen = 0;
    size_t count = 0;

    while (getline(&buffer, &buflen, stdin) != -1)
    {
        if (count == numptrs)
        {
            size_t newnum = (numptrs + 2) * 2;
            void *newptrs = realloc(ptrs, newnum * sizeof(*ptrs));
            if (newptrs == 0)
            {
                fprintf(stderr, "Out of memory (%zu bytes requested)\n", newnum * sizeof(*ptrs));
                exit(1);
            }
            ptrs = newptrs;
            numptrs = newnum;
        }
        ptrs[count++] = buffer;
        buffer = 0;
        buflen = 0;
    }

    free(buffer);

    dump_array("Before sorting", count, ptrs);
    qsort(ptrs, count, sizeof(char *), compare);
    dump_array("After sort", count, ptrs);

    for (size_t i = 0; i < count; i++)
        free(ptrs[i]);
    free(ptrs);

    /* Avoid an 'in use at exit (reachable)' record in valgrind */
    /* Mac OS X El Capitan 10.11.4, Valgrind 3.12.0.SVN */
    /* macOS High Sierra 10.13.4, Valgrind 3.14.0.GIT */
    fclose(stdin);

    return 0;
}

Note that there are no casts on the arguments to qsort() — they should be unnecessary.

To write a numeric comparator, you'd need:

int numcmp(const void *p1, const void *p2)
{
    double v1 = atof(*(char **)p1);
    double v2 = atof(*(char **)p2);
    if (v1 < v2)
        return -1;
    else if (v1 > v2)
        return +1;
    else
        return 0;
}

And then simply pass numcmp (without a cast) to qsort().

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
  • Re “The cast on the array pointer argument is still needed because `char **` does not convert automatically to `void **` (but does convert to `void *`)”: Even with the cast, there are still aliasing violations and other issues, even given that `char *` and `void *` must have the same representation and the same alignment requirements. – Eric Postpischil Dec 23 '20 at 14:45
3

The qsort call uses 2 C idiomatic points:

  • a function can be implicitely converted into a pointer to itself when used as a value. So here (numeric ? numcmp : strcmp) is actually (numeric ? &numcmp : &strcmp)
  • a pointer to a function of any type can be casted into a pointer to a function of another type. It is only UB to call the function later with a wrong argument list.

So here we have:

  • if numeric is True, then numcmp is used as the comparison function and &numcmp is casted into a pointer to a function taking 2 void * and returning an int
  • if numeric is False, then strcmp is used as the comparison function and &strcmp is casted into a pointer to a function taking 2 void * and returning an int

For the first question, lineptr is converted to a pointer to pointer to void because it is the type expected by qsort. Here the rule is that a pointer to object can be casted to a pointer to another type (of objects) and it will get its original value when converted back to its real type. It is only UB to dereference it with a wrong type.

Serge Ballesta
  • 143,923
  • 11
  • 122
  • 252
  • Is it wrong to write the same piece of code like this: qsort((void **) lineptr, 0, nlines-1, (int (*)(numeric ? numcmp : strcmp)(void*, void*))); If so, could you tell me why? – Damian Kowalski Dec 23 '20 at 14:05
0

Why isn't it done this way then:

qsort((void *) lineptr, 0, nlines-1, (int (numeric ? &numcmp : &strcmp)(void, void*)));


(void *) lineptr

qsort needs to work on an array of some type. It can't be an array of void. There's no such thing, and voids can't be compared. For the function to be generic, this array is necessarily an array of void pointers. So void *lineptr[] aka void **lineptr is required here: A pointer to the first of many pointers.


(int (numeric ? &numcmp : &strcmp)(void, void*))

Well, that's doesn't compile.


numeric ? &numcmp : &strcmp

You know how an array used as a pointer degenerates into a pointer to its first element? Well, a function used as a pointer degenerates into a pointer to the function. So numeric ? numcmp : strcmp is equivalent to numeric ? &numcmp : &strcmp when used as an argument.

ikegami
  • 367,544
  • 15
  • 269
  • 518