That particular piece of code from K&R (Brian W Kernighan and Dennis M Ritchie —
The C Programming Language, 2nd Edn 1988) does not conform to standard C and isn't the way you should write code nowadays.
That's a serious charge — but the C11 standard says (§6.3 Conversions, and specifically in §6.3.2.3 Pointers ¶8:
A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined.
In the book, the qsort()
function has the signature:
void qsort(void *v[], int left, int right, int (*comp)(void *, void *));
This is different from the interface to the standard C qsort()
, and is treading on thin ice — you should avoid redefining standard C functions, especially if the redefinition has a different interface. Ideally, the function would be renamed, maybe to quicksort()
.
The function pointers passed as comp
should, therefore, match the signature
int (*comp)(void *, void *);
But the two functions have the signatures:
int numcmp(char *s1, char *s2); /* Treating strings as numbers */
int strcmp(char *s1, char *s2);
The code inside qsort()
treats the function pointers as int (*comp)(void *, void *)
and the code therefore has undefined behaviour according to §6.3.2.3 ¶8. The cast to the common type is ugly but necessary to meet the requirements of the call to qsort()
, and could be undone by a cast back to the original type in the called function (qsort()
), but it does not cast the function pointer.
To meet the requirements of modern C, the numcmp()
function would need to be rewritten:
int numcmp(void *v1, void *v2)
{
double v1 = atof((char *)v1);
double v2 = atof((char *)v2);
if (v1 < v2)
return -1;
else if (v1 > v2)
return +1;
else
return 0;
}
And a revised strcmp()
would be rewritten similarly — and renamed to avoid conflicts with the Standard C implementation of strcmp()
. The call to qsort()
then does not need the casts on the comparator argument. The cast on the array pointer argument is still needed because char **
does not convert automatically to void **
(but does convert to void *
).
Now, this is the theoretical viewpoint — backed up by the C Standard. However, in practice, you will often get away with such abuse (but the compiler will likely generate warnings about the abuse).
In Standard C, the qsort()
function has the prototype:
void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));
This is different in several ways, notably that the comparator is passed const
pointers, and the array is void *
not void **
as in the book.
The numcmp()
function above would have to be rewritten. If you are sorting an array of strings (char **
) which is what the book is doing, then the arguments are of type char **
cast to void *
.
Here is some code which sorts an array of lines, using POSIX getline()
to read the lines. This code is available in my SOQ (Stack Overflow Questions) repository on GitHub as file sortlines2.c
in the src/sortfile sub-directory; there is also a sortlines3.c
which is more memory efficient. This code has getline()
allocate new memory for each line, which it does quite happily, but it allocates a minimum size which is usually far larger than the lines you're reading.
/*
** Originally a demonstration that a comparator in an SO answer was incorrect.
** Now a simple demonstration of reading a file and sorting.
*/
#include "posixver.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Correct comparator */
static int compare(const void *p_lhs, const void *p_rhs)
{
const char *lhs = *(char **)p_lhs;
const char *rhs = *(char **)p_rhs;
return strcmp(lhs, rhs);
}
/* Lines in array are terminated by a newline */
static void dump_array(const char *tag, size_t size, char *array[size])
{
printf("%s:\n", tag);
for (size_t i = 0; i < size; i++)
printf("%.2zu: %s", i, array[i]);
}
int main(void)
{
char **ptrs = 0;
size_t numptrs = 0;
char *buffer = 0;
size_t buflen = 0;
size_t count = 0;
while (getline(&buffer, &buflen, stdin) != -1)
{
if (count == numptrs)
{
size_t newnum = (numptrs + 2) * 2;
void *newptrs = realloc(ptrs, newnum * sizeof(*ptrs));
if (newptrs == 0)
{
fprintf(stderr, "Out of memory (%zu bytes requested)\n", newnum * sizeof(*ptrs));
exit(1);
}
ptrs = newptrs;
numptrs = newnum;
}
ptrs[count++] = buffer;
buffer = 0;
buflen = 0;
}
free(buffer);
dump_array("Before sorting", count, ptrs);
qsort(ptrs, count, sizeof(char *), compare);
dump_array("After sort", count, ptrs);
for (size_t i = 0; i < count; i++)
free(ptrs[i]);
free(ptrs);
/* Avoid an 'in use at exit (reachable)' record in valgrind */
/* Mac OS X El Capitan 10.11.4, Valgrind 3.12.0.SVN */
/* macOS High Sierra 10.13.4, Valgrind 3.14.0.GIT */
fclose(stdin);
return 0;
}
Note that there are no casts on the arguments to qsort()
— they should be unnecessary.
To write a numeric comparator, you'd need:
int numcmp(const void *p1, const void *p2)
{
double v1 = atof(*(char **)p1);
double v2 = atof(*(char **)p2);
if (v1 < v2)
return -1;
else if (v1 > v2)
return +1;
else
return 0;
}
And then simply pass numcmp
(without a cast) to qsort()
.