3

I'm trying to refactor a utility that is currently a stand-alone C program, such that I can make a reusable library. It includes a sorting step of an array, according to the corresponding value within a global array.

// Global lookup table
double *rating;

// Comparator using lookup
int comp_by_rating(const void *a, const void *b) {
  int * x = (int *) a;
  int * y = (int *) b;
  if (rating[*x] > rating[*y])
    return 1;
  else if (rating[*x] < rating[*y])
    return -1;
  else
    return 0;
}

int main() {
  int* myarray;
  // ...
  // initialize values of local myarray and global rating
  // ...

  qsort(myarray, length_myarray, sizeof(int), comp_by_rating);
  // ...
  return 0;
}

Is there a way for me to avoid having the rating lookup table global? I'm traditionally a C++ person so my first thought was a functor, but I have to stay in C and so I guess I'm functor-less. I also can't replace int *myarray with an array of structs holding the rating for each item, since other code requires the array in its current form. Do I have any other options?

beldaz
  • 4,299
  • 3
  • 43
  • 63

3 Answers3

3

I also can't replace int *myarray with an array of structs holding the rating for each item, since other code requires the array in its current form.

You can make a temporary replacement for sorting, call qsort, and harvest the results back into the original array:

struct rated_int {
    int n;
    double r;
};

struct rated_int *tmp = malloc(length_myarray * sizeof(struct rated_int));
for (int i = 0 ; i != length_myarray ; i++) {
    tmp[i].n = myarray[i];
    tmp[i].r = ratings[myarray[i]];
}
qsort(tmp, length_myarray, sizeof(struct rated_int), comp_struct);
for (int i = 0 ; i != length_myarray ; i++) {
    myarray[i] = tmp[i].n;
}
free(tmp);

This way the rest of the code would see myarray as an array of integers.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • Yep. *This* is how you roll in C, and, incidentally, also why the argument "just write rewrite the performance critical parts in C" doesn't always work out so well in practice... – user268396 Jan 20 '16 at 10:56
  • @FUZxxl No but the general idea of making a copy, sorting the copy and then iterating over the copy using a for loop and slotting the results into place... that still works. – user268396 Jan 20 '16 at 10:58
  • @user268396 Making the sort out-of-place just so you can use `qsort` is the way you roll in C? Sorry, I disagree with that. – fuz Jan 20 '16 at 11:06
  • 1
    This is the best way to go for me, I think. Thanks. – beldaz Jan 20 '16 at 11:09
  • @FUZxxl: no, copying is the way you roll in C if you need to get out of a corner you painted yourself into like that... Doing 'clever' things is usually a recipe for worse disaster down the line. – user268396 Jan 20 '16 at 11:14
  • @user268396 Seems like you aren't sufficiently creative to find an efficient solution. – fuz Jan 20 '16 at 11:14
  • @FUZxxl Yes. Subjectively 'efficient' solutions to problems of bad program design/structure of my own making don't strike me as the right reason for getting clever and potentially introducing bugs or requiring preconditions my existing calling code isn't prepared to guarantee. Better to redo the program structure if I can afford the time/effort. – user268396 Jan 20 '16 at 11:21
1

That's how you roll in C. If you are worried about thread-safety, consider making the variable thread-local, so multiple threads have different copies of it:

static _Thread_local double *rating;

This is not supported by old compilers though, instead you need some sort of portability kludge. If you don't like this either, you can't really get around writing your own sorting routine that allows an extra parameter.

gcc provides nested functions as an extension to solve this problem, but they have other problems, namely, they require an executable stack which reduces the resilience of your program against bugs.

Community
  • 1
  • 1
fuz
  • 88,405
  • 25
  • 200
  • 352
  • Thanks for the definitive "No". For completeness, do you know of any appropriately augmented qsort libraries that can be portably used for this? – beldaz Jan 20 '16 at 11:17
  • @beldaz I don't know any, I wrote my own sort routines for this problem. Sorting isn't exactly the hardest algorithm to implement yourself. – fuz Jan 20 '16 at 11:18
  • 1
    @beldaz FreeBSD has `qsort_r` which is BSD-licensed. You can find the entry point [in the source tree](https://github.com/freebsd/freebsd/blob/master/lib/libc/stdlib/qsort_r.c). The code is a little complex though. – fuz Jan 20 '16 at 11:21
  • @FUZxxl glibc also has `qsort_r` that does the same thing. It has a different interface. The difference may or may not be detected at compile time, but will bomb at run time if you accidentally use the wrong one. *That's* how you roll in C. – n. m. could be an AI Jan 20 '16 at 12:05
  • @n.m. I'm referring to `qsort_r` as a sorting routine that does what OP needs and that can be freely copied. I'm not telling him to depend on `qsort_r` being available, I tell him to copy the code into his own project. Maybe I should have made that clear. – fuz Jan 20 '16 at 12:14
  • OK, copying and subsequently maintaining someone else's non-trivial source code, instead of calling a standard function, because the standard has ignored the issue for 25 years. That's how you roll in C. – n. m. could be an AI Jan 20 '16 at 12:52
  • @n.m. Yeah. My first suggestion was to write your own sort routine though, because then you understand the bugs it has. – fuz Jan 20 '16 at 13:00
  • @n.m. You might be interested in knowing that the glibc variant of `qsort_r` is [proposed for inclusion into POSIX](http://austingroupbugs.net/view.php?id=900). – fuz Jan 25 '16 at 22:39
0

No, the array rating does not have to be global: You can use static here:

int comp_by_rating(const void *a, const void *b) {
    static double *rating = { .. init here .. };

Alternate way to initialize it if members are non-constant:

int comp_by_rating(const void *a, const void *b) {
    static double *rating = NULL;
    if (!rating) {
        // Initialize rating here
    }
}
Ctx
  • 18,090
  • 24
  • 36
  • 51
  • That hides access to the variable, but doesn't make the variable itself any less global... – user268396 Jan 20 '16 at 10:46
  • And how is OP supposed to set `rating` arrays values? – fuz Jan 20 '16 at 10:47
  • If you want to use a constant hard-coded ratings table, that's an OK method. – n. m. could be an AI Jan 20 '16 at 10:47
  • 1
    @user268396 What?? Please, you can't seriously claim that `rating` is a global variable here. – Ctx Jan 20 '16 at 10:47
  • @FUZxxl How does he do it in the first place? – Ctx Jan 20 '16 at 10:48
  • @Ctx: it's a global variable the same way any singleton is just a fancy global variable with somewhat controlled access patterns. – user268396 Jan 20 '16 at 10:49
  • @Ctx By allocating an array with the lookup table he wants and setting `rating` to it somewhere. – fuz Jan 20 '16 at 10:49
  • @FUZxxl Ok, then he can do this at the beginning of the function. I will extend the answer by an example for that. – Ctx Jan 20 '16 at 10:51
  • Hadn't thought of this. Presumably the init code could call a setup function that initializes the state the first time around. Only problem would be that presumably I wouldn't be able to reset the rating if I were using this code multiple times in a library. – beldaz Jan 20 '16 at 10:51
  • @Ctx This still doesn't solve the problem. Now you need a global variable for `comp_by_rating` to find out what to initialize `rating` to. – fuz Jan 20 '16 at 10:52
  • @FUZxxl You do not know the source of the data. Maybe a file has to be read for it, etc. – Ctx Jan 20 '16 at 10:53
  • @Ctx You are right. Maybe OP should give us more details so we can give a better answer. – fuz Jan 20 '16 at 10:54
  • @Ctx And even then, the approach to initialize `rating` inside `comp_by_rating` has the problem of not returning the memory for `rating` after sorting is done. – fuz Jan 20 '16 at 10:55
  • @FUZxxl but it is initialized only once per process, not per sorting, so this should not be an issue. Many libraries use such an approach, this would also be the case when using a global variable (or a thread-local storage) – Ctx Jan 20 '16 at 10:56
  • @FUZxxl Current code does read data from a file, but part of the aim is to have a re-usable library that isn't tied to a particular source. I think this approach has merits in that while in some ways still a global it _feels_ less global, but I think it's not the way to go for my particular needs. +1 from me. – beldaz Jan 20 '16 at 11:04
  • 1
    @user268396 No, it is not a global variable. In this semantics, the function `comp_by_rating` is the global, and rating is local to it. What you are actually trying to convey is that any variable, which is not on the stack, is global. This just isn't the case. See also: https://en.wikipedia.org/wiki/Static_variable (the example). – Ctx Jan 20 '16 at 11:26
  • @Ctx: I know what the static keyword does. And what it doesn't. You will note that the machinery necessary to pass a different ratings array will wind up looking rather a lot like an indirection for modifying a global variable all prettied up as a singleton... – user268396 Jan 20 '16 at 11:36
  • 1
    A file-static variable is much more flexible than a function-static one, without any significant drawbacks. – n. m. could be an AI Jan 20 '16 at 11:43
  • @user268396 So in your eyes, it is a sole matter on how a memory location is accessed if a variable in C is local or global? Please take into account that this may differ from architecture to architecture. In _many_ architectures, static local variables are addressed _relative to the instruction pointer_, while global variables are adressed absolute. So, this would mean, that it is dependent on the architecture if it is global or not. You see, this definition doesn't lead anywhere sensible – Ctx Jan 20 '16 at 11:46
  • @Ctx the distinction is in how access to the variable in the program is structured. On this level the language itself isn't really all that relevant: the conceptual program structure/design is what matters. If you'd diagram this in, say, UML object diagram/collaboration diagram, it's about cardinality/arity of the *possible* relationships. – user268396 Jan 20 '16 at 11:55
  • 1
    @user268396 In matters of the language construct, global and local are terms of the variable _scope_, you use the terms just plain wrong. – Ctx Jan 20 '16 at 11:57
  • 1
    @user268396 The definition of a global variable, is a variable which is accessible globally (big surprise). That is, it is accessible everywhere in the project. Therfore, a `static` variable can never be global by definition, at worst it can have _file scope_. File scope is not global scope. Overall, you are confusing _scope_ with _storage duration_, they are entirely different things. The singleton pattern has little to do with scope, but a lot to do with storage duration. – Lundin Jan 20 '16 at 12:04
  • @Ctx C doesn't even have global scope. I've always seen the word *global variable* as meaning *variable of static storage duration.* – fuz Jan 20 '16 at 12:18