-1

I'm working on a C project using Pthread that needs to share some variables. There are several lines of code written yet and I just realized that using shared global variables doesn't work quite well because of the cache system.

I find on stackoverflow that a solution is to pass the adress of the variable (in my case it's more than one) to the thread function, what does it change?

Since my thread functions call other functions who will modify the globals, it's a bit painful to add a parameter to the chain of called functions where one function modify the globals.

So I was wondering, would it work to declare global pointers for each globals and use them to acess the global instead of the real globals?

I think it's a superficial inderiction but why wouldn't it work after all?

My program is an UDP network protocol whre networks look like rings or circled simple linked list. Instances of the program on the network are called entities.

An entity entity can insert on a ring or ask an entity to create another ring (dupplication), so the other entity would be on two ring.

The interface is sort of a shell where commands can leads to sending messages on the ring. Messages circle all over the rings after being stopped when they have ever been seen.

The shell is in the main thread, there is a thread for message treatment, another to manage insertion, and there is also a thread to detect broken rings. The problems is located in the ring tester thread. The thread initialize a global array (volatile short ring_check[NRING]) of size of the maximum ring numbers for an entity, initialize the first element with 0 according to the actual number of rings and the rest with -1, after that it send a test message in each ring and sleeps during a timeout of 30sec. When the timeout has finished, it checks for the values in the array.

The values are changed by the thread for message treatment, when a test message went went back, it detects it by its content and write -1 to the appropriate ring_check element.

The problem is that after a dupplication, the two rings are tested but the checking for the second failed (ring_check[1] == 0) and I really don't know why... The test message is received, immediately after the sending, after the message treatment modifies ring_check[1] to 0 I print it to see if the change is really made and it prints 1. But about 20 to 30sec later, the ring_tester wake up from his sleeping time and it reads 0 in ring_check[1].

short volatile ring_check[NRING+1];

// The function in the ring tester thread
static void test_ring() {
    // initialize ring_check array
    debug("ring_tester", GREEN "setting ring_check to -1...");
    char port_diff[5];
    // send test messages in each rings
    int fixed_nring = getnring();
    for (int i = fixed_nring+1; i < NRING; ++i) {
        ring_check[i] = -1;
    }
    for (int i = 0; i < fixed_nring + 1; i++) {
        debug("ring_tester", GREEN "setting ring_check %d to 0...", i);
        ring_check[i] = 0;
        itoa4(port_diff, ent.mdiff_port[i]);
        debug("ring_tester", GREEN "sending test to ring %d...", i);
        sendmessage(i, "TEST", "%s %s", ent.mdiff_ip[i], port_diff);
    }
    debug("test_ring", GREEN "timeout beginning...");
    sleep(timeout);
    debug("test_ring", GREEN "end of timeout.");

    for (int i = 0; i < fixed_nring + 1 && ring_check[i] != -1; i++) {
        debug("test_ring", GREEN "ring_check[%d]:%d", i, ring_check[i]);
        if (ring_check[i]) {
            debug("test_ring", GREEN "ring %d: checked.", i);
            continue;
        }
        else {
            debug("test_ring", GREEN "ring %d: checking failed. Ring broken...", i);
            continue;
        }
    }


 // The function called by the message treatment thread

static int action_test(char *message, char *content, int lookup_flag) {
    debug("action_test", RED "entering function...");
    if (content[15] != ' ' || content[20] != 0) {
        debug("action_test", RED "content not following the protocol."\
                "content: \"%s\"", content);
        return 1;
    }
    if (lookup_flag) {
        char mdiff_port[5];
        int fixed_nring = getnring();
        for (int i = 0; i < fixed_nring + 1 && ring_check[i] != -1; ++i) {
            itoa4(mdiff_port, ent.mdiff_port[i]);
            // find ring associated with message and actualize the checking
            if (strncmp(content, ent.mdiff_ip[i], 15) == 0 &&
                    strncmp(&content[16], mdiff_port, 4) == 0 &&
                    ring_check[i] != -1) {
                ring_check[i] = 1;
                debug("action_test", 
                        RED "correspondance found, ring_check[%d]:%d", i, ring_check[i]);
                return 0;
            }
        }
    }
    else {
        sendpacket_all(message);
    }
    return 0;
}
  • 1
    What do you mean by "using shared global variables doesn't work quite well because of the cache system.", and how is this different for pointers to shared objects? – EOF Apr 13 '16 at 19:18
  • I was talking about when you don't use the `volatile` keyword (http://stackoverflow.com/questions/78172/using-c-pthreads-do-shared-variables-need-to-be-volatile), but I was confused when I said that because the variables are declared `volatile`. The problem appears on a staticly allocated array `volatile short ring_check[10]` which is modified by a thread but another thread reads it and doesn't see the changes made. – Nicolas Scotto Di Perto Apr 13 '16 at 19:27
  • 1
    \*sigh\*. `volatile` is a red herring. The relevant thing is C11 draft standard n1570, *5.1.2.4 Multi-threaded executions and data races, 4 Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.[...] 25 The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.* This affects *all* shared object, whether they be global or not. – EOF Apr 13 '16 at 19:30
  • There's a 30sec delay between the writing and the reading so there's no race in that case. I edit the post tom make it clearer... – Nicolas Scotto Di Perto Apr 13 '16 at 19:34
  • @NicolasScottoDiPerto If that 30 seconds occurs in a way that is guaranteed to order the operations, then you are fine. But if it just "happens to happen", then you are not fine. Because this is such a hard thing to understand, why not just use a mutex and not worry about it? – David Schwartz Apr 13 '16 at 19:49
  • Other threads needs to modify the variable during the timeout, I tried to explain the program in the edited post. – Nicolas Scotto Di Perto Apr 13 '16 at 19:55
  • That's a lot of text containing about none of the pertinent information needed to decide how to prevent race-conditions in your program. – EOF Apr 13 '16 at 20:01
  • Sorry but the program is big and I didn't know what to put. I have extracted the essential now. – Nicolas Scotto Di Perto Apr 13 '16 at 20:10
  • Still not quite sure what you're asking. `pthread_create` only takes one parameter as the thread's function parameter, so if you need more than one in each thread's function you will need to pass them in a struct. If you're asking about UB, `ring_check` is clearly operated on in both thread functions without protection. If you're "sure" that your mandated 30 sec sleep synchronizes the threads (you shouldn't lie to yourself), don't spawn the thread that writes 0 to it and see what reads back. It could be there is a buffer overflow somewhere else that's corrupting the data. – yano Apr 13 '16 at 21:09
  • Ok so I have just tested to declare the global `short volatile *rc = ring_check`, I use it in the threads instead of `ring_check` and it just works! That is so strange... So what happens? It can only be a problem due to a cache system, I don't know which, but I don't see any other explanation. So why would everyone pass a struct of pointers if having a global pointer works? – Nicolas Scotto Di Perto Apr 14 '16 at 09:42
  • @NicolasScottoDiPerto: I don't know who "everyone" is, but I suspect they don't use globals because globals are a mess, and they don't use shared, modifiable globals because they don't like data-races resulting in undefined behavior. – EOF Apr 14 '16 at 10:38

1 Answers1

0

You could define a global structure such as thread_inputparam. Put all the global variables' addresses in it and send to all threads, the adress of this structure variable.

int global1;
struct thread_input {
    int *g1;
   // add other globals'addresses
}thread_inputparam;
thread_inputparam.g1=&global1;
G. Emadi
  • 230
  • 1
  • 8
  • Yes it's the first idea I think of, but it would cause a lot of refactoring in my code because there's a long chain of function after the threads creations and the variables are only access at the end of the chain, so it wouldn't be elegant to add a parameter to each functions of the chain, they have nothing to do with the variables. Thread can access global variables, but they have a cache copy for each variables and it leads to concurrency inconsistency. So I wonder if having a global pointer instead would trick that mechanism. – Nicolas Scotto Di Perto Apr 13 '16 at 19:07
  • I do not understand the cache copy clearly.could you clarify this? – G. Emadi Apr 13 '16 at 19:23
  • Sorry I was confused, I thought about the problems that may occur on non-`volatile` variables, but mine are so don't take it into consideration. – Nicolas Scotto Di Perto Apr 13 '16 at 19:29
  • @NicolasScottoDiPerto The same problems can occur on `volatile` variables. The `volatile` keyword just inhibits compiler optimizations but the compiler is not the only thing that can optimize memory accesses. – David Schwartz Apr 13 '16 at 20:11