I'm designing a scheduling algorithm that has the following features:

  • It has 2 user threads (contexts) in a single process. (I'm supposed to support 3 threads, but that doesn't work on OS X yet, so I decided to get 2 working first.)
  • It is preemptive: a SIGALRM signal goes off every 1 second, and the handler saves the current state (registers and current position) of the context that was running and then switches control to the other context.

What I have noticed is the following:

  • The ucontext.h library behaves strangely on Mac OS X, whereas on Linux it behaves exactly as it is supposed to. The example from this man page, http://man7.org/linux/man-pages/man3/makecontext.3.html, works perfectly on Linux, whereas on Mac it fails with a segmentation fault before it does any swapping. Unfortunately, I have to make this run on OS X, not Linux.
  • I managed to work around the swapcontext() error on OS X by calling getcontext() and then setcontext() to swap contexts.
  • My signal handler uses the sa_sigaction signature ( int sig, siginfo_t *s, void * cntxt ), because the third argument, once cast to a ucontext_t pointer, describes the context that was interrupted. That held on Linux when I tested it, but on Mac it doesn't point to the proper location: when I use it I get a segmentation fault yet again.

I have designed the test function for each context to spin inside a while loop, since I want to interrupt the contexts and make sure they resume at the proper location within that function. I have defined a static global count variable that helps me see whether I was in the proper user thread or not.

One last note: I found that calling getcontext() inside the while loop of the test functions constantly updates the saved position of the current context (since it is an otherwise empty while loop), so calling setcontext() when that context's turn comes makes it execute from the proper place. This is not a usable solution, though, since these functions will be provided from outside the API and cannot be expected to call getcontext() themselves.

    #include <stdio.h>
    #include <sys/ucontext.h>
    #include <signal.h>   // sigaction, siginfo_t, SIGALRM
    #include <string.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <stdbool.h>
    #include <errno.h>

    /*****************************************************************************/
    /*                            time-utility                                   */
    /*****************************************************************************/

    #include <sys/time.h> // struct timeval

    void timeval_add_s( struct timeval *tv, uint64_t s ) {
        tv->tv_sec += s;
    }

    void timeval_diff( struct timeval *c, struct timeval *a, struct timeval *b ) {

        // use signed variables
        long aa;
        long bb;
        long cc;

        aa = a->tv_sec;
        bb = b->tv_sec;
        cc = aa - bb;
        cc = cc < 0 ? -cc : cc;
        c->tv_sec = cc;

        aa = a->tv_usec;
        bb = b->tv_usec;
        cc = aa - bb;
        cc = cc < 0 ? -cc : cc;
        c->tv_usec = cc;

    out:
        return;
    }

    /******************************************************************************/
    /*                              Variables                                    */
    /*****************************************************************************/
    static int count;

    /* For now only the T1 & T2 are used */
    static ucontext_t T1, T2, T3, Main, Main_2;
    ucontext_t *ready_queue[ 4 ] = { &T1, &T2, &T3, &Main_2 };

    static int thread_count;
    static int current_thread;

    /* timer struct */
    static struct itimerval a;
    static struct timeval now, then;

    /* SIGALRM struct */
    static struct sigaction sa;

    #define USER_THREAD_SWICTH_TIME 1

    static int check;

    /******************************************************************************/
    /*                                 signals                                    */
    /*****************************************************************************/

    void handle_schedule( int sig, siginfo_t *s, void * cntxt ) {
        ucontext_t * temp_current = (ucontext_t *) cntxt;

        if( check == 0 ) {
            check = 1;
            printf("We were in main context user-thread\n");
        } else {
            ready_queue[ current_thread - 1 ] = temp_current;
            printf("We were in User-Thread # %d\n", count );
        }

        if( current_thread == thread_count ) {
            current_thread = 0;
        }
        printf("---------------------------X---------------------------\n");

        setcontext( ready_queue[ current_thread++ ] );

    out:
        return;
    }

    /* initializes the signal handler for SIGALRM, sets all the values for the alarm */
    static void start_init( void ) {
        int r;

        sa.sa_sigaction = handle_schedule;
        sigemptyset( &sa.sa_mask );
        sa.sa_flags = SA_SIGINFO;

        r = sigaction( SIGALRM, &sa, NULL );
        if( r == -1 ) {
            printf("Error: cannot handle SIGALARM\n");
            goto out;
        }

        gettimeofday( &now, NULL );
        timeval_diff( &( a.it_value ), &now, &then );

        timeval_add_s( &( a.it_interval ), USER_THREAD_SWICTH_TIME );
        setitimer( ITIMER_REAL, &a, NULL );

    out:
        return;
    }

    /******************************************************************************/
    /*                      Thread Init                                           */
    /*****************************************************************************/

    static void thread_create( void * task_func(void), int arg_num, int task_arg ) {
        ucontext_t* thread_temp = ready_queue[ thread_count ];

        getcontext( thread_temp );

        thread_temp->uc_link = NULL;
        thread_temp->uc_stack.ss_size = SIGSTKSZ;
        thread_temp->uc_stack.ss_sp = malloc( SIGSTKSZ );
        thread_temp->uc_stack.ss_flags = 0;

        if( arg_num == 0 ) {
            makecontext( thread_temp, task_func, arg_num );
        } else {
            makecontext( thread_temp, task_func, arg_num, task_arg );
        }

        thread_count++;

    out:
        return;
    }

    /******************************************************************************/
    /*                            Testing Functions                               */
    /*****************************************************************************/

    void thread_funct( int i ) {

        printf( "---------------------------------This is User-Thread #%d--------------------------------\n", i );
        while(1) { count = i;} //getcontext( ready_queue[ 0 ] );}

    out:
        return;
    }

    void thread_funct_2( int i ) {
        printf( "---------------------------------This is User-Thread #%d--------------------------------\n", i );
        while(1) { count = i;} //getcontext( ready_queue[ 1 ] ); }

    out:
        return;
    }

    /******************************************************************************/
    /*                               Main Functions                               */
    /*****************************************************************************/

    int main( void ) {
        int r;
        gettimeofday( &then, NULL );

        thread_create( (void *)thread_funct, 1, 1);
        thread_create( (void *)thread_funct_2, 1, 2);

        start_init();

        while(1);

        printf( "completed\n" );

    out:
        return 0;
    }
  • What am I doing wrong here? I have to change this code a bit to make it run properly on Linux, and running the version that works on Linux on OS X causes a segmentation fault. Why would it work on one OS and not the other?
  • Is this related by any chance to the stack size I allocate for each context?
  • Am I supposed to allocate stack space for my signal handler? (It says that if I don't, a default stack is used, and if I do, it doesn't really make a difference.)
  • If ucontext will never give predictable behavior on Mac OS X, then what is the alternative for implementing user threading on OS X? I tried using setjmp & longjmp but ran into the same issue: when a context is interrupted in the middle of executing some function, how can I get the exact position where it was interrupted so that it continues from there next time?
JJ Adams
  • OS X is based on BSD UNIX; Linux is an independent implementation. Instead of relying on Linux manual pages for your behavior expectations, you should be looking at the [POSIX docs](http://pubs.opengroup.org/onlinepubs/009695399/functions/setcontext.html), with which both implementations ought to comply. – John Bollinger Nov 13 '15 at 16:44
  • Alternatively, if you care only about OS X then [OS X-specific docs](https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/getcontext.3.html) would be reasonable. – John Bollinger Nov 13 '15 at 16:54
  • It's suspicious that your thread functions call `getcontext()`. They should not have to do that, and it is possible that doing it will in fact cause problems. – John Bollinger Nov 13 '15 at 17:08
  • To the best of my knowledge and research ability, there is no way to implement user-context thread preemption whose correct behavior is ensured by current standards. A signal-based approach such as you are trying to implement is the most likely candidate, but the required behavior (resuming the context passed as the third argument to a signal handler) these days has unspecified behavior. Indeed, the whole user context framework is these days obsolescent. – John Bollinger Nov 13 '15 at 17:41
  • @JohnBollinger It shouldn't be calling getcontext() in my thread function. I just edited the code and fixed it. I had that in when I was testing earlier. This is the implementation that causes the seg fault. Also, using the Linux manual, the POSIX manual, or the OS X-specific docs doesn't make a difference, as they all provide almost the same information about ucontext.h – JJ Adams Nov 13 '15 at 18:30
  • Note that although it's probably not an issue for your test program, signal handlers can safely call only a smallish subset of the standard library functions -- those marked "async-signal safe". There is a list in [the POSIX specs](http://pubs.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html), and it does not include `printf()`. Perhaps more importantly, it does not include `setcontext()`, though I guess that's to be expected. – John Bollinger Nov 13 '15 at 18:59
  • Also, I find that your code consistently segfaults for me in Linux / kernel 2.6.32 / glibc 2.12, always after the sixth context switch. Gdb tells me that the segfault happens in `setcontext()`. Valgrind reports a bunch of invalid accesses, which I am looking at, but ultimately the program seems not to segfault when running under valgrind – John Bollinger Nov 13 '15 at 19:07
  • @JohnBollinger regarding the 1st comment: I have tried changing the signal handler by moving all the code to another function, so that the handler's only job is to set a static global flag to 1. That in turn causes the other function to execute, do all the necessary checks, reset the global flag to 0, and then setcontext() to the next context. Still no change in terms of the seg fault. – JJ Adams Nov 13 '15 at 19:24
  • @JohnBollinger regarding your last comment: I do find that this code seg faults on Linux as well, which is why I make some changes to it, and then it runs perfectly. But even that modified code doesn't run on OS X. I find this behavior so strange. – JJ Adams Nov 13 '15 at 19:27
  • I'm afraid I don't have an answer for you, but I do note that it is not correct to convert a function pointer to an object pointer or the other way around, as your code does. Also, as currently written, your model cannot support threads exiting. The first thread that exits will terminate the program. Overall, although user contexts can be used for cooperative multithreading, they aren't safe for preemptive multithreading, on account of `setcontext()` and `swapcontext()` not being async signal safe. Solving that would require special coding of the thread functions. – John Bollinger Nov 13 '15 at 23:13
  • @JohnBollinger where am I converting a function pointer to an object pointer? Also, I do realize that when the 1st thread exits the program terminates, and I have a solution for that. I'm just waiting to get this running before integrating it. Thank you for your help though – JJ Adams Nov 14 '15 at 00:28
  • This is such a pointer conversion: `(void *)thread_funct`. – John Bollinger Nov 14 '15 at 03:19

1 Answer

So after days of testing and debugging I finally got this working. I had to dig deep into the implementation of ucontext.h and found differences between the two OSes: the OS X implementation of ucontext.h is different from the Linux one. For instance, the mcontext_t struct within the ucontext_t struct, which usually holds the register values (IP, SP, BP, general registers...) of each context, is declared as a pointer on OS X, whereas on Linux it is not. There were a couple of other differences that needed to be handled, especially the context's stack pointer (rsp), base pointer (rbp), instruction pointer (rip) and destination index (rdi) registers. All of these had to be set correctly at the creation of each context, as well as after it returns for the first time. I also had to create an mcontext struct to hold these registers and have my ucontext_t struct's uc_mcontext pointer point to it. After all that was done, I was able to use the pointer passed as the third argument to the sa_sigaction signal handler (after casting it to a ucontext_t pointer) to resume exactly where the context left off last time. Bottom line: it was a messy affair. Anyone interested in more details can msg me. JJ out.
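
To illustrate just the pointer-versus-struct difference (not the full fix), here is a rough sketch that reads the interrupted instruction pointer from the handler's third argument on x86-64. The names `on_signal` and `capture_interrupted_pc` are hypothetical, SIGUSR1 with `raise()` is used only to keep the example deterministic, and on any other platform the sketch simply records 0.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE           /* glibc: exposes REG_RIP */
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/ucontext.h>

/* Fallback in case REG_RIP was not exposed: its value in glibc's x86-64 enum. */
#if defined(REG_RIP)
#define PC_INDEX REG_RIP
#else
#define PC_INDEX 16
#endif

static volatile uintptr_t interrupted_pc;
static volatile sig_atomic_t handler_ran;

/* SA_SIGINFO handler: the third argument is really a ucontext_t pointer. */
static void on_signal(int sig, siginfo_t *info, void *raw) {
    ucontext_t *uc = (ucontext_t *)raw;
    (void)sig;
    (void)info;
#if defined(__APPLE__) && defined(__x86_64__)
    /* OS X: uc_mcontext is a POINTER to the machine context. */
    interrupted_pc = (uintptr_t)uc->uc_mcontext->__ss.__rip;
#elif defined(__linux__) && defined(__x86_64__)
    /* Linux/glibc: uc_mcontext is an embedded STRUCT. */
    interrupted_pc = (uintptr_t)uc->uc_mcontext.gregs[PC_INDEX];
#else
    (void)uc;
    interrupted_pc = 0;       /* platform not covered by this sketch */
#endif
    handler_ran = 1;
}

/* Installs the handler, raises the signal once, and reports what it saw.
   Returns 0 on success, -1 if the handler could not be installed or run. */
int capture_interrupted_pc(uintptr_t *pc_out) {
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    handler_ran = 0;
    if (raise(SIGUSR1) != 0 || !handler_ran)
        return -1;            /* raise() returns only after the handler ran */

    *pc_out = interrupted_pc;
    return 0;
}
```

The same `->` versus `.` distinction applies to the other registers mentioned above (rsp, rbp, rdi), which is why code written against the Linux layout does not carry over unchanged.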

JJ Adams