3

I observed something in a log file that I cannot explain:

All code in project is ANSI C, 32bit exe running on Windows 7 64bit

I have a worker function similar to this one, running in a single threaded program, using no recursion. During debugging logging was included as shown:

//This function is called from an event handler
//triggered by a UI timer similar in concept to 
//C# `Timer.OnTick` or C++ Timer::OnTick
//with tick period set to a shorter duration
//than this worker function sometimes requires
int LoadState(int state)
{
    WriteToLog("Entering ->");  //first call in
    //...
    //Some additional code - varies in execution time, but typically ~100ms.
    //...
    WriteToLog("Leaving  <-");//second to last call out

    return 0;
}  

The function above is simplified from our actual code but is sufficient for illustrating the issue.

Occasionally we have seen log entries such as this:

enter image description here

Where the time/date stamp is on the left, then message, the last field is duration in clock() ticks between calls to logging function. This logging indicates that the function was entered twice in a row before exiting.

Without recursion, and in a single threaded program, how is it (or is it) possible that execution flow can enter a function twice before the first call was completed?

EDIT: (to show top call of logging function)

int WriteToLog(char* str)
{
    FILE* log;
    char *tmStr;
    ssize_t size;
    char pn[MAX_PATHNAME_LEN];
    char path[MAX_PATHNAME_LEN], base[50], ext[5];
    char LocationKeep[MAX_PATHNAME_LEN];
    static unsigned long long index = 0;

    if(str)
    {
        if(FileExists(LOGFILE, &size))
        {
            strcpy(pn,LOGFILE);
            ManageLogs(pn, LOGSIZE);
            tmStr = calloc(25, sizeof(char));
            log = fopen(LOGFILE, "a+");
            if (log == NULL)
            {
                free(tmStr);
                return -1;
            }
            //fprintf(log, "%10llu %s: %s - %d\n", index++, GetTimeString(tmStr), str, GetClockCycles());
            fprintf(log, "%s: %s - %d\n", GetTimeString(tmStr), str, GetClockCycles());
            //fprintf(log, "%s: %s\n",  GetTimeString(tmStr), str);
            fclose(log);
            free(tmStr);
        }
        else
        {
            strcpy(LocationKeep, LOGFILE);
            GetFileParts(LocationKeep, path, base, ext);
            CheckAndOrCreateDirectories(path);
            tmStr = calloc(25, sizeof(char));
            log = fopen(LOGFILE, "a+");
            if (log == NULL)
            {
                free(tmStr);
                return -1;
            }
            fprintf(log, "%s: %s - %d\n", GetTimeString(tmStr), str, GetClockCycles());
            //fprintf(log, "%s: %s\n",  GetTimeString(tmStr), str);
            fclose(log);
            free(tmStr);
        }
    }
    return 0;
}
ryyker
  • 22,849
  • 3
  • 43
  • 87
  • 4
    How did you confirm there is only one thread? In particular, how do you know the UI timer is not creating a separate context to execute the callback? – jxh Apr 21 '14 at 19:58
  • @jxh - In the environment I use there are UI timers that by definition are run in the primary thread. There are other options, i.e. an AsyncTimer that does create its own thread, but in this case, I am using only the UI timer. – ryyker Apr 21 '14 at 20:00
  • On quick executing code, sometimes the output streams can come out of order. Try fflush(stdout) (or whereever you are sending your output) after each write to see what is really happening. – Marshall Tigerus Apr 21 '14 at 20:00
  • @MarshallTigerus Can you elaborate, I have Never seen that with a single thread. – this Apr 21 '14 at 20:01
  • Does the "some additional code" include calls that may install new timers? – jxh Apr 21 '14 at 20:01
  • @MarshallTigerus - `fflush()` is not a factor here. logging is done sequentially, all in a single thread. I cannot imagine there should be out of order writes to the file. Each write is a `fopen()/fclose()` call – ryyker Apr 21 '14 at 20:03
  • @self. I've seen it happen previously while using single-threaded unit tests. I had a method that I would call that would print some specific data, including timestamps and state information. There was no multithreading involved but it was all lightning fast code. Things would print out of order unless I used fflush in the right spot. Unfortunately I don't have the code on hand. – Marshall Tigerus Apr 21 '14 at 20:03
  • 2
    If the arrows are hardcoded, I don't understand who you can get `Entering ->` and `Entering <-` – Thomas Ayoub Apr 21 '14 at 20:03
  • @ꜱᴀᴍᴏᴛʜ: yes, that seems to be a big red flag. The only ones that appear to be out of order are also ones that don't match the shown code. – Oliver Charlesworth Apr 21 '14 at 20:04
  • @MarshallTigerus You either had UB or a weird custom buffer. I don't see how lightning fast code(?) has anything to do with it. – this Apr 21 '14 at 20:04
  • @jxh - No. First thing we checked. this UI timer in fact is a single object residing on the user-interface itself. Easy to check for us. – ryyker Apr 21 '14 at 20:04
  • @ryyker Use some print debugging to do further testing. Print the thread ID with the line to verify it is one line. Maybe throw in a counter somewhere to see which execution it is (not sure what the numbers at the end of your print statements represent) – Marshall Tigerus Apr 21 '14 at 20:05
  • @self. Not sure, it was probably a weird server (wasn't mine). – Marshall Tigerus Apr 21 '14 at 20:06
  • @SAMOTH - The log file is edited to simplify. (Missed the arrow). But the actual logfile follows the same double in, then double out. – ryyker Apr 21 '14 at 20:07
  • @OliCharlesworth - see response to SAMOTH. Logfile is edited to simplify question. – ryyker Apr 21 '14 at 20:09
  • 3
    @ryyker: Ok, but at this point there's not enough evidence to be able to help, AFAICS. If the code is truly single-threaded, and the log-function is truly sane, and there's truly no other piece of code that can be outputting to the log, then obviously this can't happen (UB notwithstanding). So at best, we can only speculate. You'll need to produce a [minimal test-case](http://stackoverflow.com/help/mcve), I think. – Oliver Charlesworth Apr 21 '14 at 20:09
  • @OliCharlesworth - I have never seen this behavior before. And there is a lot more to the executable (and the log file) than shown here. It is single threaded, and the actual log file clearly shows several entries like this. I agree with several commenters here that there has to be something missed. I will look again. – ryyker Apr 21 '14 at 20:13
  • 2
    The MS Win logger, invoked by calling 'WriteToLog()', has several modes. If the EVENT_TRACE_FILE_MODE_CIRCULAR mode is implemented, the MS documentation states "Note that the contents of the circular log file may appear out of order on multiprocessor computers." Also, the MS Doc states about the "EVENT_TRACE_NO_PER_PROCESSOR_BUFFERING" mode "Using this mode can eliminate the issue of events appearing out of order when events are being published on different processors using system time." – Mahonri Moriancumer Apr 21 '14 at 20:17
  • @MahonriMoriancumer: Nice one, this is probably the root cause, if ryyker confirms this is the MS Win logger. If so, he should have put that information in the question! This is probably related to MarshallTigerus' observation as well. – jxh Apr 21 '14 at 20:18
  • A cheap experiment to test re-entrantcy: add a counter `static char count[] = "0"; count[0]++; WriteToLog(count);` body_of_function `count[0]--;` (Note: Not foolproof) – chux - Reinstate Monica Apr 21 '14 at 20:25
  • @MahonriMoriancumer - The WriteToLog function is my own function (will edit my question with the top call). It is a simple `fopen()`-`fclose()` function with time/date stamp, also provided by me. But I will certainly look up and read about EVENT_TRACE_NO_PER_PROCESSOR_BUFFERING, et. al. to see if it sheds light. Thanks. – ryyker Apr 21 '14 at 20:28
  • @chux - Will try something like this. Thanks. I also suspect the UI-Timer provided by my environment. I plan to post this same question on a forum to that provider to see if even though their documentation is adamant this is a primary thread only timer, that possibly it launches another thread to provide timing services to UI in main thread, and sometimes looses control of sequence. – ryyker Apr 21 '14 at 20:33
  • @ryyker: "Primary thread only" can mean multiple things. The library could still possibly use on POSIX, or Fibers on Windows. – jxh Apr 21 '14 at 20:41
  • @jxh - I will read about _Fibers_, thank you. Given the adamant response by you, et. al. I think my question has been answered, i.e. that given the listed conditions, double entry into function is not possible. I therefore am starting to suspect my environment supplier's UI Timer. Will post a question to them to see if indeed there is a possibility that this could be running in a diff thread. ***By the way***, regarding your question: _if ryyker confirms this is the MS Win logger_. I do not think so. See my question edit with code for logger. Pretty much fopen()/fclose, nothing special. – ryyker Apr 21 '14 at 20:47
  • @MarshallTigerus - I would appreciate seeing a post if you are able to find that code snippet exhibiting similar behavior. Thanks for the input. – ryyker Apr 21 '14 at 20:55
  • @ryyker If I can find it I will post it – Marshall Tigerus Apr 21 '14 at 20:56

1 Answers1

1

I asked the question wondering at the time if there was some obscure part of the C standard that allowed execution flow to enter a function more than once without first exiting (given that multi-threading or recursion was not present)

Your comments, I believe, have clearly answered the question. Borrowing from what @Oli Charlesworth said in one comment, he summarizes it up pretty well:

If the code is truly single-threaded, and the log-function is truly sane, and there's truly no other piece of code that can be outputting to the log, then obviously this can't happen (UB notwithstanding).

But since the actual log files (which I could not post for proprietary reasons) on several occasions have demonstrated this pattern, one of the conditions @Oli Charlesworth listed is not actually true for our software. My best guess at this point, given that the logging function is sane and is the only input to the file, is to consider the alternate context/Fiber possibility suggested by @jxh:

"Primary thread only" can mean multiple things. The library could still possibly use <ucontext.h> on POSIX, or Fibers on Windows.

So, I will post this same question to the supplier of my environment, specifically if their UI Timers are run in such a way as to allow parallel calls due to a fiber or thread.

If any are interested, I will also update this answer with their response.

Edit to show conclusion:

As it turns out, the cause of the double entry of execution flow into a function was implicit recursion. That is, while the worker function did not reference itself explicitly, it was designated as the event handler for two separate event generators. That, coupled with a call to Process System Events (a function available in our environment forcing events in the queue to be processed now) can (and did) result in recursive execution flow into the event handler function. Here is a quote from a person who has expertise in the relationship between UI timers and system events in our environment:

"Timer events being nested" does equate to execution flow entering a function twice before leaving. Basically, it's the same thing as basic recursion: while you're inside one function, you call that same function. The only difference between this case and basic recursion is that the recursion call is implicit (via ProcessSystemEvents) and not explicit. But the end result is the same."

Community
  • 1
  • 1
ryyker
  • 22,849
  • 3
  • 43
  • 87
  • It's nice that you followed up your question with the answer from the perspective of how the library was implemented. However, please note that since your problem statement in the question (and you mention it in your first paragraph of this answer) specifically ruled out recursion, so no one would give you that as the answer. – jxh May 24 '14 at 04:22
  • 1
    @jxh - It was not my intent to specifically rule out recursion. I knew at the time that I did not use any _obvious_ recursive syntax in my code. Also, at the time I posted this, I had no idea that what was actually happening _could_ happen. The concept of _implicit recursion_ was presented to me for the first time when someone explained how the timer callback implementation in our libraries worked. You (and Oli) were the first to make statements addressing what turned out to be the actual issue in your comments, and I referenced these in my post. – ryyker May 28 '14 at 19:27