2

I'm maintaining code written by someone just before they retired, which means I can't find them to ask questions. :-) This is basically a C++ wrapper to launch a program. The chunk of code in question is this:

BOOL bSuccess = CreateProcess(NULL, (char *)strBatFile.c_str(),
    NULL, NULL, TRUE, CREATE_NO_WINDOW, NULL, strLocalWorkingDir.c_str(), &si,  &pi );

   if( bSuccess )
   {
      DWORD dwMillisec = INFINITE;      
      DWORD dwWaitStatus = WaitForSingleObject( pi.hProcess, dwMillisec );

      if( dwWaitStatus == WAIT_OBJECT_0 )
      {
         DWORD dwExitCode = NULL;
         GetExitCodeProcess( pi.hProcess, &dwExitCode );
         nRet = (int)dwExitCode;
        }

      CloseHandle( pi.hThread );
      CloseHandle( pi.hProcess );
   }
   else
      nRet = START_PROCESS_FAILED;

If just one instance is run at a time, it always works fine. If multiple are run within a very short time frame, though, about half of them are having dwExitCode set to 1 instead of 0, even though the process isn't crashing, and the log file that internal program writes is completing.

So to clarify, the process is always starting fine, and it's always getting into the if statements, but it's the value of dwExitCode set by GetExitCodeProcess that isn't containing what's expected. Since we error check on this, we're flagging a bunch of these runs as incomplete when they in fact are fine.

Is there any way this value could be set to something different than the process exit code? And/or is there a utility I could run at the same time to confirm the exit codes are what I think they are?

Thanks!

ETA: Just realised this is putting the internal program call in a .bat file - "C:\\ --flags etc..." and then calling that as a command line in the second argument, rather than just calling it directly using lpApplicationName. No idea if this makes a difference! But when I print the PID of the process, I can see it's the PID for a cmd.exe process, and then our program has a child PID. However, when I trace in Process Monitor, I can see that both parent and child are exiting with exit code 0.

teleute00
  • 561
  • 6
  • 25
  • 2
    Are you sharing `pi` across multiple threads at the same time, by chance? Otherwise, the only ways this code can return an ExitCode of 1 is if either the spawned process really did return a value of 1, or if `nRet` is not initialized correctly (since a `dwWaitStatus` value other than `WAIT_OBJECT_0` does not update `nRet`). – Remy Lebeau Jun 01 '13 at 00:28
  • It seems very unlikely that `GetExitCodeProcess` would be broken - it is used quite regularly by Windows and third party code, both for command prompt and GUI programs. Are you sure you are passing the correct hProcess value to `GetExitCodeProcess`. – Mats Petersson Jun 01 '13 at 00:30
  • 2
    You're not checking the return value of `GetExitCodeProcess`. From the docs: "If the function succeeds, the return value is nonzero. If the function fails, the return value is zero. To get extended error information, call GetLastError.". Your code doesn't know if it succeeds or fails, because it doesn't check. – Ken White Jun 01 '13 at 00:50
  • Except for the missing checks, that code fragment looks reasonable but it is hard to give a definitive answer without looking at the launched program or knowing what it does: are you sure that it ALWAYS properly sets its exit code to 0 on success and 1 on error? There are different ways in which a process can set its exit code, listed on the MSDN `GetExitCodeProcess()` documentation. – Luis Jun 01 '13 at 03:06
  • @RemyLebeau - nRet is initialized earlier, and all looks fine on that front. I have to investigate the threading further (I'm very new to this type of coding), but what effect would that have? – teleute00 Jun 01 '13 at 03:47
  • @MatsPetersson - Oh, I'm sure it's not broken; I'm just wondering if it's being used correctly. There's only one hProcess (this exe is run once per process being launched; it's not doing multiple calls to this code) - so it should be correct. If there's a good way to double check, though, I certainly can. Still trying to figure out best way to verify. – teleute00 Jun 01 '13 at 03:49
  • @KenWhite - (Not my code! Just stuff I inherited...) Ah...so you think the times it's set to the incorrect value are actually times the function itself isn't completing correctly? I'm embarrassed to say I didn't think to check that. I had a lot of work just narrowing the issue to this portion of code, so I think I got a bit punchy. I'll check that as soon as I can get at the code again. – teleute00 Jun 01 '13 at 03:50
  • @Luis - The launched program is also one we've written internally, and I'm assured it always returns 0 unless it flat out crashes. And when I do this same thing with the previous wrapper written for this purpose, everything works fine. (The original is in vb, and also a number of features were added/changed in the rewrite, so sadly I can't compare the two wrappers directly.) But the exact same scenario being run with the same files works every single time with the old wrapper. – teleute00 Jun 01 '13 at 03:53
  • It's your code now, because you inherited it. :-) I think that when an API function has a return value, you should check it instead of assuming that it worked. We all know what happens when we assume something... :-) – Ken White Jun 01 '13 at 03:57
  • @KenWhite - More than fair. I do actually remember noticing we weren't doing anything with the return value, but I was frazzled enough by that point that I kind of dismissed it. Bad me, no biscuit. :-) – teleute00 Jun 01 '13 at 17:27
  • @KenWhite - Check added. Sadly, it's always succeeding, whether the behaviour is as desired or not, so that's not the culprit. :-( – teleute00 Jun 03 '13 at 16:02
  • Hmmm...I'm not sure if this is a valid check, but I had it print out the value of pi.hProcess to the log. Every time I run this, whether I run one instance or multiple, every time the hProcess handle shows as 00000000000000B8. (It shows 0000000000000000 just before the CreateProcess call, so it does seem to be initialized properly.) The pi.dwProcessId is different for each, though... – teleute00 Jun 03 '13 at 17:48
  • I doubt it's related but just in case: the question reminded me of this oldie: http://stackoverflow.com/questions/2061735/42-passed-to-terminateprocess-sometimes-getexitcodeprocess-returns-0 – Michael Burr Jun 03 '13 at 21:57

1 Answers1

3

Found it! The application itself was in fact returning a 0 error code...it was the shell around it that was returning 1. And that was due to that .bat file in the second argument. The name was being generated from time, so it ended up being exactly the same name if multiple instances were run too closely together. This is why the inner app would run fine...there was always a bat file there with that name. But there were access collisions when the different instances were trying to generate or clean up the bat, from what I can tell.

As a proof-of-concept hack I just added the current PID to the end of the file name, and everything worked perfectly. Now I just need to decide the real fix, which I think will likely be getting rid of the whole bat file mechanism entirely and calling the app directly.

Whew! Thanks for all the help, everyone! Sadly, the bit of code I included didn't have the offending line, but everyone's tips above helped me narrow in on the problem. :-)

teleute00
  • 561
  • 6
  • 25
  • If the batch file remains necessary, the Windows API provides a function (GetTempFileName) for generating unique temp file names: http://msdn.microsoft.com/en-us/library/windows/desktop/aa364991(v=vs.85).aspx – christopher_f Jun 04 '13 at 03:53
  • Honestly I have no idea why the batch file is even being used. There's a variable holding the command that went into the batch file, and if I use that as the second argument it seems to work perfectly fine. If anyone has any ideas why a batch file might be better, I'm all ears, but at the moment I'm just wanting to take it out. – teleute00 Jun 04 '13 at 17:32
  • CreateProcess can be a bit intimidating to work with (see this question http://stackoverflow.com/questions/1135784/createprocess-doesnt-pass-command-line-arguments for some examples of what can go wrong). The batch file was probably just a way for the previous maintainer to save themselves some headaches. If you have your solution working without the batch file, you're better off without it. – christopher_f Jun 04 '13 at 18:04