5

I'm invoking various command line tools via NSTask. The tools may run for several seconds, and output text constantly to stdout. Eventually, the tool will terminate on its own. My app reads its output asynchronously with readInBackgroundAndNotify.

If I stop processing the async output as soon as the tool has exited, I will often lose some of its output that hasn't been delivered by then.

Which means I have to wait a little longer, allowing the RunLoop to process pending read notifications. How do I tell when I've read everything the tool has written to the pipe?

This problem can be verified in the code below by removing the line with the runMode: call - then the program will print that zero lines were processed. So it appears that at the time the process has exited, there's already a notification in the queue that is waiting to be delivered, and that delivery happens thru the runMode: call.

Now, it might appear that simply calling runMode: once after the tool's exit may be enough, but my testing shows that it isn't - sometimes (with larger amounts of output data), this will still only process parts of the remaining data.

Note: A work-around such as making the invoked tool outout some end-of-text marker is not a solution I seek. I believe there must be some proper way to do this, whereby the end of the pipe stream is signalled somehow, and that's what I'm looking for in an answer.

Sample Code

The code below can be pasted into a new Xcode project's AppDelegate.m file.

When run, it invokes a tool that generates some longer output and then waits for the termination of the tool with waitUntilExit. If it would then immediately remove the outputFileHandleReadCompletionObserver, most of the tool's output would be missed. By adding the runMode: invocation for the duration of a second, all output from the tool is received - Of course, this timed loop is less than optimal.

And I would like to keep the runModal function synchronous, i.e. it shall not return before it has received all output from the tool. It does run in its own tread in my actual program, if that matters (I saw a comment from Peter Hosey warning that waitUntilExit would block the UI, but that would not be an issue in my case).

- (void)applicationDidFinishLaunching:(NSNotification *)aNotification
{
    [self runTool];
}

- (void)runTool
{
    // Retrieve 200 lines of text by invoking `head -n 200 /usr/share/dict/words`
    NSTask *theTask = [[NSTask alloc] init];
    theTask.qualityOfService = NSQualityOfServiceUserInitiated;
    theTask.launchPath = @"/usr/bin/head";
    theTask.arguments = @[@"-n", @"200", @"/usr/share/dict/words"];

    __block int lineCount = 0;

    NSPipe *outputPipe = [NSPipe pipe];
    theTask.standardOutput = outputPipe;
    NSFileHandle *outputFileHandle = outputPipe.fileHandleForReading;
    NSString __block *prevPartialLine = @"";
    id <NSObject> outputFileHandleReadCompletionObserver = [[NSNotificationCenter defaultCenter] addObserverForName:NSFileHandleReadCompletionNotification object:outputFileHandle queue:nil usingBlock:^(NSNotification * _Nonnull note)
    {
        // Read the output from the cmdline tool
        NSData *data = [note.userInfo objectForKey:NSFileHandleNotificationDataItem];
        if (data.length > 0) {
            // go over each line
            NSString *output = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
            NSArray *lines = [[prevPartialLine stringByAppendingString:output] componentsSeparatedByString:@"\n"];
            prevPartialLine = [lines lastObject];
            NSInteger lastIdx = lines.count - 1;
            [lines enumerateObjectsUsingBlock:^(NSString *line, NSUInteger idx, BOOL * _Nonnull stop) {
                if (idx == lastIdx) return; // skip the last (= incomplete) line as it's not terminated by a LF
                // now we can process `line`
                lineCount += 1;
            }];
        }
        [note.object readInBackgroundAndNotify];
    }];

    NSParameterAssert(outputFileHandle);
    [outputFileHandle readInBackgroundAndNotify];

    // Start the task
    [theTask launch];

    // Wait until it is finished
    [theTask waitUntilExit];

    // Wait one more second so that we can process any remaining output from the tool
    NSDate *endDate = [NSDate dateWithTimeIntervalSinceNow:1];
    while ([NSDate.date compare:endDate] == NSOrderedAscending) {
        [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode beforeDate:[NSDate dateWithTimeIntervalSinceNow:0.1]];
    }

    [[NSNotificationCenter defaultCenter] removeObserver:outputFileHandleReadCompletionObserver];

    NSLog(@"Lines processed: %d", lineCount);
}
Thomas Tempelmann
  • 11,045
  • 8
  • 74
  • 149

2 Answers2

1

It's quite simple. In the observer block when data.length is 0 remove the observer and call terminate.

The code will continue after the waitUntilExit line.

- (void)runTool
{
    // Retrieve 20000 lines of text by invoking `head -n 20000 /usr/share/dict/words`
    const int expected = 20000;
    NSTask *theTask = [[NSTask alloc] init];
    theTask.qualityOfService = NSQualityOfServiceUserInitiated;
    theTask.launchPath = @"/usr/bin/head";
    theTask.arguments = @[@"-n", [@(expected) stringValue], @"/usr/share/dict/words"];

    __block int lineCount = 0;
    __block bool finished = false;

    NSPipe *outputPipe = [NSPipe pipe];
    theTask.standardOutput = outputPipe;
    NSFileHandle *outputFileHandle = outputPipe.fileHandleForReading;
    NSString __block *prevPartialLine = @"";
    [[NSNotificationCenter defaultCenter] addObserverForName:NSFileHandleReadCompletionNotification object:outputFileHandle queue:nil usingBlock:^(NSNotification * _Nonnull note)
    {
        // Read the output from the cmdline tool
        NSData *data = [note.userInfo objectForKey:NSFileHandleNotificationDataItem];
        if (data.length > 0) {
            // go over each line
            NSString *output = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
            NSArray *lines = [[prevPartialLine stringByAppendingString:output] componentsSeparatedByString:@"\n"];
            prevPartialLine = [lines lastObject];
            NSInteger lastIdx = lines.count - 1;
            [lines enumerateObjectsUsingBlock:^(NSString *line, NSUInteger idx, BOOL * _Nonnull stop) {
                if (idx == lastIdx) return; // skip the last (= incomplete) line as it's not terminated by a LF
                // now we can process `line`
                lineCount += 1;
            }];
        } else {
            [[NSNotificationCenter defaultCenter] removeObserver:self name:NSFileHandleReadCompletionNotification object:nil];
            [theTask terminate];
            finished = true;
        }
        [note.object readInBackgroundAndNotify];
    }];

    NSParameterAssert(outputFileHandle);
    [outputFileHandle readInBackgroundAndNotify];

    // Start the task
    [theTask launch];

    // Wait until it is finished
    [theTask waitUntilExit];

    // Wait until all data from the pipe has been received
    while (!finished) {
        [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode beforeDate:[NSDate dateWithTimeIntervalSinceNow:0.0001]];
    }

    NSLog(@"Lines processed: %d (should be: %d)", lineCount, expected);
}
Thomas Tempelmann
  • 11,045
  • 8
  • 74
  • 149
vadian
  • 274,689
  • 30
  • 353
  • 361
  • This does not meet my need to have the `runTool` function to be synchronous, as it would return before all data has been received, but I can probably solve that by adding a semaphore. –– Actually, your posted code is not outputting what it's supposed to do, compared to my original. – Thomas Tempelmann Jan 12 '20 at 19:25
  • Actually `waitUntilExit` makes it synchronous, the `NSLog` line is called after all data are received. And your method does not return anything. – vadian Jan 12 '20 at 19:29
  • In my testing, your code outputs "Lines processed: 0", but it should say 200. Does it show 200, not 0, for you? – Thomas Tempelmann Jan 12 '20 at 19:34
  • Then it must be a race condition, because in my test it only prints 200 if I still include the `runMode:beforeDate:` call between `waitUntilExit` and the last line. – Thomas Tempelmann Jan 12 '20 at 19:36
  • I/ using this pattern in a few projects to read data asynchronously from the shell. However I'm using a strong reference to the task (an instance variable) – vadian Jan 12 '20 at 19:40
  • To fix the issue, I have to add a "done" flag that I in the callback, then have a `while (!done)?`loop after `waitUntilExit`, calling `[[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode beforeDate:[NSDate dateWithTimeIntervalSinceNow:0.00001]];`. On my machine, this will invoke the loop 3 times before it finds that "done" is set. – Thomas Tempelmann Jan 12 '20 at 19:41
  • Is this a command line interface (CLI)? Those behave different – vadian Jan 12 '20 at 19:44
  • I do believe that you're wrong on the assumption that `waitUntilExit` will return only if you call `[theTask terminate]`. Instead, the task may terminate sooner, and then we have the race condition that requires extra waiting. At least, your added check for zero-length data seems to do the trick that I was looking for. I am testing this on 10.13.6 on a very fast Mac, BTW. – Thomas Tempelmann Jan 12 '20 at 19:44
  • I run the test code as a NSApp, from within `applicationDidFinishLaunching:` – Thomas Tempelmann Jan 12 '20 at 19:45
  • Would you please add the done flag and while loop to your code? Otherwise, it's not really correctly working, and accepting it as the right answer would give the wrong result to others if they find this answer. Try increasing the amount of returned lines from the invoked tool, i.e. change the "200" to "20000" - hopefully you'll then see the issue as well. – Thomas Tempelmann Jan 12 '20 at 19:57
  • There is the same issue but couldn't it be that your way to parse the text is the reason. I'm using this pattern for example to display the progress of an `rsync` copy process and it shows everything reliably. – vadian Jan 12 '20 at 20:11
  • I'm using the exact sample code above in a new "Cocoa App" project, pasted into AppDelegate.m, calling it from `applicationDidFinishLaunching:`, and built with Xcode 10.1 on 10.13.6, as well as Xcode10.2.1 on 10.14.6, on different Macs, and I get the issue that not all lines are counted. When I run the same built app on 10.15.2 (in VMware Fusion), it shows the same issue, too. I don't understand why you don't see this happen. Here's a copy of the prj: http://files.tempel.org/tmp/NSTaskOutput.zip – Thomas Tempelmann Jan 13 '20 at 11:14
  • I do *see this happen* but once again could it be that the code to count the lines causes the behavior? – vadian Jan 13 '20 at 11:18
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/205960/discussion-between-thomas-tempelmann-and-vadian). – Thomas Tempelmann Jan 14 '20 at 19:27
0

The problem with waitUntilExit is that it doesn't always behave the way one might think. The following is mentioned in the documenation:

waitUntilExit does not guarantee that the terminationHandler block has been fully executed before waitUntilExit returns.

It appears this is precisely the problem you are having; it's a race condition. The waitUntilExit is not waiting long enough and the lineCount variable is reached before the NSTask completes. The solution would likely be to use a semaphore or dispatch_group, although it's unclear if you want to go that route — this is not an easy problem to resolve it seems.

*I experienced a similar issue from months back that still isn't resolved unfortunately.

l'L'l
  • 44,951
  • 10
  • 95
  • 146
  • There is no event where you could release the semaphore, other than what vadian's answer has already revealed, i.e. in the data receive handler, when the data length is zero. Also, where do you think this `terminationHandler` block is involved? My demo code calls a cmdline tool (which is plain C, without such a block), nor does my Cocoa code involve it. While you quote some partially relevant docs, nothing in your answer is helpful to solving the problem as far as I can see. – Thomas Tempelmann Jan 14 '20 at 01:28
  • Since you mention having had similar problems yourself in the past - have you understood vadian's solution, along with my comments? IMO, the fact that `waitUntilExit` seems to return too soon is not a bug, but the only problem is that Apple has not clearly documented how to detect the EOF of the async pipe. vadian shows a solution (i.e. check for length==zero), but I wonder if that's been documented somewhere. It works in tests across all macOS versions, at least. – Thomas Tempelmann Jan 14 '20 at 01:32
  • @ThomasTempelmann: The `waitUntilExit` is the entire problem, and Apple does not give a solution. I didn’t try vadians suggestion, although it seems logical that it could work for this particular scenario. The problem still exists regardless, and for situations that aren’t reading a file it becomes worse. If the NSTask is doing something else then what is the solution? This is where detecting EOF isn’t necessarily going to help. – l'L'l Jan 14 '20 at 18:19