29

I have an iOS application that's crashing on calls like __destroy_helper_block_253 and __destroy_helper_block_278 and I'm not really sure what either "destroy_helper_block" is referencing or what the number after it is supposed to point to.

Does anyone have any pointers for how to track down where exactly these crashes might be occurring?

Here's an example traceback (note that the lines with __destroy_helper_block reference only the file they're contained in and nothing else, when normally the line number would be included as well).

Thread : Crashed: com.apple.root.default-priority
0  libdispatch.dylib              0x000000018fe0eb2c _dispatch_semaphore_dispose + 60
1  libdispatch.dylib              0x000000018fe0e928 _dispatch_dispose + 56
2  libdispatch.dylib              0x000000018fe0e928 _dispatch_dispose + 56
3  libdispatch.dylib              0x000000018fe0c10c -[OS_dispatch_object _xref_dispose] + 60
4  Example App                    0x00000001000fe5a4 __destroy_helper_block_278 (TSExampleApp.m)
5  libsystem_blocks.dylib         0x000000018fe53908 _Block_release + 256
6  Example App                    0x00000001000fda18 __destroy_helper_block_253 (TSExampleApp.m)
7  libsystem_blocks.dylib         0x000000018fe53908 _Block_release + 256
8  libdispatch.dylib              0x000000018fe0bfd4 _dispatch_client_callout + 16
9  libdispatch.dylib              0x000000018fe132b8 _dispatch_root_queue_drain + 556
10 libdispatch.dylib              0x000000018fe134fc _dispatch_worker_thread2 + 76
11 libsystem_pthread.dylib        0x000000018ffa16bc _pthread_wqthread + 356

Edit 1: Here's an example of one of the blocks defined in the file where the crash occurs (with application-specific code edited out).

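- (void)doSomethingWithCompletion:(void (^)())completion {
    // The literal now declares the NSString * parameter its type promises;
    // `result` is a placeholder name, since the original argument was edited out.
    void (^ExampleBlock)(NSString *) = ^(NSString *result) {
        NSNotification *notification = [NSNotification notificationWithName:kExampleNotificationName object:nil userInfo:nil];
        [[NSNotificationCenter defaultCenter] postNotification:notification];

        // Calling a NULL block crashes, so guard the call.
        if (completion) {
            completion();
        }
    };

    // Async network call that calls ExampleBlock on either success or failure below...
}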

There are many other blocks in the file, but most of them are provided as arguments to methods instead of being defined first and then referenced later.

Edit 2: Added more context to above function.

Dan Loewenherz
  • See http://stackoverflow.com/questions/9273954/blocks-within-nsoperation – rmaddy May 09 '14 at 20:09
  • Most likely you retained the block incorrectly. Make sure to `copy` it for storage. – Léo Natan May 09 '14 at 21:02
  • @LeoNatan I assumed that much, but I was wondering how I could track down which block I should be looking at. There are dozens of them within this file. I was thinking that maybe that number suffix might be some indication but I can't figure out what it's supposed to mean. – Dan Loewenherz May 09 '14 at 21:04
  • The number is a compiler-assigned identifier. And the number you have is that of the block's destroy helper, not of the block itself. I would enable zombies and try to reproduce the crash to see which object was released. That may show you which block was released prematurely. – Léo Natan May 09 '14 at 21:09
  • Did you try the static analyser? It's quite good at finding all kinds of problems. Do you have any retain properties containing blocks? Bad mistake, that will crash. Retaining any blocks instead of copying? – gnasher729 May 09 '14 at 21:47
  • @gnasher729 Haven't tried the static analyzer...will give that a shot. Re: retaining block properties, no. – Dan Loewenherz May 09 '14 at 21:50
  • Static analyzer found no issues in this file. I'll start a bounty. – Dan Loewenherz May 11 '14 at 23:19
  • What about enabling zombie objects? – Rivera May 12 '14 at 00:55
  • I really recommend enabling zombies, as Rivera and Leo Natan suggested. Usually my problems with blocks are due to zombies or to calling a block that was never set. Calling a block that is NULL crashes the app, so always check that it exists before calling it. – Andrea May 12 '14 at 09:06
  • Another suggestion: enable exception breakpoints. – Jacopo Berta May 12 '14 at 09:30
  • @Andrea The biggest problem is that I haven't been able to recreate the bug on my end. I need to use the traceback to track it down to a line number if possible. – Dan Loewenherz May 12 '14 at 12:59
  • do lines `278` and `253` correspond to line numbers containing blocks in `TSExampleApp.m`? – Brad Allred May 13 '14 at 15:13
  • @BradAllred no, they do not – Dan Loewenherz May 13 '14 at 15:38
  • @Dan Sorry, I now realize you already had this thought. I will try to come back with something more helpful next time :) – Brad Allred May 13 '14 at 15:49
  • @BradAllred no worries! – Dan Loewenherz May 13 '14 at 17:25
  • Code for your blocks would be helpful. Something such as not unregistering an observer or a not-invalidated timer could be some reasons. – Rivera May 14 '14 at 05:08
  • can you post the code that creates the block in `TSExampleApp.m`? – Bryan Chen May 14 '14 at 05:19
  • [Understanding and Analyzing iOS Application Crash Reports](https://developer.apple.com/library/ios/technotes/tn2151/_index.html) – Hemang May 14 '14 at 05:23
  • @Rivera Just posted some example code of the blocks that are defined explicitly instead of being passed right to other methods. – Dan Loewenherz May 14 '14 at 12:25
  • Maybe try posting to `NSNotificationCenter` in a `dispatch_async` on the main queue? – mattt May 14 '14 at 15:33
  • @mattt I'll give that a shot and follow up. Thanks! – Dan Loewenherz May 14 '14 at 20:01
  • What's the `completion` block and where is it defined? Looks unusual to call that block from inside your block. Maybe more code/context would help understanding. – Rivera May 15 '14 at 01:15
  • @Rivera This block (`ExampleBlock`) is called after a network call, in the case of both success or failure. `completion()` is called to pass execution back to the original block. I'll add an edit to illustrate. – Dan Loewenherz May 15 '14 at 13:44
  • To expand on Leo Natan's comment: Is it possible that this happens: (1) `doSomethingWithCompletion:` creates `ExampleBlock`. (2) You start some asynchronous network operation. (3) `doSomethingWithCompletion:` returns, and `ExampleBlock` is released. (4) The asynchronous network operation finishes, and calls `ExampleBlock`. In this case, the pointer to the block would get dereferenced after it's been deallocated. (Perhaps this is intermittent based on whether the autorelease pool has drained.) In other words: do you guarantee that each block is retained until it's no longer needed? – Aaron Brager May 16 '14 at 03:00
  • In your crash logs, do you see anything resembling `"Semaphore/group object deallocated while in use"`? – CodaFi May 16 '14 at 19:36
  • @CodaFi -- no, unfortunately not – Dan Loewenherz May 16 '14 at 20:11
  • @AaronBrager I believe that may be possible... – Dan Loewenherz May 16 '14 at 20:12
  • Then are you keeping around a queue or semaphore as an iVar somewhere for this object or for an object in the frame? – CodaFi May 16 '14 at 20:24
  • @CodaFi Nope, not that I can see. No semaphores / queues are instantiated in this frame. Just this block. – Dan Loewenherz May 16 '14 at 20:44

7 Answers

43

Each frame of the stack trace should give you a clue as to what libDispatch is doing to cause the crash. Working our way from the bottom:

11 libsystem_pthread.dylib        0x000000018ffa16bc _pthread_wqthread + 356
10 libdispatch.dylib              0x000000018fe134fc _dispatch_worker_thread2 + 76

These two functions spin up a worker thread and run it. In the process, they also set up an autorelease pool for the thread.

9  libdispatch.dylib              0x000000018fe132b8 _dispatch_root_queue_drain + 556

This function signals the start of the queue destruction process. The thread-specific autorelease pool is drained, and in the process all variables referenced by that particular queue are released. Because this is libDispatch, that means the underlying mach object and the work block you submitted have got to go...

7  libsystem_blocks.dylib         0x000000018fe53908 _Block_release + 256
6  Example App                    0x00000001000fda18 __destroy_helper_block_253 (TSExampleApp.m)
5  libsystem_blocks.dylib         0x000000018fe53908 _Block_release + 256
4  Example App                    0x00000001000fe5a4 __destroy_helper_block_278 (TSExampleApp.m)

which is precisely what happens here. Number 7 is the outer block and because it contains a nontrivial object to destroy (yet another block), the compiler generated a destructor (__destroy_helper_block_253) to get rid of that inner block too. Applying the same line of logic, we can deduce that the inner block has yet another bit of nontrivial destroying to do.
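To make the nesting concrete, here is a minimal sketch of the shape that would produce two destroy helpers. It assumes ARC, and none of the names (`payload`, `inner`, `done`) come from the original project. Clang emits a dispose helper for any block that captures an object, another block, or a `__block` variable; the outer block's helper releases the inner block, whose own helper then releases whatever it captured:

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Created with 0, signaled once, waited on once: it ends at its initial value.
        dispatch_semaphore_t done = dispatch_semaphore_create(0);
        NSObject *payload = [NSObject new];

        // The inner block captures objects (payload, done), so the compiler
        // generates a dispose helper that releases them.
        void (^inner)(void) = ^{
            NSLog(@"inner sees %@", payload);
            dispatch_semaphore_signal(done);
        };

        // The outer block captures `inner`, so it gets its own dispose helper,
        // which calls _Block_release on the inner block.
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
            inner();
        });

        dispatch_semaphore_wait(done, DISPATCH_TIME_FOREVER);
    }
    return 0;
}
```

When libdispatch finishes with the outer block, the release cascades inward in the same order as frames 7 through 4 of the trace. The numeric suffix on each helper is chosen by the compiler and is not a line number.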

3  libdispatch.dylib              0x000000018fe0c10c -[OS_dispatch_object _xref_dispose] + 60
2  libdispatch.dylib              0x000000018fe0e928 _dispatch_dispose + 56
1  libdispatch.dylib              0x000000018fe0e928 _dispatch_dispose + 56

These lines are the root cause of all your troubles. For some reason, you've either captured the queue you're calling back on, or you've captured an object that holds a reference to a queue weakly such that when it goes the way of the dinosaur, it takes its queue with it. This causes libDispatch to assume the queue is done for, and it keeps on dealloc'ing until it reaches the semaphore-specific dispose:

0  libdispatch.dylib              0x000000018fe0eb2c _dispatch_semaphore_dispose + 60

With no semaphore to release, mach will complain enough not to return KERN_SUCCESS on semaphore destruction, which is a fatal error in libDispatch. In fact, it will abort() in such a case (well, technically __builtin_trap(), but they accomplish the same goal). Because there's no debugger attached, down goes your app.

So this raises the question: how do you fix this? Well, first you need to find what, if anything, is referencing a dispatch object. You mentioned that you were doing some asynchronous networking, so that would be the place to check first. If any of those objects happens to hold a queue or semaphore, or references an object that does, and you aren't capturing it strongly in any of those blocks, this is precisely what happens when the block passes out of scope along with the object.
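For anyone who wants to reproduce a trap in `_dispatch_semaphore_dispose` outside the app: one known trigger, which may or may not be what is happening in this trace, is dropping the last reference to a semaphore while its current value is lower than the value it was created with. libdispatch treats that as a semaphore "deallocated while in use" and traps. A minimal sketch, assuming ARC and using nothing from the original project:

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Created with an initial value of 1.
        dispatch_semaphore_t semaphore = dispatch_semaphore_create(1);

        // Decrements the value to 0 and returns immediately.
        dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER);

        // Under ARC this drops the last strong reference. The semaphore is
        // disposed while its value (0) is below its initial value (1), so
        // libdispatch deliberately crashes the process here.
        semaphore = nil;
    }
    return 0;
}
```

In the trace above the last reference would have been dropped by a block's destroy helper rather than by an explicit assignment, but the dispose path is the same. If this turns out to be the cause, the usual remedy is to balance every `dispatch_semaphore_wait` with a `dispatch_semaphore_signal` before the semaphore can be released, or to create the semaphore with an initial value of 0.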

CodaFi
  • Is it correct that your analysis says, that the crash occurs while deallocating the block `completion`? (IMHO, this is the case). – CouchDeveloper May 18 '14 at 09:56
  • Yes and no. The root block is a dispatch block, but it's possible that either the completion or the example block is the inner one capturing the queue or semaphore. You're right about us needing to see more code to provide a better diagnosis. – CodaFi May 18 '14 at 17:00
  • Great answer. Even though I can't extract a line number from this, your analysis seems to be the closest as I can get. Thanks! – Dan Loewenherz May 18 '14 at 20:18
  • No problem. It's always unfortunate when we can't provide more information without seeing potentially NDA'd or private source code, but I gave it my best given the information I had. – CodaFi May 18 '14 at 20:39
  • I still cannot understand from the stack traces above. So, what was the actual issue ? Not copying the block or strong pointers inside block. I have exact same scenario and your input would help me to get insight of the problem that I am facing. – Sandeep Sep 25 '15 at 16:44
  • @GeneratorOfOne Here, the issue is neither. By my analysis, the *queue* is being captured weakly in some way or another such that, when the object dies and takes the queue with it, it beats the destructor to the underlying mach object. – CodaFi Sep 25 '15 at 17:30
  • Is it some how possible to reproduce this bug myself. Perhaps writing some kind of code to have this crash would help me to dig into the problem. – Sandeep Sep 30 '15 at 10:48
  • Hi @CodaFi, how would we capture a `queue` weakly? It doesn't make sense to me. – allenlinli May 29 '18 at 10:36
2

There's not a lot to go on here, but my suspicion is that the block is never getting moved to the heap. Blocks are created on the stack by default. The compiler can often figure out when to move them to the heap, but the way you're handing this one from block to block probably never triggers that.

I would add a completionCopy = [completion copy] to force it onto the heap. Then work with completionCopy. See bbum's answer regarding storing blocks in dictionaries. With ARC, you don't need to call Block_copy() and Block_release() anymore, but I think you still want to call -copy here.
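A minimal sketch of that suggestion, assuming ARC. `TSExample` is a hypothetical stand-in for the class in TSExampleApp.m, and the `dispatch_async` call stands in for the asynchronous network request, which the question does not show:

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the class defined in TSExampleApp.m.
@interface TSExample : NSObject
- (void)doSomethingWithCompletion:(void (^)(void))completion;
@end

@implementation TSExample

- (void)doSomethingWithCompletion:(void (^)(void))completion {
    // Copy once, up front, so everything below works with a heap block.
    void (^completionCopy)(void) = [completion copy];

    void (^exampleBlock)(void) = [^{
        if (completionCopy) {
            completionCopy();
        }
    } copy];

    // Stand-in for the asynchronous network call.
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), exampleBlock);
}

@end

int main(void) {
    @autoreleasepool {
        dispatch_semaphore_t done = dispatch_semaphore_create(0);
        [[TSExample new] doSomethingWithCompletion:^{
            NSLog(@"completion called");
            dispatch_semaphore_signal(done);
        }];
        // Keep the process alive until the completion has run.
        dispatch_semaphore_wait(done, DISPATCH_TIME_FOREVER);
    }
    return 0;
}
```

Under ARC the explicit copies are often redundant, because a block captured by another block is copied to the heap along with it, but they are harmless and make the ownership obvious.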

Community
Rob Napier
  • Wouldn't that cause KERN_INVALID_ADDRESSes to crop up when the dtor tries to grab something out of frame? The block is alive, the things referenced by the block aren't. – CodaFi May 16 '14 at 19:42
  • The block `ExampleBlock` will be copied to the heap when it is captured by another block (unless it is already on the heap). Since there is some completion handler which - as the OP states - "calls this block" and if this completion handler is a Block itself, then we can be pretty sure that the block `ExampleBlock`is on the heap. – CouchDeveloper May 17 '14 at 13:33
1

Hypothesis:

  1. doSomethingWithCompletion: creates ExampleBlock.
  2. You start some asynchronous network operation.
  3. doSomethingWithCompletion: returns, and ExampleBlock is released.
  4. The asynchronous network operation finishes, and calls ExampleBlock.

In this case, the pointer to the block would get dereferenced after it's been deallocated. (Perhaps this is intermittent based on whether the autorelease pool has drained, or whether other nearby memory areas have been released.)

3 possible solutions:

1. Store the Block in a property

Store the block in a property:

@property (nonatomic, copy) returnType (^exampleBlock)(parameterTypes);

Then in code,

self.exampleBlock = …

One problem with this approach is that you can only ever have one exampleBlock.

2. Store the Block in an array

To work around this problem, you can store the blocks in a collection (like NSMutableArray):

@property (nonatomic, strong) NSMutableArray *blockArray;

then in code:

self.blockArray = [NSMutableArray array];

// Later on…
[self.blockArray addObject:exampleBlock];

You can remove the block from the array when it's OK to deallocate it.

3. Work around the storage problem by simply passing the block around

Instead of managing storing and destroying your blocks, refactor your code so exampleBlock is passed around between the various methods until your operation finishes.

Alternatively, you could use NSBlockOperation for the asynchronous code, and set its completionBlock for the response-finished code, and add it to an NSOperationQueue.
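A rough sketch of the NSBlockOperation variant, assuming ARC. The operation body is a placeholder for the real request, and the semaphore exists only to keep this command-line example alive until the completion block has run:

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        dispatch_semaphore_t finished = dispatch_semaphore_create(0);
        NSOperationQueue *queue = [[NSOperationQueue alloc] init];

        // Placeholder for the work the network call would do.
        NSBlockOperation *request = [NSBlockOperation blockOperationWithBlock:^{
            NSLog(@"performing request");
        }];

        // Runs after the operation finishes; the operation owns this block,
        // so there is nothing to store or release by hand.
        request.completionBlock = ^{
            NSLog(@"request finished");
            dispatch_semaphore_signal(finished);
        };

        [queue addOperation:request];
        dispatch_semaphore_wait(finished, DISPATCH_TIME_FOREVER);
    }
    return 0;
}
```

Note that an NSBlockOperation counts as finished as soon as its block returns, so a request that is itself asynchronous would need either to block inside the operation or to be wrapped in a custom asynchronous NSOperation subclass.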

Community
Aaron Brager
  • It's unlikely that your hypothesis holds true: if the block `ExampleBlock` is executed in another block (the completion block of the network request, and I suppose this is indeed the case), then `ExampleBlock` will be retained when the completion-block literal expression is evaluated, and released when the completion block has finished (unless ARC is disabled or a Block isn't an object, as prior to iOS 6). – CouchDeveloper May 17 '14 at 13:29
1

I think the completion block gets released during your async call, which might be causing the crash.

Anand
1

I suspect the issue is not in your code but elsewhere.

One possible issue is this:

If UIKit objects are captured in the Block completion, you can get a subtle bug when the block is executed on a non-main thread and the block holds the last strong reference to those UIKit objects:

When the Block completion finishes, its block gets deallocated, and along with it all imported variables get "destroyed", which for retainable pointers means they receive a release message. If this was the last strong reference, the captured object is deallocated on a non-main thread, and this can be fatal for UIKit objects.

Aaron Brager
CouchDeveloper
  • Wouldn't UIKit frames show up in the trace then? This looks like a libDispatch worker thread failing to spin down. UIKit doesn't forbid threading, it just keeps around so much lockless global state that threading makes access and modification exceptionally unstable. – CodaFi May 17 '14 at 19:21
  • @CodaFi _Wouldn't UIKit frames show up in the trace then?_ Possibly yes, but not necessarily due to some subtle side effects from race conditions. But in order to investigate the issue, we would need more code, and a way to debug it. – CouchDeveloper May 18 '14 at 09:41
0

I don't see anything wrong with the code posted and think that the bug is somewhere else.

Also, nesting blocks seems unnecessary here: it complicates memory management and probably makes the cause of the crash harder to find.

Why don't you begin by moving the code in your ExampleBlock directly to the completion block?

Rivera
0

What about this solution: if a block will be called later, outside the current scope, you ought to call copy on it to move it from the stack to the heap.

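- (void)doSomethingWithCompletion:(void (^)())completion {
    // The literal now declares the NSString * parameter its type promises;
    // `result` is a placeholder name, since the original argument was edited out.
    void (^ExampleBlock)(NSString *) = [^(NSString *result) {
        NSNotification *notification = [NSNotification notificationWithName:kExampleNotificationName object:nil userInfo:nil];
        [[NSNotificationCenter defaultCenter] postNotification:notification];

        if (completion) {
            completion();
        }
    } copy]; // copy moves the block from the stack to the heap

    // Async network call that calls ExampleBlock on either success or failure below...
}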
Nikolay Shubenkov