
I've read a fair amount on thread-safety, and have been using GCD to keep the math-heavy code off the main thread for a while now (I learned about it before NSOperation, and it seems to still be the easier option). However, I wonder if I could improve part of my code that currently uses a lock.

I have an Objective-C++ class that is a wrapper for a C++ vector. (Reasons: primitive floats are added constantly with no known upper bound, the container must be contiguous, and I chose a vector over NSMutableData simply because it's what I settled on; NSMutableData would suffer from the same "expired" pointer problem when it resizes itself.)

The class has instance methods to add data points, which are processed and appended to the vector (vector.push_back). After new data is added, I need a different object to analyze it. That analysis happens on a background thread, and it uses a pointer directly into the vector. Currently the wrapper has a getter method that first locks the instance (it suspends a local serial queue used for the writes) and then returns the pointer. For those who don't know: when the vector runs out of space, push_back causes it to relocate in memory to make room for new entries, invalidating any pointer that was handed out. Upon completion, the math-heavy code calls unlock on the wrapper, and the wrapper resumes the queue so the pending writes finish.

I don't see a way to pass the pointer along (for an unknown length of time) without using some type of lock, or without making a local copy, which would be prohibitively expensive.

Basically: is there a better way to pass a primitive pointer to a vector (or NSMutableData, for those getting hung up on the vector) such that, while the pointer is in use, any additions to the vector are queued, and when the consumer of the pointer is done the vector automatically "unlocks" and processes the write queue?

Current Implementation

Classes:

  • DataArray: a wrapper for a C++ vector
  • DataProcessor: Takes the rawest data and cleans it up before sending it to the 'DataArray'
  • DataAnalyzer: Takes the 'DataArray' pointer and does analysis on the array
  • Worker: owns and initializes all three, and coordinates their actions (it does other things as well that are beyond the scope here). It is also a delegate of the processor and analyzer

What happens:

  1. Worker is listening for new data from another class that handles external devices
  2. When it receives an NSNotification with the data packet, it passes it on to DataProcessor via -(void)checkNewData:(NSArray*)data
  3. DataProcessor, working on a background thread, cleans up the data (keeping partial data) and then tells DataArray to -(void)addRawData:(float)data (shown below)
  4. DataArray then stores that data
  5. When DataProcessor is done with the current chunk it tells Worker
  6. When Worker is notified processing is done it tells DataAnalyzer to get started on the new data by -(void)analyzeAvailableData
  7. DataAnalyzer does some prep work, including asking DataArray for the pointer by - (float*)dataPointer (shown below)
  8. DataAnalyzer does a dispatch_async to a global thread and starts the heavy-lifting. It needs access to the dataPointer the entire time.
  9. When done, it does a dispatch_async to the main thread to tell DataArray to unlock the array.
  10. DataArray is accessed by other objects for read-only purposes as well, but those reads are super quick.

Code snips from DataArray

-(void)addRawData:(float)data {
    //quick sanity check
    dispatch_async(addDataQueue, ^{
        rawVector.push_back(data);
    });
}

- (float*)dataPointer {
    [self lock];
    return &rawVector[0];
}

- (void)lock {
    if (!locked) {
        locked = YES;
        dispatch_suspend(addDataQueue);
    }
}

- (void)unlock {
    if (locked) {
        dispatch_resume(addDataQueue);
        locked = NO;
    }
}

Code snip from DataAnalyzer

-(void)analyzeAvailableData {
    //do some prep work

    const float *rawArray = [self.dataArray dataPointer];
    dispatch_async(global_queue, ^{
        //lots of analysis

        //done
        dispatch_async(main_queue, ^{
            //tell `Worker` analysis is done

            [self.dataArray unlock];
        });
    });
}
Adam Jones
  • If I understand right, your problem is that the data stored in the vector becomes invalid when the vector reallocates its contents. So there are two possible solutions: store pointers to the data as the vector's contents, or use other containers that don't reallocate their contents, such as set or list. I don't know what you're trying to achieve, so I can't say which suits you better. – Cy-4AH May 02 '14 at 21:03
  • Close. The vector reallocation is handled by the library, and the contents of the vector are copied to the new location (as memory requirements grow). The ivar in the object remains valid, but any references to the vector that were passed along are now invalid. The same problem would occur if I used NSMutableData (or some other array structure); if it grows while a pointer is being passed around, the pointer is made invalid. Currently, I prevent the vector from growing while a pointer is being used. But I was curious if there was a better way than manually locking/unlocking. – Adam Jones May 02 '14 at 21:09
  • I should add that whatever array container is used, it must be mutable, as the data is always pouring in and there is no way to know when allocating how big it will get. Which is why -all- mutable containers will move themselves around in memory as they grow. – Adam Jones May 02 '14 at 21:11
  • Why can't you use a list instead of a vector to avoid the locking/unlocking? – Cy-4AH May 02 '14 at 21:35
  • Primarily I can't use a list because STL does not guarantee that a list is contiguous. As I work with the memory directly through a pointer using loops and Accelerate Framework functions, I must use a contiguous store. See: http://stackoverflow.com/a/2209564/589451 – Adam Jones May 02 '14 at 21:54
  • I know the difference between vector and list. I didn't know that contiguous storage was required. Too bad, because reallocating a bigger buffer when the current one runs out is a common mechanism. Try getting rid of the contiguity requirement; maybe you can process the data in portions, or use iterators instead of pointers. – Cy-4AH May 02 '14 at 22:13
  • There's no way. The performance hit from using the STL element accessors or iterators is just way too much (I tested it back when I decided on using a vector). Besides, list doesn't even have random element access, only front and back - which seems kinda silly considering one of the benefits of a list is you can insert randomly (which isn't something I need). – Adam Jones May 02 '14 at 22:20
  • Just wondering, but which programming language are you really using? C++ or Objective C? – Ulrich Eckhardt May 03 '14 at 10:41
  • Objective-C for everything else. Which is also part of the reason I wrapped the vector, I didn't want to have a bunch of .mm files (since Xcode doesn't like Obj-C++ as much) – Adam Jones May 03 '14 at 14:12
  • This is an issue that would probably go away with a final solution, but: "getter method that will first lock the instance (it suspends a local serial queue for the writes)". This is not safe. Suspending a queue only means it won't *start* a new task. It doesn't affect any task which is already running. – Ken Thomases May 03 '14 at 16:07
  • @KenThomases Agreed. Although, with the timing now, that should be unlikely. Unfortunately, I haven't yet figured out / seen a solution that seems to fit. See code snips I added above. – Adam Jones May 03 '14 at 16:23

3 Answers


It really depends on what you're doing; most of the time I drop back to the main queue (or a specifically designated queue) using dispatch_async() or dispatch_sync().

Async is obviously better, if you can do it.

It's going to depend on your specific use case but there are times when dispatch_async/dispatch_sync is multiple orders of magnitude faster than creating a lock.

The entire point of Grand Central Dispatch (and NSOperationQueue) is to take away many of the bottlenecks found in traditional threaded programming, including locks.

Regarding your comment about NSOperation being harder to use... that's true, I don't use it very often either. But it does have useful features, for example if you need to be able to terminate a task half way through execution or before it's even started executing, NSOperation is the way to go.

Abhi Beckert
  • Yeah, the ability to terminate a long task is one thing that's got me looking at `NSOperation` again, because when I go to do a reanalysis of an entire recording it can take minutes. And even though it's operating on a background thread, and the UI is responsive, with GCD I have no way to stop it. Unfortunately, your answer doesn't address the question of if there is a GCD (or NSOperation) approved way of sharing a volatile pointer without using a lock of some sort. – Adam Jones May 02 '14 at 22:45
  • The way I do it is use `dispatch_async()` to write to the pointer, and `dispatch_sync()` to read it. – Abhi Beckert May 02 '14 at 23:44
  • Note that `dispatch_sync()` will only create a lock when one is actually needed. Just use that, and let the kernel decide what the fastest approach is. GCD operations can be executed in many different ways, and the kernel will always pick whichever option is going to be fastest depending on what other tasks are executing at that specific nanosecond. – Abhi Beckert May 02 '14 at 23:51
  • I'm afraid I don't follow. Okay, let's say I set up a private `dispatch_queue` inside the wrapper, and the floats are added with blocks with 'dispatch_async(wrapperQueue)` so they are queued. I can't see how I would pass a pointer inside the same queue, async or sync. Because the wrapper object doesn't know about the object doing the analysis - and that analysis is being done on the default priority global thread. The analysis is called for by the object that owns both, and the analysis object then asks the wrapper for a pointer just before jumping to the background thread. – Adam Jones May 03 '14 at 00:02

There is a simple way to get what you need even without locking. The idea is that you have either shared, immutable data or exclusive, mutable data. The reason you don't need a lock for shared, immutable data is that it is simply read-only, so no race conditions can occur during writing.

All you need to do is to switch between both depending on what you currently need:

  • When you are adding samples to your storage, you need exclusive access to the data. If you already have a "working copy" of the data, you can just extend it as you need. If you only have a reference to the shared data, you create a working copy which you then keep for later exclusive access.
  • When you want to evaluate your samples, you need read-only access to the shared data. If you already have a shared copy, you just use that. If you only have an exclusive-access working copy, you convert that to a shared one.

Both of these operations are performed on demand. Assuming C++, you could use std::shared_ptr<vector const> for the shared, immutable data and std::unique_ptr<vector> for the exclusive-access, mutable data. For the older C++ standard those would be boost::shared_ptr<..> and std::auto_ptr<..> instead. Note the use of const in the shared version, and that you can convert easily from the exclusive version to the shared one, but not the inverse: to get a mutable vector from an immutable one, you have to copy.

Note that I'm assuming that copying the sample data is possible and doesn't explode the complexity of your algorithm. If that doesn't work, your approach with the scrap space that is used while the background operations are in progress is probably the best way to go. You can automate a few things using a dedicated structure that works similarly to a smart pointer, though.

Ulrich Eckhardt
  • If I understand your response fully, this is exactly what I'm doing right now. But a "lock" of some sort is still necessary to do it. While I have a pointer to the data (shared, immutable), the writes to the data (exclusive, mutable) are queued up to be stored later. And when I call "unlock" the write queue is emptied. I cannot (well, will not) copy the sample data because it is HUGE and is read just as frequently as it is written to. This would be the easiest way around it (just making a copy for the analyzer) but it makes no sense to be constantly copying this massive container. – Adam Jones May 03 '14 at 15:24
  • I should say, this application is in Objective-C, and I would like to keep any C++ code hidden inside the wrapper, because Xcode doesn't like working with C++ as much. – Adam Jones May 03 '14 at 15:37

If you have a shared resource (your vector) which will be concurrently accessed through reads and writes from different tasks, you may associate a dedicated dispatch queue with this resource, where these tasks will run exclusively.

That is, every access to this resource (read or write) will be executed on that dispatch queue exclusively. Let's name this queue "sync_queue".

This "sync_queue" may be a serial queue or a concurrent queue.

If it's a serial queue, it should be immediately obvious that all accesses are thread-safe.

If it's a concurrent queue, you can allow read accesses to happen simultaneously; that is, you simply call dispatch_async(sync_queue, block):

dispatch_async(sync_queue, ^{
    if (_shared_value == 0) {
        dispatch_async(otherQueue, block);
    }
});

If that read access "moves" the value to a call-site executing on a different execution context, you should use the synchronous version:

__block int x;
dispatch_sync(sync_queue, ^{
    x = _shared_value;
});
return x;

Any write access requires exclusive access to the resource. Having a concurrent queue, you accomplish this through using a barrier:

dispatch_barrier_async(sync_queue, ^{
    _shared_value = 0;
    dispatch_async(mainQueue, ^{
        NSLog(@"value %d", _shared_value);
    });
});
CouchDeveloper
  • This is what I'm asking, but I'm curious if it's possible to do -inside- the wrapper - obscured from the consumers instead of making every consumer know that it must interact with the wrapper by a queue. The problem is when the analyzer asks for the pointer from the wrapper, it is during a long background-thread analysis where it needs that pointer the entire time. This is why I lock it now, and call the unlock at the end of the analysis: I don't know how long the analysis will take, and I must prevent any writes while the pointer is in use. – Adam Jones May 03 '14 at 15:28
  • @Adam, you can use something like the first snippet. Put the code which uses the array pointer into a block. Pass that block into the wrapper. The block should take the pointer as a parameter; the wrapper should never return the pointer out of its interface. The wrapper would pass the pointer in to the block. It would do so on the queue, thus ensuring that no write operation happens concurrently with it. The block that the client passes in would have to do its work synchronously. If it initiates some asynchronous work and itself returns, then the wrapper will resume allowing writes. – Ken Thomases May 03 '14 at 16:16
  • @KenThomases I think I understand what you are saying, basically don't return the pointer, return a block. But I don't understand what I could "do" with that block - or how it would be "released". See the code snips that I added for the `DataAnalyzer`. – Adam Jones May 03 '14 at 16:21
  • No, don't *return* a block, *take* a block in. `DataArray` should not have a `-dataPointer` method. It should have a `- (void) operate:(void (^)(const float*))block` method. `DataAnalyzer` would call that and pass in a block containing the `// lots of analysis` code. The `-operate:` method would queue a block to its internal queue that calls the passed-in block and provides it with `&rawVector[0]`. – Ken Thomases May 03 '14 at 16:57
  • Ah! I get it. Okay, I have two questions on implementing this: what kind of queue and how to dispatch to it? Currently the analysis is async dispatched to a global concurrent. And the queue inside the `DataArray` is a serial. So would I still async dispatch the call to `-(void) operate:` inside `DataAnalyzer` using the global concurrent? And then `-(void) operate:` would queue that, but since the call was async dispatched it shouldn't lock up the main thread, right? – Adam Jones May 03 '14 at 17:45
  • I'm playing with implementing this, and realized that I left out some details that are important to passing this block. The block accesses and modifies primitive ivars (some are arrays) that are used as state between runs (just as a personal convention I only make ObjC types properties, and I keep all primitives as ivars). And, the call back to `Worker` is a `self.delegate` call. So I can see that I would have to use `weakSelf` for the delegate accessor call, but I wonder about those ivars. I guess I would use `__block`, but does that really point back to an ivar, or just to locally scoped? – Adam Jones May 03 '14 at 18:03
  • I'm gonna select this as the answer, but after some tinkering I see that this is going to take a lot more work. I left out a lot of details that I didn't think were important to the issue at hand (like the ivars, and another array that also is locked), and getting all this passed along properly in a block makes me think there is some fundamental changes I need to make, but I'm not yet experienced enough to know how to break these massive analysis methods into smaller chunks that would work being passed around. – Adam Jones May 03 '14 at 19:22
  • It doesn't much matter whether the `-operate:` method runs the passed-in block synchronously or asynchronously. That's a design choice about the API. The block that's passed in must do its work synchronously, though, since the pointer it receives may go invalid as soon as it returns. As CouchDeveloper said, you can use either a private concurrent queue with barrier tasks or a serial queue. You can't achieve synchronization with the global concurrent queues. By the way, when you reply to somebody's comment, you need to address them (i.e. @Adam) or they won't be notified. – Ken Thomases May 05 '14 at 02:09
  • @KenThomases I apologize for missing the @ on the last 3 messages. Your explanations of CouchDeveloper's answer have been very helpful. I'm used to procedural programming, so breaking things into objects and tic/toc OO methods is still something I'm working on. Case in point, the analysis blocks are several hundred lines long (with no obvious additional functions to break out) and require a ton of state to prevent duplicate analysis. On top of that, the "regular" analysis is suited to concurrent GCD queues, but the "full" analysis would benefit from `NSOperation`'s cancel ability. – Adam Jones May 05 '14 at 12:59
  • @KenThomases I wish there were more examples of GCD and NSOperation out there. Examples by Apple and others are all ridiculously simplistic: mostly waiting for web requests or performing a simple transform on an image. All their blocks are a few lines of code and never need any state. I know there has to be more math-heavy code out there, and MacResearch has been a great resource - but it hasn't been updated in years. I think this is fundamentally the problem I'm having: I'm using these tools blind, and my use-case is too complex to describe in a stackoverflow question. – Adam Jones May 05 '14 at 14:00
  • Well, the block could just be a single line that invokes a method with the data pointer. The long analysis code could be in that method. It doesn't seem like it would be much different than the case where you get the pointer "out" of the `DataArray`. (You could also just put it all into the block if that's the way the code is already organized. There's no rule against long blocks.) The state should be in the `DataAnalyzer`'s instance variables. Shouldn't be hard to track that. – Ken Thomases May 05 '14 at 15:18