TL;DR: Does the __block attribute on a std::vector prevent RVO in Objective-C++?

In modern C++, the canonical way to return a vector from a function is to return it by value, so that return value optimization (RVO) can be applied where possible. In Objective-C++, this appears to work the same way.

- (void)fetchPeople {
  std::vector<Person> people = [self readPeopleFromDatabase];
}

- (std::vector<Person>)readPeopleFromDatabase {
  std::vector<Person> people;

  people.emplace_back(...);
  people.emplace_back(...);

  // No copy is made here.
  return people;
}

However, if the __block attribute is applied to the vector, a copy of the vector appears to be made when it is returned. Here is a slightly contrived example:

- (std::vector<Person>)readPeopleFromDatabase {
  // __block is needed to allow the vector to be modified.
  __block std::vector<Person> people;

  void (^block)() = ^ {
    people.emplace_back(...);
    people.emplace_back(...);
  };


  block();

  #if 1

  // This appears to require a copy.
  return people;

  #else

  // This does not require a copy.
  return std::move(people);

  #endif
}

There are plenty of Stack Overflow questions that explicitly state that you shouldn't use std::move when returning a vector, because doing so prevents copy elision from taking place.

However, this Stack Overflow question states that there are indeed cases where copy elision is not possible and std::move has to be written explicitly.
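To make the two claims concrete, here is a minimal sketch in plain C++ (no blocks involved). `Probe`, `Holder` and the two counters are hypothetical names introduced only for illustration; they are not part of the code in this question. The type counts its copy and move constructions so each return style can be compared.

```cpp
#include <utility>

// Hypothetical counters and instrumented type, for illustration only.
int g_probe_copies = 0;
int g_probe_moves = 0;

struct Probe {
    Probe() = default;
    Probe(const Probe&) { ++g_probe_copies; }
    Probe(Probe&&) noexcept { ++g_probe_moves; }
};

// Returning a local by name: eligible for copy elision, and treated as an
// rvalue if the compiler does not elide. It is never copied.
Probe return_by_name() {
    Probe p;
    return p;
}

// Wrapping the local in std::move: the operand is no longer a plain
// variable name, so elision does not apply and the move constructor runs.
Probe return_with_move() {
    Probe p;
    return std::move(p);
}

struct Holder {
    Probe member;

    // A data member is not a local variable, so there is no elision and no
    // implicit move here: the member is copied.
    Probe get_copy() { return member; }

    // Here the explicit std::move is what turns the copy into a move.
    Probe take() { return std::move(member); }
};
```

Calling `return_by_name()` leaves the copy counter at zero, `return_with_move()` adds one move, `Holder::get_copy()` adds one copy, and `Holder::take()` adds one move instead. Compilers may warn that the std::move in `return_with_move()` is pessimizing, which is the advice the first group of answers is giving.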

Is the use of __block in Objective-C++ one of those times when copy elision is not possible and std::move should be used instead? My profiling appears to confirm that, but I'd love a more authoritative explanation.

(This is on Xcode 10 with C++17 support.)

kennyc

1 Answer

I don't know about authoritative, but a __block variable is specifically designed to be able to outlive the scope it's in and contains special runtime state that tracks whether it's stack- or heap-backed. For example:

#include <iostream>
#include <dispatch/dispatch.h>

using std::cerr; using std::endl;
struct destruct_logger
{
    destruct_logger()
    {}
    destruct_logger(const destruct_logger& rhs)
    {
        cerr << "destruct_logger copy constructor: " << &rhs << " --> " << this << endl;
    }
    void dummy() {}
    ~destruct_logger()
    {
        cerr << "~destruct_logger on " << this << endl;
    }
};

void my_function()
{
    __block destruct_logger logger;

    cerr << "Calling dispatch_after, &logger = " << &logger << endl;
    dispatch_after(
        dispatch_time(DISPATCH_TIME_NOW, (int64_t)(1 * NSEC_PER_SEC)), dispatch_get_main_queue(),
        ^{
            cerr << "Block firing\n";
            logger.dummy();
        });
    cerr << "dispatch_after returned: &logger = " << &logger << endl;
}

int main(int argc, const char * argv[])
{
    my_function();
    cerr << "my_function() returned\n";
    dispatch_main();
    return 0;
}

If I run that code, I get the following output:

Calling dispatch_after, &logger = 0x7fff5fbff718
destruct_logger copy constructor: 0x7fff5fbff718 --> 0x100504700
dispatch_after returned: &logger = 0x100504700
~destruct_logger on 0x7fff5fbff718
my_function() returned
Block firing
~destruct_logger on 0x100504700

There's quite a lot happening here:

  • Before we call dispatch_after, logger is still stack-based. (0x7fff… address)
  • dispatch_after internally performs a Block_copy() of the block which captures logger. This means the logger variable must now be moved to the heap. As it's a C++ object, this means the copy constructor is invoked.
  • And indeed, after dispatch_after returns, &logger now evaluates to the new (heap) address.
  • The original stack instance of course must be destroyed.
  • The heap instance is only destroyed once the capturing block has been destroyed.

So a __block "variable" is actually a much more complex object that can move around in memory on demand behind the scenes.

If you were to subsequently return logger from my_function, RVO wouldn't be possible, because (a) it now lives on the heap, not the stack, and (b) not making a copy on returning would allow mutation of the instance captured by blocks.
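As a rough way to picture this in plain C++, here is a sketch that models a __block variable as a box whose contents are always reached through a forwarding pointer. `Tracked`, `BlockBox` and the counters are hypothetical names, and the layout is a deliberately simplified assumption loosely based on how the blocks runtime is described, not the real compiler-generated structure. The point is only that the returned expression is a member reached through a pointer rather than the name of a local, so the compiler has nothing to elide or move implicitly.

```cpp
#include <utility>

// Hypothetical counters and instrumented type, for illustration only.
int g_copies = 0;
int g_moves = 0;

struct Tracked {
    Tracked() = default;
    Tracked(const Tracked&) { ++g_copies; }
    Tracked(Tracked&&) noexcept { ++g_moves; }
};

// Simplified, assumed model of __block storage: every access goes through
// `forwarding`, which points at the stack box at first and would be
// redirected to a heap copy once a capturing block is copied.
struct BlockBox {
    BlockBox* forwarding;
    Tracked value;
    BlockBox() : forwarding(this) {}
};

// Roughly what `return logger;` amounts to in this model: the operand is an
// lvalue member access, so the return value is copy-constructed.
Tracked return_block_var() {
    BlockBox box;
    return box.forwarding->value;
}

// With an explicit std::move, the return value is move-constructed instead.
Tracked return_block_var_moved() {
    BlockBox box;
    return std::move(box.forwarding->value);
}
```

In this model, `return_block_var()` costs one copy and no moves, while `return_block_var_moved()` costs one move and no copies, which matches what the question observed for the std::vector case.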

I guess it might be possible to make this depend on runtime state: use RVO memory for stack-backing, then if it gets moved to the heap, copy back into the return value when the function returns. But this would complicate functions that operate on blocks, as the backing state would now need to be stored separately from the variable. It also seems like overly complex and surprising behaviour, so I'm not surprised that RVO doesn't happen for __block variables.

pmdj
  • Thanks for the analysis and great contribution. I was reading elsewhere that if RVO is not possible, then a compiler can fall back and perform a move instead. Manually specifying a move works fine, but perhaps this is just one of those cases where it can't easily be done by the compiler because of the semantics you've outlined. Thankfully, it's an easy fix to get the performance back. – kennyc Aug 02 '18 at 19:34