I was doing a quick performance test on a block of code:

void ConvertToFloat( const std::vector< short >& audioBlock, 
                     std::vector< float >& out )
{
    const float rcpShortMax = 1.0f / (float)SHRT_MAX;
    out.resize( audioBlock.size() );
    for( size_t i = 0; i < audioBlock.size(); i++ )
    {
        out[i]  = (float)audioBlock[i] * rcpShortMax;
    }
}

I was happy with the speed-up over the original, very naive implementation: it takes just over 1 msec to process 65536 audio samples.

However, just for fun, I tried the following:

void ConvertToFloat( const std::vector< short >& audioBlock, 
                     std::vector< float >& out )
{
    const float rcpShortMax = 1.0f / (float)SHRT_MAX;
    out.reserve( audioBlock.size() );
    for( size_t i = 0; i < audioBlock.size(); i++ )
    {
        out.push_back( (float)audioBlock[i] * rcpShortMax );
    }
}

Now, I fully expected this to give exactly the same performance as the original code. However, the loop now takes 900usec (i.e. it's roughly 100usec faster than the other implementation).

Can anyone explain why this would give better performance? Does resize() initialize the newly allocated vector, whereas reserve() just allocates but does not construct? This is the only thing I can think of.

PS: this was tested on a single-core 2GHz AMD Turion 64 ML-37.

Shog9
Goz

4 Answers

Does resize initialize the newly allocated vector where reserve just allocates but does not construct?

Yes.
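
One way to check whether this accounts for the difference on a given machine is to time the sizing call separately from the conversion loop. The following is only a sketch, not the asker's actual benchmark: `Timing`, `TimedConvert` and the `useResize` flag are hypothetical names introduced here, and the numbers it reports will depend on the compiler, optimization flags and hardware.

```cpp
#include <chrono>
#include <climits>
#include <cstddef>
#include <ratio>
#include <vector>

// Hypothetical result type: microseconds spent sizing the output
// vector and microseconds spent in the conversion loop.
struct Timing
{
    double sizingUsec;
    double loopUsec;
};

// Converts audioBlock to floats using either resize() + out[i] or
// reserve() + push_back(), timing the sizing call and the loop separately.
Timing TimedConvert( const std::vector< short >& audioBlock,
                     std::vector< float >& out,
                     bool useResize )
{
    typedef std::chrono::steady_clock Clock;
    typedef std::chrono::duration< double, std::micro > Usec;

    const float rcpShortMax = 1.0f / (float)SHRT_MAX;
    const std::size_t n = audioBlock.size();
    std::vector< float >().swap( out );  // start empty, with no capacity

    const Clock::time_point t0 = Clock::now();
    if( useResize )
        out.resize( n );    // allocates and value-initializes n floats
    else
        out.reserve( n );   // allocates only; size() stays 0
    const Clock::time_point t1 = Clock::now();

    if( useResize )
    {
        for( std::size_t i = 0; i < n; i++ )
            out[i] = (float)audioBlock[i] * rcpShortMax;
    }
    else
    {
        for( std::size_t i = 0; i < n; i++ )
            out.push_back( (float)audioBlock[i] * rcpShortMax );
    }
    const Clock::time_point t2 = Clock::now();

    Timing timing;
    timing.sizingUsec = Usec( t1 - t0 ).count();
    timing.loopUsec   = Usec( t2 - t1 ).count();
    return timing;
}
```

Calling `TimedConvert( block, out, true )` and `TimedConvert( block, out, false )` several times and comparing `sizingUsec` between the two should show how much of the gap comes from the resize() call itself.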

sepp2k
  • SGI's STL reference explains that resize "inserts or erases elements at the end", while reserve just does the memory allocation. http://www.sgi.com/tech/stl/Vector.html – user7116 Sep 22 '09 at 17:05
  • It will use whatever the allocator is set to for the vector. – user7116 Sep 22 '09 at 17:09
  • If you benchmark after the resize/reserve call you can see if this is the reason. – Laserallan Sep 22 '09 at 17:18
  • @Eduardo - this works using the Allocator for the vector (which you usually don't see because the default one 'just works' for most applications). Allocators have an interface that includes a function for allocating raw memory (`allocate()`) and a function for constructing an object in-place in that raw memory (`construct()`), among other functions. `allocate()` might well be implemented by `malloc()`, but that's not a requirement. See Stephan T. Lavavej's article on "the Mallocator" for insight into how one might work: http://blogs.msdn.com/vcblog/archive/2008/08/28/the-mallocator.aspx – Michael Burr Sep 22 '09 at 17:19
  • DDJ also has a nice article by Matt Austern on Allocators: http://www.ddj.com/cpp/184403759 – Michael Burr Sep 22 '09 at 17:30
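
The allocate()/construct() split described in the comments above can be tried directly with `std::allocator` through `std::allocator_traits` (C++11). This is only a sketch of the two-step idea, not how any particular standard library implements `vector`; `Tracked`, `Counts` and `AllocateThenConstruct` are hypothetical names used for illustration.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical element type that tracks how many objects currently exist.
struct Tracked
{
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Live-object counts observed after each of the two steps.
struct Counts
{
    int afterAllocate;
    int afterConstruct;
};

// Allocates raw storage for n objects, then constructs them in place,
// recording how many objects exist after each step.
Counts AllocateThenConstruct( std::size_t n )
{
    typedef std::allocator_traits< std::allocator< Tracked > > Traits;
    std::allocator< Tracked > alloc;
    Counts counts;

    Tracked* p = Traits::allocate( alloc, n );  // raw memory, no objects yet
    counts.afterAllocate = Tracked::alive;

    for( std::size_t i = 0; i < n; i++ )
        Traits::construct( alloc, p + i );      // run the constructor in place
    counts.afterConstruct = Tracked::alive;

    for( std::size_t i = 0; i < n; i++ )
        Traits::destroy( alloc, p + i );        // run the destructor
    Traits::deallocate( alloc, p, n );          // release the raw memory
    return counts;
}
```

Loosely speaking, reserve() corresponds to the first step only, while resize() performs both.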

resize()

Modifies the container so that it has exactly n elements, inserting elements at the end or erasing elements from the end if necessary. If any elements are inserted, they are copies of t. If n > a.size(), this expression is equivalent to a.insert(a.end(), n - size(), t). If n < a.size(), it is equivalent to a.erase(a.begin() + n, a.end()).

reserve()

If n is less than or equal to capacity(), this call has no effect. Otherwise, it is a request for allocation of additional memory. If the request is successful, then capacity() is greater than or equal to n; otherwise, capacity() is unchanged. In either case, size() is unchanged.

Memory will be reallocated automatically if more than capacity() - size() elements are inserted into the vector. Reallocation does not change size(), nor does it change the values of any elements of the vector. It does, however, increase capacity().

Reserve causes a reallocation manually. The main reason for using reserve() is efficiency: if you know the capacity to which your vector must eventually grow, then it is usually more efficient to allocate that memory all at once rather than relying on the automatic reallocation scheme.
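
The difference between the two calls quoted above can be seen by inspecting size() and capacity() afterwards. A small sketch follows; `SizeInfo`, `AfterResize` and `AfterReserve` are hypothetical helper names, not part of any library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical snapshot of a vector's size() and capacity().
struct SizeInfo
{
    std::size_t size;
    std::size_t capacity;
};

// resize( n ) on an empty vector inserts n elements, so size() becomes n.
SizeInfo AfterResize( std::size_t n )
{
    std::vector< float > v;
    v.resize( n );
    SizeInfo info;
    info.size     = v.size();
    info.capacity = v.capacity();
    return info;
}

// reserve( n ) only requests storage: capacity() >= n, size() stays 0.
SizeInfo AfterReserve( std::size_t n )
{
    std::vector< float > v;
    v.reserve( n );
    SizeInfo info;
    info.size     = v.size();
    info.capacity = v.capacity();
    return info;
}
```

Because size() is still 0 after reserve(), elements have to be added with push_back() (or similar); writing to out[i] on a vector that has only been reserved is undefined behaviour.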

Gabriel L.
Satbir

The first version writes to out[i], which boils down to begin() + i (i.e. an addition). The second version uses push_back, which probably writes immediately to a known pointer equivalent to end() (i.e. no addition). You could probably make the first run as fast as the second by using iterators rather than integer indexing.

Edit: also to clarify some other comments: the vector contains floats, and constructing a float is effectively a no-op (the same way declaring "float f;" does not emit code, only tells the compiler to save room for a float on the stack). So I think that any performance difference between resize() and reserve() for a vector of floats is not to do with construction.
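
The iterator suggestion from the first paragraph above might look like the following. This is a sketch only: `ConvertToFloatIter` is a hypothetical name, and whether it is actually faster than indexing is an assumption that would need measuring, since an optimizing compiler may well generate the same code for both forms.

```cpp
#include <climits>
#include <vector>

// Same conversion as the resize() version, but walking both vectors
// with iterators instead of indexing with out[i].
void ConvertToFloatIter( const std::vector< short >& audioBlock,
                         std::vector< float >& out )
{
    const float rcpShortMax = 1.0f / (float)SHRT_MAX;
    out.resize( audioBlock.size() );

    std::vector< float >::iterator dst = out.begin();
    for( std::vector< short >::const_iterator src = audioBlock.begin();
         src != audioBlock.end(); ++src, ++dst )
    {
        *dst = (float)*src * rcpShortMax;
    }
}
```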

Moha the almighty camel
AshleysBrain
  • Sorry, but your construction point is untrue. "float f = 0.0f;" is obviously slower than just "float f;". The latter IS a nop; the former is not. – Goz Sep 22 '09 at 17:16
  • Oh, fair point, didn't know constructing a float assigned it 0. Vector assigns T() to each element when resizing, which is float(), which is 0. Still, using iterators instead of integer indexing might be faster. – AshleysBrain Sep 22 '09 at 17:28
  • "constructing a float assigned it 0" is not necessarily true. It gets constructed either way. Otherwise you could never use it. Whether it gets _value-initialised_ is the question here. As for the advice to use iterators, right track, but I would suggest pointers wherever possible; that way, you ensure that you don't get any overhead from the iterator class. That's only worth doing in really hot code, though; otherwise, iterators are usually easier. – underscore_d Jun 05 '20 at 09:37
out.resize( audioBlock.size() );

Since out's size (= 0) is less than audioBlock.size(), additional elements are created and appended to the end of out. The new elements are created by calling their default constructor; for float, that means each one is initialized to 0.

Reserve only allocates the memory.
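
To make the construction cost visible, one can count constructor calls on a small class type. The sketch below uses a hypothetical `Counted` type and helper functions invented for illustration. It counts copy constructions as well as default constructions, because pre-C++11 libraries implement resize( n ) by copying a single default-constructed value n times.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type that counts every construction.
struct Counted
{
    static int constructions;
    Counted()                 { ++constructions; }
    Counted( const Counted& ) { ++constructions; }
};
int Counted::constructions = 0;

// Number of Counted objects constructed by resize( n ) on an empty vector.
int ConstructionsAfterResize( std::size_t n )
{
    Counted::constructions = 0;
    std::vector< Counted > v;
    v.resize( n );
    return Counted::constructions;
}

// Number of Counted objects constructed by reserve( n ) on an empty vector.
int ConstructionsAfterReserve( std::size_t n )
{
    Counted::constructions = 0;
    std::vector< Counted > v;
    v.reserve( n );
    return Counted::constructions;
}
```

For n = 16, the resize() helper reports at least 16 constructions, while the reserve() helper reports none. A float has no constructor to call, but resize() still writes 0 to every new element, which is the extra pass over memory the question observed.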

Gabriel L.
aJ.