0

Comparing pushing data (1 million numbers) with and without prior resizing into std::vector. To my amusement , I found the later ( without resizing) to be faster , which is contrary to expectation. What happened ? I am using MS VC++ 2017 compiler.

double times = 1000000;

vector<double> vec1;
auto tp_start = chrono::high_resolution_clock::now();
for (double i=0; i < times; i++)
{
    vec1.push_back(i);
}
auto lapse = chrono::high_resolution_clock::now() - tp_start;
cout << chrono::duration_cast<chrono::milliseconds>(lapse).count() << " ms : push without prior resize  \n"; // 501 ms

vector<double> vec2;
vec2.resize(times);  // resizing
tp_start = chrono::high_resolution_clock::now();    
for (double i = 0; i < times; i++)
{     
       vec2[i]  = i; //fastest
      // vec2.push_back(i); //slower
}
lapse = chrono::high_resolution_clock::now() - tp_start;
cout << chrono::duration_cast<chrono::milliseconds>(lapse).count() << " ms : push with prior resizing  \n"; // 518 ms , shouldn't this be faster theoritically

Edited:

After this change: vec2.resize(times); it works faster

After this change: vec2.reserve(times); it works even faster

After this change: vec2[i] = i; becomes super fast

Any advise what is the best practice?

Edited 2 ( compiler in optimized mode)

10 million data :

 120ms : 41ms   reserve & pushback

 121ms : 35ms   resize & vec[i]

100 million data :

 1356ms : 427ms   reserve & pushback

 1345ms : 364ms   resize & vec[i]

vec[i] still wins

ark1974
  • 615
  • 5
  • 16
  • 2
    You want `.reserve()`, not `.resize()`. Your second vector ends up containing two million elements. Also; what are your compiler flags/options? Make sure you are testing an optimized build, not a debug build. – Jesper Juhl Apr 08 '18 at 05:34
  • As for the argument to the `resize` (or `reserve`) calls, it's the number of *elements*, not the size in bytes. If the size of `double` is `8` (which is common) then you will (with your current code) have *nine* million elements in the vector, of which the first eight million will all be zero. – Some programmer dude Apr 08 '18 at 05:37
  • You should also reserve `times` amount of items, not 4-8x as much. – Aki Suihkonen Apr 08 '18 at 05:38
  • With a corrected example, I get expected results: http://quick-bench.com/HLMeKeT0G67VgUYsMgyb1jU2vsE – Stephen Newell Apr 08 '18 at 05:39
  • Plz see question partially edited. – ark1974 Apr 08 '18 at 05:44
  • @ark1974 `vec2.resize(sizeof(double)* times); // resizing` is wrong, should be `resize(times)` – Mikhail Apr 08 '18 at 05:46
  • Possible duplicate: https://stackoverflow.com/questions/48535727/why-are-c-stl-vectors-1000x-slower-when-doing-many-reserves – StoryTeller - Unslander Monica Apr 08 '18 at 05:47
  • @Mikhail: Edited already – ark1974 Apr 08 '18 at 05:47
  • And there's also this https://stackoverflow.com/questions/48537812/why-does-stdvector-reserve-not-double-its-capacity-while-resize-does – StoryTeller - Unslander Monica Apr 08 '18 at 05:48
  • @ark1974 *I am using MS VC++ 2017 compiler.* -- I see no indication of whether you are testing an optimized version of your code. If you're running a "debug" or unoptimized version, your timing results are worthless. The vector class in Visual Studio has iterator checks in debug versions, and these checks slow the code down significantly. – PaulMcKenzie Apr 08 '18 at 06:09

1 Answers1

1

After this change: vec2.resize(times); it works faster

I think maybe that's because after the resize the vec2 actually holds 2*times space,so it does not need reallocating in later push_back.

After this change: vec2.reserve(times); it works even faster

this version does not need reallocating,so it's better than the first one.

After this change: vec2[i] = i; becomes super fast

this directly place the element on its position,without any redundant operations compared with push_back


origin answer

You should do like:

vector<double> vec2;
vec2.resize(sizeof(double)* times);  // this directly chage the size(),with capacity() also changed.
tp_start = chrono::high_resolution_clock::now();    
for (double i = 0; i < times; i++)
{
    vec2[i]=i;//not push_back
}  

If you still use push_back in this situation,it does not have any efficiency.

And in your example with resize,the size of vec2 is larger than vec1,which needs more time to reallocate its memory,and takes more time to do push_back.By the way it's better to do reserve(n) in your case.

choxsword
  • 3,187
  • 18
  • 44
  • I think you are right. Why it doesn't apply to push_back () in the resized vector. – ark1974 Apr 08 '18 at 05:38
  • 1
    @bigxiao `reserve` allocates memory and sets the capacity, but not the size. `resize` both allocates memory, sets the capacity *and* the size. The size must always be less than or equal to the capacity. – Some programmer dude Apr 08 '18 at 05:41
  • *"this chage the size(),not capacity()"* - So what if the size requested is greater than the capacity? – StoryTeller - Unslander Monica Apr 08 '18 at 05:44
  • @StoryTeller,@Some programmer dude ,Thanks for your reminding. – choxsword Apr 08 '18 at 05:46
  • @bigxiao: +1 for the vec2[i] = i; But reserve() don't work in this case. – ark1974 Apr 08 '18 at 05:54
  • @ark1974 `vec2[i]=i` needs `resize`,which will allocate memory,initialize the elements and thus take some extra time(not measured in your example). I think `reserve()` is better than your first version without `resize` and `reserve`. – choxsword Apr 08 '18 at 05:59