2

I am trying to write an implementation of std::vector to learn C++ and my implementation is slower than std::vector (see output).

I am wondering how I can improve it from any C++ experts. I saw this question (Why is std::vector so fast ( or is my implementation is too slow )) but his problem didn't help as the poster was using the wrong data structure.

I am asking how I can get it faster than std::vector.

vector.h

template <typename T>
class Vector {
public:
    explicit Vector(const int n);
    explicit Vector(const int n, const T& val);
    T& operator[](const int i);
    inline int const length();
    inline void fill(const T& val);
private:
    T* arr;
    int len;
};

vector.cpp

#include "vector.h"
#include <iostream>
#include <algorithm>

using namespace std;

template <typename T>
inline void Vector<T>::fill(const T& val)
{
    for (int i = 0; i < len; ++i) {
        arr[i] = val;
    }
}

template <typename T>
inline T& Vector<T>::sum()
{
    T total = 0;
    for (int i = 0; i < len; ++i) {
        total += arr[i];
    }
    return total;
}

template <typename T>
Vector<T>::Vector(const int n) : arr(new T[n]()), len(n)
{
    //cout << "Vector(n)" <<'\n';
}

template <typename T>
Vector<T>::Vector(const int n, const T& val) : arr(new T[n]), len(n)
{
    //cout << "Vector(n, val)" <<'\n';
    for (int i = 0; i < len; ++i) {
        arr[i] = val;
    }
}

template <typename T>
T& Vector<T>::operator[](const int i)
{
    return arr[i];
}

template <typename T>
int const Vector<T>::length()
{
    return len;
}

template class Vector<int>;
template class Vector<float>;

vector_test.cpp

#include "vector.h"
#include <iostream>
#include <chrono>
#include <vector>

using namespace std;

int main() 
{
    const int n = 2000000;
    float sum = 0;
    chrono::steady_clock::time_point start = chrono::steady_clock::now();   
    Vector<float> vec(n, 1);
    sum = vec.sum();
    chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
    cout << "my vec sum = " << sum << '\n';
    cout << "my vec impl took " << chrono::duration_cast<chrono::microseconds>(end - start).count()
              << "us.\n";

    sum = 0;
    start = chrono::steady_clock::now();
    vector<float> vec2(n, 1);
    for (int i = 0; i < n; ++i) {
        sum += vec2[i];
    }
    end = std::chrono::steady_clock::now();
    cout << "std::vec sum = " << sum << '\n';
    cout << "stl::vec impl took " << chrono::duration_cast<chrono::microseconds>(end - start).count()
              << "us.\n";
}

Output:

my vec sum = 2e+06
my vec impl took 11040us.
std::vec sum = 2e+06
stl::vec impl took 8034us.
Community
  • 1
  • 1
jimjampez
  • 1,766
  • 4
  • 18
  • 29
  • 15
    This question, with a little reworking (possibly, not sure), may be more suitable on [code-review.se] (http://codereview.stackexchange.com/) Although, good luck getting it *faster* than std::vector – Yann Sep 29 '14 at 13:19
  • Why ``T& operator[](const int i);`` is not ``inline``? – Sergey K. Sep 29 '14 at 13:20
  • 1
    There is no need to make a ctor with 2 params ``explicit``. – Sergey K. Sep 29 '14 at 13:20
  • Sergey, there's no need but it's good practice. What if someone makes the second variable default? (At least that's what we were tought :D ) – atlanteh Sep 29 '14 at 13:25
  • 1
    1) Did you compile with optimizations? 2) What happens when you reverse the tests (first `std::vector`, then your `vector`)? 3) What happens when you put all the function definitions in the header file? – dyp Sep 29 '14 at 13:26
  • @atlanteh: If someone changes a function-signature, they will have to decide anew for those. Anyway, they need to check for conflicts. (Adding superfluous fluff is just clutter, which slows down reading by overwhelming the important things with chit-chat.) – Deduplicator Sep 29 '14 at 13:35
  • Why `sum()` returns a reference *at all* is a mystery. And unless you can think of a valid reason to have *negative* magnitudes in all of this, `std::size_t` should be the type-du-jour for sizes, indexes, etc. – WhozCraig Sep 29 '14 at 13:35
  • Try timing the constructor, and compare that. Also, if you use a for look for `vector` I suggest you do the same for `Vector`. I suspect using memset would be faster than creating and then using a for loop. I don't know.. – AlexanderBrevig Sep 29 '14 at 13:44
  • This isn't really an implementation of `vector`. As it stands right now it's an implementation of an array with fixed size after creation, and nearly all your code needs to be rewritten pretty much from the ground up to be much like `vector`. To resemble vector, you need to decouple memory allocation from object construction, which means `operator new` + placement new, *not* `new []`. – Jerry Coffin Sep 29 '14 at 14:02
  • I closed this as a duplicate, as even if the question is different, the answer is the same -- use the code review stack exchange is the right answer for this kind of question. And that was an answer to the duplicate question. – Yakk - Adam Nevraumont Sep 29 '14 at 14:03

1 Answers1

3

This is quite naive code since on every iteration the index is reevaluated (and you hope the optimizer will optimize it away):

for (int i = 0; i < len; ++i) {
    arr[i] = val;
}

Here is a somewhat better way:

T* ptr = arr;
T* end = ptr + len;
while ( ptr < end ) *ptr++ = val;

However, a good compiler will indeed do this transformation.

The same idea can be applied to Sum():

template <typename T> inline T Vector<T>::sum()
{
    T* ptr = arr;
    T* end = ptr + len;
    T total = 0;

    while ( ptr < end ) total += *ptr++;

    return total;
}
Sergey K.
  • 24,894
  • 13
  • 106
  • 174