I'm writing numerical code in which it is useful to define vector operations. E.g., if x and y are n-long vectors of floats, it is nice to have x^y set the ith element of y to some arbitrary function of the ith element of x. One simple way of doing this is:
#include <vector>
#include <stdio.h>
#include <ctime>
using namespace std;

template <typename T>
void operator^(vector<T> A, vector<T> B){
    typename vector<T>::iterator a = A.begin();
    typename vector<T>::iterator b = B.begin();
    while(a!=A.end()){
        *b = 2*(*a);
        a++; b++;
    }
    //for (uint i=0; i<A.size(); i++)
        //B[i] = 2*A[i];
}

int main(int argc, char** argv){
    int n = 10000;
    int numRuns = 100000;
    vector<float> A;
    for (int i=0; i<n; i++)
        A.push_back((float) i);
    vector<float> B = vector<float>(n);

    clock_t t1 = clock();
    for (int i=0; i<numRuns; i++)
        for (int j=0; j<n; j++)
            B[j] = 2*A[j];
    clock_t t2 = clock();
    printf("Elapsed time is %f seconds\n", double(t2-t1)/CLOCKS_PER_SEC);

    t1 = clock();
    for (int i=0; i<numRuns; i++)
        B^A;
    t2 = clock();
    printf("Elapsed time is %f seconds\n", double(t2-t1)/CLOCKS_PER_SEC);
    return 0;
}
Now, when I compile this with -O3 and run it on my computer, the output is
Elapsed time is 0.370000 seconds
Elapsed time is 1.170000 seconds
If I instead use the commented-out lines in the template, the second time is ~1.8 seconds. My question is: how do I speed up the operator call? Ideally it should take the same amount of time as the hand-coded loop.
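One thing worth checking first: the operator above takes both vectors by value, so every call copies two 10000-element vectors and the result is written into a temporary copy that is discarded on return. Below is a minimal sketch of a by-reference variant, assuming that copying accounts for much of the extra time. The parameter names `src` and `dst` and the helper `doubled` are hypothetical and only for illustration, and the argument order follows the x^y convention from the first paragraph (read from the left operand, write into the right one).

```cpp
#include <cassert>
#include <vector>

// Hypothetical rewrite of the operator from the question.
// Both operands are taken by reference, so no vector is copied per call:
// `src` is only read (const reference), `dst` is modified in place.
template <typename T>
void operator^(const std::vector<T>& src, std::vector<T>& dst) {
    assert(src.size() == dst.size());  // assumption: sizes must already match
    typename std::vector<T>::const_iterator s = src.begin();
    typename std::vector<T>::iterator d = dst.begin();
    for (; s != src.end(); ++s, ++d)
        *d = 2 * (*s);  // same placeholder function as in the question
}

// Hypothetical helper showing a call site: returns a vector holding 2*x[i].
std::vector<float> doubled(const std::vector<float>& x) {
    std::vector<float> y(x.size());
    x ^ y;  // writes into y directly, no temporaries
    return y;
}
```

With this signature, the timing loop would call `A ^ B;` to write into `B`. Whether this fully closes the gap with the hand-coded loop would need to be re-measured.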