I am working on a machine with an NVIDIA GPU and CUDA 8, and I have a C++ application that should compute the L1 distance between two vectors represented as std::vector<double>.
Currently, my code is not parallel at all and only uses the CPU:
    #include <cstddef>
    #include <vector>

    // Returns the sum of |v1[i] - v2[i]|, or -1 if the sizes differ.
    double compute_l1_distance(const std::vector<double> &v1, const std::vector<double> &v2) {
        if (v1.size() != v2.size()) {
            return -1;
        }
        double result = 0;
        for (std::size_t i = 0; i < v1.size(); i++) {
            double val = v1[i] - v2[i];
            if (val < 0) {
                val = 0 - val;
            }
            result += val;
        }
        return result;
    }
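To make the shape of the problem concrete, the loop above is an element-wise |a - b| followed by a sum, which can be written on the CPU as a single std::inner_product call with custom operations. This is only a minimal sketch, not a tuned solution; the name compute_l1_distance_stl is hypothetical, and I am assuming a C++11 compiler for the lambda. As far as I know, Thrust (shipped with the CUDA toolkit) offers thrust::inner_product with the same argument layout, so this may be the form a GPU version would take, but I have not measured whether it would be faster.

```cpp
#include <cmath>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical name; same contract as compute_l1_distance above.
double compute_l1_distance_stl(const std::vector<double> &v1,
                               const std::vector<double> &v2) {
    if (v1.size() != v2.size()) {
        return -1;  // same error convention as the loop version
    }
    // The "sum" step is std::plus, the "product" step is |a - b|.
    return std::inner_product(v1.begin(), v1.end(), v2.begin(), 0.0,
                              std::plus<double>(),
                              [](double a, double b) { return std::fabs(a - b); });
}
```

For example, the vectors {1.0, -2.0, 3.5} and {4.0, 2.0, 3.0} have element-wise differences of 3, 4 and 0.5, so the L1 distance is 7.5.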
How can I improve the performance of this computation? How can I utilize the GPU? Are there recommended libraries that will do the job fast using the GPU or using any other optimization?