My Python program was too slow. So, I profiled it and found that most of the time was being spent in a function that computes distance between two points (a point is a list of 3 Python floats):
def get_dist(pt0, pt1):
val = 0
for i in range(3):
val += (pt0[i] - pt1[i]) ** 2
val = math.sqrt(val)
return val
To analyze why this function was so slow, I wrote two test programs: one in Python and one in C++ that do similar computation. They compute the distance between 1 million pairs of points. (The test code in Python and C++ is below.)
The Python computation takes 2 seconds, while C++ takes 0.02 seconds. A 100x difference!
Why is Python code so much slower than C++ code for such simple math computations? How do I speed it up to match the C++ performance?
The Python code used for testing:
import math, random, time
num = 1000000
# Generate random points and numbers
pt_list = []
rand_list = []
for i in range(num):
pt = []
for j in range(3):
pt.append(random.random())
pt_list.append(pt)
rand_list.append(random.randint(0, num - 1))
# Compute
beg_time = time.clock()
dist = 0
for i in range(num):
pt0 = pt_list[i]
ri = rand_list[i]
pt1 = pt_list[ri]
val = 0
for j in range(3):
val += (pt0[j] - pt1[j]) ** 2
val = math.sqrt(val)
dist += val
end_time = time.clock()
elap_time = (end_time - beg_time)
print elap_time
print dist
The C++ code used for testing:
#include <cstdlib>
#include <iostream>
#include <ctime>
#include <cmath>
struct Point
{
double v[3];
};
int num = 1000000;
int main()
{
// Allocate memory
Point** pt_list = new Point*[num];
int* rand_list = new int[num];
// Generate random points and numbers
for ( int i = 0; i < num; ++i )
{
Point* pt = new Point;
for ( int j = 0; j < 3; ++j )
{
const double r = (double) rand() / (double) RAND_MAX;
pt->v[j] = r;
}
pt_list[i] = pt;
rand_list[i] = rand() % num;
}
// Compute
clock_t beg_time = clock();
double dist = 0;
for ( int i = 0; i < num; ++i )
{
const Point* pt0 = pt_list[i];
int r = rand_list[i];
const Point* pt1 = pt_list[r];
double val = 0;
for ( int j = 0; j < 3; ++j )
{
const double d = pt0->v[j] - pt1->v[j];
val += ( d * d );
}
val = sqrt(val);
dist += val;
}
clock_t end_time = clock();
double sec_time = (end_time - beg_time) / (double) CLOCKS_PER_SEC;
std::cout << sec_time << std::endl;
std::cout << dist << std::endl;
return 0;
}