Here's the scenario: I have to run a clustering algorithm over 10,000 data points. I have precomputed the distances between the data points and stored them in a file. Since Python is slow at I/O-intensive tasks, I am writing the clustering algorithm in C++. The main issue is that the clustering algorithm will run several times, and I have to switch back and forth between the Python code and the C++ code. The workflow looks something like this:
1. Read the distances from the text file (C++)
2. Run the clustering algorithm (C++)
3. Use the result of the algorithm in the main Python code
4. Run the clustering algorithm again (C++)
I don't want to read the distance file again on every run: it has over 500 million entries, and reading it already takes around 17 seconds. What I have in mind is something like pausing the C++ program once the distances are loaded and resuming it whenever the Python code needs another clustering run. How could this be achieved?
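
To make the idea concrete, here is a rough sketch of the kind of setup I am imagining, not a working solution: the Python side starts the C++ program once as a child process, keeps it alive, and sends it one request per line over stdin, reading one result per line from stdout. Everything named here is hypothetical. `ClusteringWorker`, `WORKER_SOURCE`, and the line-based protocol are made up for illustration, and the worker is a tiny Python stand-in so the sketch runs on its own; the real C++ binary would load the distance file once at startup and then loop over requests in the same way.

```python
import subprocess
import sys

# Hypothetical stand-in for the compiled C++ program. It "loads" its data once,
# then answers one request per input line until stdin is closed.
WORKER_SOURCE = r"""
import sys
distances = [1.0, 2.0, 3.0]      # placeholder for the expensive one-time load
for line in sys.stdin:           # one request per line
    k = int(line)
    # placeholder for the clustering result; flush so the parent sees it at once
    print(sum(distances) * k, flush=True)
"""


class ClusteringWorker:
    """Keeps one child process alive across many requests (assumed protocol)."""

    def __init__(self, command):
        self.proc = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered pipes on the Python side
        )

    def run(self, request):
        # Send one request line, then block until one result line comes back.
        self.proc.stdin.write(f"{request}\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def close(self):
        # Closing stdin ends the worker's read loop, so it exits cleanly.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


def demo():
    # With a real binary this would be something like ["./cluster"] (hypothetical).
    worker = ClusteringWorker([sys.executable, "-c", WORKER_SOURCE])
    try:
        first = worker.run(2)   # first clustering run
        second = worker.run(3)  # second run reuses the already loaded data
    finally:
        worker.close()
    return first, second


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that the expensive load happens once per process rather than once per run; I am not sure whether pipes like this are the right mechanism compared with, say, a Python extension module or shared memory.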