I am running a numerical experiment that requires many iterations. After each iteration, I would like to store the data in a pickle file (or a pickle-like file) in case the program times out or a data structure becomes tapped out. What is the best way to proceed? Here is the skeleton code:
import pickle

data_dict = {}  # maybe a dictionary is not the best choice
for j in parameters:  # j = (alpha, beta, gamma), cycled through in turn
    for k in range(number_of_experiments):  # lots of experiments (10^4)
        # the file is opened and closed on every iteration
        with open('storage.pkl', 'ab') as file:
            data = experiment()  # experiment returns some numerical value
                                 # experiment takes ~1 second, but longer
                                 # as the parameters scale
            data_dict.setdefault(j, []).append(data)
            pickle.dump(data_dict, file)
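For comparison, here is a minimal sketch of one possible variant, not a definitive answer. The skeleton above dumps the whole dictionary in append mode, so every iteration adds another full copy to the file. The variant below appends only the newest (parameters, value) record and rebuilds the dictionary when reading. The names run and load_results, and the random stand-in for experiment, are hypothetical and only there to make the example self-contained.

```python
import pickle
import random


def experiment():
    # Hypothetical stand-in for the real experiment.
    return random.random()


def run(parameters, number_of_experiments, path):
    # Append one small (params, result) record per iteration instead of
    # re-dumping the whole dictionary each time.
    for j in parameters:
        for _ in range(number_of_experiments):
            data = experiment()
            with open(path, "ab") as f:
                pickle.dump((j, data), f)


def load_results(path):
    # Rebuild the dictionary by reading records until the end of the file.
    results = {}
    with open(path, "rb") as f:
        while True:
            try:
                j, data = pickle.load(f)
            except (EOFError, pickle.UnpicklingError):
                # End of file, or a final record cut off by a crash.
                break
            results.setdefault(j, []).append(data)
    return results
```

With this layout the file can be read at any time to check progress, and an interrupted run should lose at most the record that was being written.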
Questions:
- Is shelve a better choice here? Or is there some other Python library that I am not aware of?
- I am using a dictionary because it's easier to code and more flexible if I need to change things as I do more experiments. Would it be a huge advantage to use a pre-allocated array?
- Does opening and closing files affect run time? I do this so that I can check on the progress in addition to the text logs I have set up.
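To make the shelve question concrete, this is roughly what I understand the shelve version would look like. It is only a sketch under assumptions: store_result and read_results are hypothetical helper names, and the parameter tuple is turned into a string with repr() because shelve keys must be strings.

```python
import shelve


def store_result(path, params, value):
    # shelve keys must be strings, so the parameter tuple is converted.
    key = repr(params)
    with shelve.open(path) as db:
        results = db.get(key, [])
        results.append(value)
        db[key] = results  # reassign so the updated list is written back


def read_results(path):
    # Open read-only and copy everything into a plain dictionary.
    with shelve.open(path, flag="r") as db:
        return dict(db)
```

Each call rewrites the full list stored under that key, so the cost of a single write grows with the number of results already stored for those parameters.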
Thank you for all your help!