2

My Python code needs to save the state of one object on exit, so that the run can be resumed later using this object. I am doing this with pickle. Below is the pickle part of my code:

#At the beginning of the file
pickle_file = '.state_pickle.obj'
#if pickle file is present load object, else create new
if os.path.exists(pickle_file):
    with open(pickle_file, 'rb') as fh:  # pickle data is binary
        state = pickle.load(fh)
else:
    state = NewState() #NewState is a class
....
....
#At the end of the file
with open(pickle_file, 'wb') as fh:  # pickle data is binary
    pickle.dump(state, fh)

# When -stop option is given, then the session is stopped
# The pickle file is deleted at that time
if args.stop:
    if os.path.exists(pickle_file):
        os.remove(pickle_file)
    ...

This works fine for me. However, my problem occurs when multiple sessions are opened from the same directory. The pickle file ('.state_pickle.obj') gets overwritten, causing erroneous results. Is there a way to save the object to a unique filename so that the file can be read when the session is resumed? Also, I need to get the state object even before parsing the args, so I cannot pass the filename through args. Is there any other clean solution for this problem?

Alexander
allDoubts
  • This might not be the cause for the issue, but you should close the file. Consider using the `with` statement: `with open(pickle_file, 'w') as fh: ...` – j-i-l Jul 29 '15 at 14:08
  • Thanks jojo. I will use the with statement to make it more robust. However, the problem here is that the same file is used for saving different objects (of different sessions). This is creating an error. How can I name the file uniquely so that the file names of different sessions do not clash? Yet, when the session is resumed, the program should know to look into the correct file. – allDoubts Jul 29 '15 at 14:11
  • I'm not sure what the best solution for this is, but you could add a Unix timestamp to the file's name or put a lock on the file, see [here](https://github.com/ilastik/lazyflow/blob/master/lazyflow/utility/fileLock.py) for further details. – j-i-l Jul 29 '15 at 14:18
  • Thanks a lot! I can keep a timestamp in the filename, but when the session is resumed, how will the program know to look for that file, since it would not have any idea about the timestamp of the file? Also, if I lock a file, I will not be able to allow multiple sessions, which is not an option for me. – allDoubts Jul 29 '15 at 14:47
  • @allDoubts see my answer – Geeocode Jul 29 '15 at 14:52

2 Answers

1

You could define an id attribute for your NewState class and then use an id-creating function along these lines:

import glob
import os

def get_new_id(source_folder):
    new_id = 0
    # glob needs a pattern; the bare folder would only match the folder itself
    for a_file in glob.glob(os.path.join(source_folder, '*.obj')):
        # assuming a pickle file looks like 'some/path/13.obj'
        # with 'some/path' == source_folder and 13 some id of a state.
        an_id = int(os.path.splitext(os.path.basename(a_file))[0])
        if an_id >= new_id:
            new_id = an_id + 1
    return new_id

You will then have to know the id of the State you want to resume or just take the last state:

# get the last state:
new_id = get_new_id(path_to_pickles)
last_id = new_id - 1  # highest id saved so far; -1 if there is none
pickle_file = os.path.join(path_to_pickles, '%s.obj' % last_id)
if os.path.exists(pickle_file):
    with open(pickle_file, 'rb') as fh:  # pickle data is binary
        state = pickle.load(fh)
else:
    state = NewState() #NewState is a class
    state.id = new_id

Then at the end:

pickle_file = os.path.join(path_to_pickles, '%s.obj' % str(state.id))
with open(pickle_file, 'wb') as fh:  # pickle data is binary
    pickle.dump(state, fh)

# When -stop option is given, then the session is stopped
# The pickle file is deleted at that time
pickle_file = os.path.join(path_to_pickles, '%s.obj' % str(state.id))
if args.stop:
    if os.path.exists(pickle_file):
        os.remove(pickle_file)
    ...

PS.

  • I'd probably define the pickle_file as another attribute of the NewState class.
  • Ideally you would handle the various processes using multiprocessing. See here for an example.
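
The first point above could be sketched roughly as follows. This is only an illustration: the constructor arguments and the `save`, `load` and `delete` helpers are hypothetical names that do not appear in the code above, and `path_to_pickles` is assumed to be an existing directory.

```python
import os
import pickle


class NewState(object):
    """Session state that remembers its own id and pickle file (a sketch)."""

    def __init__(self, state_id, path_to_pickles):
        self.id = state_id
        # The file name is derived from the id, e.g. 'some/path/13.obj'.
        self.pickle_file = os.path.join(path_to_pickles, '%s.obj' % state_id)

    def save(self):
        # Pickle data is binary, so the file is opened in 'wb' mode.
        with open(self.pickle_file, 'wb') as fh:
            pickle.dump(self, fh)

    @staticmethod
    def load(pickle_file):
        # Hypothetical helper: read a previously saved state back in.
        with open(pickle_file, 'rb') as fh:
            return pickle.load(fh)

    def delete(self):
        # Mirrors the -stop handling: remove the file if it is there.
        if os.path.exists(self.pickle_file):
            os.remove(self.pickle_file)
```

With this, the end of the script would reduce to `state.save()`, and the `-stop` branch to `state.delete()`, without rebuilding the file name each time.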
Community
j-i-l
0

Save each pickle file under an ascending numbered filename. Then, when you start a module, lock the file with the smallest number, since that one belongs to the first process you have to continue. Don't release the lock while the process is running. That way another process cannot access this file and will move on to the next one.
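
One way this idea might look in code is sketched below. It does not use an OS-level file lock. Instead it swaps in a simpler technique: a sibling `.lock` file created with `os.O_CREAT | os.O_EXCL`, which fails if the file already exists, so only one session can claim a given state. The function names `claim_lowest_state` and `release_state` are hypothetical, and the pickle files are assumed to be named `<number>.obj`.

```python
import glob
import os


def claim_lowest_state(path_to_pickles):
    """Return the lowest-numbered unclaimed '.obj' file, or None.

    A file counts as claimed when a sibling '<name>.lock' file exists.
    Assumes every pickle file is named '<number>.obj'.
    """
    candidates = glob.glob(os.path.join(path_to_pickles, '*.obj'))
    # Sort numerically so that '2.obj' comes before '10.obj'.
    candidates.sort(key=lambda p: int(os.path.splitext(os.path.basename(p))[0]))
    for pickle_file in candidates:
        try:
            # O_CREAT | O_EXCL fails if the lock file already exists,
            # so only one session can create it.
            fd = os.open(pickle_file + '.lock',
                         os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except OSError:
            continue  # claimed by another session (or not creatable); try the next
        os.close(fd)
        return pickle_file
    return None


def release_state(pickle_file):
    """Remove the lock file so the state can be claimed again later."""
    os.remove(pickle_file + '.lock')
```

A session would call `claim_lowest_state(path_to_pickles)` at startup and `release_state(...)` on exit. Note that a session that crashes leaves its `.lock` file behind, so some cleanup of stale locks would still be needed.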

Geeocode