12

I'm struggling to append to a list stored in a pickled file. This is the code:

#saving high scores to a pickled file

import pickle

first_name = input("Please enter your name:")
score = input("Please enter your score:")

scores = []
high_scores = first_name, score
scores.append(high_scores)

file = open("high_scores.dat", "ab")
pickle.dump(scores, file)
file.close()

file = open("high_scores.dat", "rb")
scores = pickle.load(file)
print(scores)
file.close()

The first time I run the code, it prints the name and score.

The second time I run the code, it prints the 2 names and 2 scores.

The third time I run the code, it prints the first name and score, but it overwrites the second name and score with the third name and score I entered. I just want it to keep adding the names and scores. I don't understand why it is saving the first name and overwriting the second one.

Alexander O'Mara
Charlie

5 Answers

13

If you want to write to and read from the pickled file, you can call dump once for each entry in your list. Each dump appends one score to the pickled file, and each load reads the next score.

>>> import pickle as dill
>>> 
>>> scores = [('joe', 1), ('bill', 2), ('betty', 100)]
>>> nscores = len(scores)
>>> 
>>> with open('high.pkl', 'ab') as f:
...   _ = [dill.dump(score, f) for score in scores]
... 
>>> 
>>> with open('high.pkl', 'ab') as f:
...   dill.dump(('mary', 1000), f)
... 
>>> # we added a score on the fly, so load nscores+1
>>> with open('high.pkl', 'rb') as f:
...     _scores = [dill.load(f) for i in range(nscores + 1)]
... 
>>> _scores
[('joe', 1), ('bill', 2), ('betty', 100), ('mary', 1000)]
>>>
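
If the number of stored entries is not known in advance, one option is to keep calling load until the file is exhausted. The following is a minimal sketch, assuming each entry was dumped separately as above; the helper name `load_all` and the temporary file path are hypothetical and only serve the illustration.

```python
import os
import pickle
import tempfile


def load_all(path):
    """Return every object pickled into `path`, in the order it was dumped."""
    records = []
    with open(path, "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                # pickle.load raises EOFError once no data is left to read
                break
    return records


# Throwaway file so the sketch does not touch real data (hypothetical path).
path = os.path.join(tempfile.mkdtemp(), "high.pkl")

for entry in [("joe", 1), ("bill", 2), ("betty", 100)]:
    with open(path, "ab") as f:  # append one pickle per entry
        pickle.dump(entry, f)

all_scores = load_all(path)
print(all_scores)
```

With this approach there is no need to track `nscores` separately, at the cost of relying on the exception to detect the end of the file.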

The most likely reason your code was failing is that you are replacing the original scores list with the unpickled list of scores. Any scores added in memory after the dump are then lost.

>>> scores
[('joe', 1), ('bill', 2), ('betty', 100)]
>>> f = open('high.pkl', 'wb')
>>> dill.dump(scores, f)
>>> f.close()
>>> 
>>> scores.append(('mary',1000))
>>> scores
[('joe', 1), ('bill', 2), ('betty', 100), ('mary', 1000)]
>>> 
>>> f = open('high.pkl', 'rb')
>>> _scores = dill.load(f)
>>> f.close()
>>> _scores
[('joe', 1), ('bill', 2), ('betty', 100)]
>>> # blow away the old scores list, by pointing to _scores
>>> scores = _scores
>>> scores
[('joe', 1), ('bill', 2), ('betty', 100)]

So it's more of a Python name-reference issue for scores than a pickle issue. Pickle just instantiates a new list, which you then bind to the name scores (in your case), and the object scores pointed to before that is garbage collected once nothing else references it.

>>> scores = 1
>>> f = open('high.pkl', 'rb')
>>> scores = dill.load(f)
>>> f.close()
>>> scores
[('joe', 1), ('bill', 2), ('betty', 100)]
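
One way to avoid losing in-memory edits is to load into a separate name and merge, rather than rebinding scores. This is only a sketch under assumptions: the in-memory `io.BytesIO` buffer stands in for the file, and the names `loaded` and `merged` are hypothetical.

```python
import io
import pickle

saved = [("joe", 1), ("bill", 2), ("betty", 100)]

# An in-memory buffer stands in for the pickle file in this sketch.
buffer = io.BytesIO()
pickle.dump(saved, buffer)

scores = list(saved)
scores.append(("mary", 1000))  # edit made after the dump

buffer.seek(0)
loaded = pickle.load(buffer)   # bound to a separate name, so scores survives

# Keep everything that was saved, then add entries that exist only in memory.
merged = loaded + [s for s in scores if s not in loaded]
print(merged)
```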
Mike McKerns
  • Thanks also for your input. dogwynn's solution worked very well but will take on board what you've said. – Charlie Jan 22 '15 at 05:29
  • Why is pickle imported as dill here? – Trect Nov 02 '18 at 09:11
  • @DheerajMPai: because I use `dill` instead of `pickle`, so that's what I used for my solution. `pickle` also works, so instead of editing my code, I just changed the import. It's all pickling... `dill` is just better at it. – Mike McKerns Nov 03 '18 at 18:06
12

You need to pull the list from your database (i.e. your pickle file) first before appending to it.

import pickle
import os

high_scores_filename = 'high_scores.dat'

scores = []

# first time you run this, "high_scores.dat" won't exist
#   so we need to check for its existence before we load 
#   our "database"
if os.path.exists(high_scores_filename):
    # "with" statements are very handy for opening files. 
    with open(high_scores_filename,'rb') as rfp: 
        scores = pickle.load(rfp)
    # Notice that there's no "rfp.close()"
    #   ... the "with" clause calls close() automatically! 

first_name = input("Please enter your name:")
score = input("Please enter your score:")

high_scores = first_name, score
scores.append(high_scores)

# Now we "sync" our database
with open(high_scores_filename,'wb') as wfp:
    pickle.dump(scores, wfp)

# Re-load our database
with open(high_scores_filename,'rb') as rfp:
    scores = pickle.load(rfp)

print(scores)
dogwynn
  • Thank you also for the extra explanation you have provided with the code. – Charlie Jan 21 '15 at 23:02
  • @dogwynn: Why check if the file exists if you just load to a new variable, and avoid the real issue? – Mike McKerns Jan 21 '15 at 23:15
  • @MikeMcKerns: I guess I don't know how to call pickle.load from a non-existent file. I know you can do this with DBv2 API modules (e.g. shelve), but I'm not aware of a file-like object (which pickle.load requires) that can be opened in a "create" mode (without creating one). Otherwise, my goal was to have a script that can be executed repeatedly from the command line, each time adding a new (name,score) tuple. – dogwynn Jan 22 '15 at 00:06
  • @Charlie: My pleasure. Always good to see another Python hacker on the books. :-) – dogwynn Jan 22 '15 at 00:10
  • @dogwynn: mode `ab` instead of `wb` will create a file if one doesn't exist, and append if one does. Besides, I wasn't suggesting you should try to `load` from a nonexistent file -- merely pointing out that the OP's issue is more generally one of unpickling into an object that has the same name as the existing list -- and thus destroying any edits to the existing list that were made after the `dump`. – Mike McKerns Jan 22 '15 at 01:09
  • @MikeMcKerns I'm not sure how 'ab' is any better than 'wb'. They both create a file if none exists, and you can't append to a pickled object/list the way you would append to a file. The object has to be unpickled, appended to, and re-pickled. The OP's problem was that they were "running the code" (i.e. the whole set of code as a script), and it wasn't behaving correctly. That was because they weren't loading the extant scores before appending new scores. I addressed that. – dogwynn Jan 23 '15 at 13:05
  • @M.D.P Not sure why. That error is usually associated with reading a file object open for writing. [See this answer](https://stackoverflow.com/a/44901902/1869370) for more details. – dogwynn Dec 21 '18 at 18:48
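
The point debated in the comments above, that appending with 'ab' does not extend the pickled list itself, can be checked directly. A minimal sketch, assuming three runs of the question's script are simulated in a loop and written to a hypothetical temporary file: each run appends a separate one-element list, and each load returns only the next of those lists.

```python
import os
import pickle
import tempfile

# Throwaway file so the sketch does not touch real data (hypothetical path).
path = os.path.join(tempfile.mkdtemp(), "high_scores.dat")

# Simulate three separate runs of the question's script.
for name, score in [("ann", "10"), ("bob", "20"), ("cat", "30")]:
    scores = [(name, score)]      # every run starts from a fresh list
    with open(path, "ab") as f:   # and appends a new pickle to the file
        pickle.dump(scores, f)

with open(path, "rb") as f:
    first = pickle.load(f)        # only the first pickled list
    second = pickle.load(f)       # a further load returns the next one

print(first)
print(second)
```

So the file holds several independent pickles rather than one growing list, which is why the list has to be loaded, appended to, and dumped again with 'wb' to keep a single list.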
0

This doesn't actually answer the question, but if anyone would like to add items one at a time to a pickled list, you can do it like this:

import pickle
import os

high_scores_filename = '/home/ubuntu-dev/Desktop/delete/high_scores.dat'

scores = []

# first time you run this, "high_scores.dat" won't exist
#   so we need to check for its existence before we load
#   our "database"
if os.path.exists(high_scores_filename):
    # "with" statements are very handy for opening files.
    with open(high_scores_filename,'rb') as rfp:
        scores = pickle.load(rfp)
    # Notice that there's no "rfp.close()"
    #   ... the "with" clause calls close() automatically!

names = ["mike", "bob", "joe"]

for name in names:
    high_score = name
    print(name)
    scores.append(high_score)

# Now we "sync" our database
with open(high_scores_filename,'wb') as wfp:
    pickle.dump(scores, wfp)

# Re-load our database
with open(high_scores_filename,'rb') as rfp:
    scores = pickle.load(rfp)

print(scores)
CENTURION
0

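Instead of pickle, you can use h5py, which also supports appending to stored data:

import h5py

# X_train_data, X_test_data, Y_train_data and Y_test_data are the new arrays to append.
# The datasets must already exist in the file and be resizable along axis 0.
with h5py.File(r'.\PreprocessedData.h5', 'a') as hf: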
    hf["X_train"].resize((hf["X_train"].shape[0] + X_train_data.shape[0]), axis = 0)
    hf["X_train"][-X_train_data.shape[0]:] = X_train_data

    hf["X_test"].resize((hf["X_test"].shape[0] + X_test_data.shape[0]), axis = 0)
    hf["X_test"][-X_test_data.shape[0]:] = X_test_data


    hf["Y_train"].resize((hf["Y_train"].shape[0] + Y_train_data.shape[0]), axis = 0)
    hf["Y_train"][-Y_train_data.shape[0]:] = Y_train_data

    hf["Y_test"].resize((hf["Y_test"].shape[0] + Y_test_data.shape[0]), axis = 0)
    hf["Y_test"][-Y_test_data.shape[0]:] = Y_test_data


Trect
  • Why not use `hickle` or `klepto` instead? They are both built to give you a simple `dump` and `load` pickle-equivalent syntax for HDF5. If you replace `dill` in my answer with `hickle`, I believe it should work, and store as HDF5. – Mike McKerns Nov 03 '18 at 18:12
  • Yes, it works. Thanks for the info. I did not know about hickle. – Trect Nov 04 '18 at 18:01
0

Code:

import pickle


#adding an empty list to the file
score=[]
with open("high_scores.dat","wb") as f:
    pickle.dump(score,f)


def adding_new_score(f_name, score):
    newScore = (f_name, score)

    # reading the file
    with open("high_scores.dat", "rb") as f:
        # storing that in a list
        new_score = pickle.load(f)

    # appending to that list
    new_score.append(newScore)

    # again writing the appended list to the file
    with open("high_scores.dat", "wb") as f:
        pickle.dump(new_score, f)

    # reading all the scores
    with open("high_scores.dat", "rb") as f:
        print(pickle.load(f))

# whenever a new score comes

first_name = input("Please enter your name:")
score = input("Please enter your score:")
adding_new_score(first_name, score)

Output:

Please enter your name:p
Please enter your score:94
[('p', '94')]
Please enter your name:s
Please enter your score:24
[('p', '94'), ('s', '24')]

You can put the adding_new_score inside a loop, so that every score will be added to the file.

Note that if you re-run the program, the file will become empty again, because the script starts by writing an empty list. To keep the saved scores, simply remove

score=[]
with open("high_scores.dat","wb") as f:
    pickle.dump(score,f)

when you run it the second time and afterwards.
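
Rather than editing the script after the first run, the initialisation could be made conditional on the file's existence. The following is a sketch under assumptions, not a definitive implementation: the function name `add_score` and the temporary path are hypothetical, and fixed sample entries replace the calls to input() so the example runs unattended.

```python
import os
import pickle
import tempfile


def add_score(path, name, score):
    """Load the saved list (if any), append one entry, and write it back."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            scores = pickle.load(f)
    else:
        scores = []  # first run: nothing has been saved yet
    scores.append((name, score))
    with open(path, "wb") as f:
        pickle.dump(scores, f)
    return scores


# Throwaway file so the sketch does not touch real data (hypothetical path).
path = os.path.join(tempfile.mkdtemp(), "high_scores.dat")

# Calling the function in a loop adds every score to the same file.
for name, score in [("p", "94"), ("s", "24")]:
    result = add_score(path, name, score)

print(result)
```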