41

Let's say I have something like this:

d = { "abc" : [1, 2, 3], "qwerty" : [4,5,6] }

What's the easiest way to programmatically get that into a file that I can load from Python later?

Can I somehow save it as Python source (from within a Python script, not manually!), then import it later?

Or should I use JSON or something?

Blorgbeard
  • 1
    Here's a couple more: [dataset](https://dataset.readthedocs.org/en/latest/) and [jsonpickle](http://jsonpickle.github.io). – zekel Mar 26 '16 at 16:05
  • The easiest way would be JSON, because it structures data much like a Python dictionary. Luckily, Python has a bundled JSON module; all you need to do is `import json`. – Willy satrio nugroho Dec 25 '20 at 03:55

7 Answers

70

Use the pickle module.

import pickle

d = { "abc" : [1, 2, 3], "qwerty" : [4,5,6] }

# write the dictionary to a file (binary mode is required in Python 3)
with open(r'C:\d.pkl', 'wb') as afile:
    pickle.dump(d, afile)

# reload object from file
with open(r'C:\d.pkl', 'rb') as file2:
    new_d = pickle.load(file2)

# print dictionary object loaded from file
print(new_d)
eric.christensen
  • 2
    What's the r in front of the path mean? – Blorgbeard Jun 26 '09 at 04:49
  • Also, that's giving me "TypeError: can't write bytes to text stream" - is it any different for Python 3.0? – Blorgbeard Jun 26 '09 at 04:51
  • 2
    The r'' denotes a raw string, described here: http://docs.python.org/reference/lexical_analysis.html#string-literals. Basically, it means that backslashes in the string are included as literal backslashes, not character escapes (though a raw string can't end in a backslash). – Miles Jun 26 '09 at 05:01
  • 1
    I've corrected the example—the file needs to be opened in binary mode. It still needs to be for Python 2, but it won't fail as dramatically. – Miles Jun 26 '09 at 05:02
  • 1
    Make sure you read the Python documentation (including for the appropriate version) and don't just rely on examples! :) http://docs.python.org/3.0/library/pickle.html (Sorry for the comment spam!) – Miles Jun 26 '09 at 05:05
  • 1
    I doubt that, since your original example didn't open afile in write mode. ;) But as for the binary mode, in Python 2, it might work (since the binary flag has basically no effect on Linux and OS X) but is non-portable and can run into trouble on Windows if the resulting file contains newline or DOS EOF characters. – Miles Jun 26 '09 at 05:22
  • 2
    Technically pickling will work for text mode files, so long as you're not using a binary pickle format (ie. protocol = 0) and you use it consistently (ie. also use text mode for reading back). Using binary is generally a better idea though, especially if you could be moving data between platforms. – Brian Jun 26 '09 at 06:49
15

Take your pick: Python Standard Library - Data Persistence. Which one is most appropriate depends on your specific needs.

pickle is probably the simplest and most capable as far as "write an arbitrary object to a file and recover it" goes—it can automatically handle custom classes and circular references.

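For the best pickling performance (speed and space), use cPickle with protocol=HIGHEST_PROTOCOL. (cPickle exists only in Python 2; in Python 3, the plain pickle module uses the C implementation automatically when it is available.)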
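
As a rough sketch of the HIGHEST_PROTOCOL advice on Python 3 (where plain `pickle` takes the place of `cPickle`), something like the following should work. The temporary directory and the file name `d.pkl` are assumptions made for illustration, not part of the original answer.

```python
import os
import pickle
import tempfile

d = {"abc": [1, 2, 3], "qwerty": [4, 5, 6]}

# Assumption for this sketch: write into a temporary directory that is
# removed again when the "with" block ends.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "d.pkl")

    # Binary mode is required. HIGHEST_PROTOCOL selects the newest pickle
    # format that the running Python version supports.
    with open(path, "wb") as f:
        pickle.dump(d, f, protocol=pickle.HIGHEST_PROTOCOL)

    # Read the object back, again in binary mode.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == d)
```

Note that a file written with the highest protocol may not be readable by older Python versions, so a lower protocol can be the safer choice when the file has to travel.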

Miles
8

Try the shelve module, which gives you a persistent dictionary. For example:

import shelve
d = { "abc" : [1, 2, 3], "qwerty" : [4,5,6] }

shelf = shelve.open('shelf_file')
for key in d:
    shelf[key] = d[key]

shelf.close()

....

# reopen the shelf
shelf = shelve.open('shelf_file')
print(dict(shelf)) # => {'qwerty': [4, 5, 6], 'abc': [1, 2, 3]} (key order may vary)
shelf.close()
mhawke
5

JSON has faults, but when it meets your needs, it is also:

  • simple to use
  • included in the standard library as the json module
  • interface somewhat similar to pickle, which can handle more complex situations
  • human-editable text for debugging, sharing, and version control
  • valid Python code
  • well-established on the web (if your program touches any of that domain)
  • 6
    JSON ain't valid Python. It looks so, superficially, but use some bools and you'll see the problem (JSON uses true and false, while Python uses True and False). Also: JSON objects (dicts) only have string keys. So it doesn't preserve the data structure correctly. – Jürgen A. Erhard Jun 17 '13 at 08:07
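
To make the JSON option and its caveats concrete, here is a minimal sketch using only the standard `json` module. The `tricky` dictionary is a made-up example chosen to show the bool and key-type issues mentioned in the comment above.

```python
import json

d = {"abc": [1, 2, 3], "qwerty": [4, 5, 6]}

# String keys and lists of ints survive a round trip unchanged.
text = json.dumps(d, indent=2)
restored = json.loads(text)
print(restored == d)

# Hypothetical data showing where JSON and Python differ:
# the int key becomes a string, and True/None are written as true/null.
tricky = {1: True, "nothing": None}
encoded = json.dumps(tricky)
print(encoded)   # {"1": true, "nothing": null}

decoded = json.loads(encoded)
print(decoded)   # {'1': True, 'nothing': None}
```

To write to a file instead of a string, `json.dump(d, f)` and `json.load(f)` take a file object opened in text mode.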
5

As your data gets more complex, you might also want to take a look at Zope's Object Database :-) It's probably overkill for what you have, but it scales well and is not too hard to use.

Jay Atkinson
3

Just to add to the previous suggestions, if you want the file format to be easily readable and modifiable, you can also use YAML. It works extremely well for nested dicts and lists, but scales for more complex data structures (i.e. ones involving custom objects) as well, and its big plus is that the format is readable.

Eli Bendersky
1

If you want to save it in an easy to read JSON-like format, use repr to serialize the object and eval to deserialize it.

repr(object) -> string

Return the canonical string representation of the object. For most object types, eval(repr(object)) == object.

John Kugelman
  • 1
    Consider ast.literal_eval() (http://docs.python.org/library/ast.html#ast.literal_eval) as an alternative to eval(). – Miles Jun 26 '09 at 04:33
  • The main thing I don't like about this solution is that if you have an object in the structure where the eval(repr()) identity doesn't hold, repr() will "succeed" but then eval() will barf. – Miles Jun 26 '09 at 04:37
  • @John You will be pilloried for that answer... where's S.Lott? – mhawke Jun 26 '09 at 04:44
  • 3
    pickle, YAML, JSON, etc. are all safer and work with more types than this method. IMO, eval() should be avoided whenever possible. – Jason C Jun 26 '09 at 05:05
  • Heh I should've known to put on my asbestos pants before suggesting eval! It's a fair cop. – John Kugelman Jun 26 '09 at 05:13
  • 4
    @Jason: Actually, pickle is not any safer than eval - malicious input can execute code just as easily, and here at least it is obvious that it is doing so, so I think downvoting this is a little unfair. There are other reasons to avoid eval() (e.g. it only handles objects with evalable repr()s and silently loses data if they don't self-eval, as Miles pointed out), but security-wise, it's no worse than pickle. – Brian Jun 26 '09 at 06:57
  • @Brian: Good point, I had not considered that. But it is the case that, of the alternatives I list, pickle and YAML work with more data types than repr()/eval(), and YAML and JSON are safer. So I still think eval() is a bad idea here. – Jason C Jun 26 '09 at 14:46
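
For reference, the repr() approach combined with the ast.literal_eval() suggestion from the comments might look like the sketch below. It only covers plain literals (dicts, lists, strings, numbers, booleans, None), and the rejected input string is a made-up example.

```python
import ast

d = {"abc": [1, 2, 3], "qwerty": [4, 5, 6]}

# Serialize to a Python-literal string, then parse it back without eval().
text = repr(d)
restored = ast.literal_eval(text)
print(restored == d)

# Unlike eval(), literal_eval() refuses anything that is not a plain literal,
# such as this (hypothetical) function call smuggled into the input.
rejected = False
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    rejected = True
print(rejected)
```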