305

I'm used to bringing data in and out of Python using CSV files, but there are obvious challenges to this. Are there simple ways to store a dictionary (or sets of dictionaries) in a JSON or pickle file?

For example:

data = {}
data['key1'] = "keyinfo"
data['key2'] = "keyinfo2"

I would like to know both how to save this, and then how to load it back in.

Peter Mortensen
mike
  • 8
    Have you read the documentation for the [json](http://docs.python.org/library/json.html) or [pickle](http://docs.python.org/library/pickle.html) standard modules? – Greg Hewgill Aug 17 '11 at 22:09
  • 1
    See [Save a dictionary to a file (alternative to pickle) in Python?](http://stackoverflow.com/q/4893689/562769) – Martin Thoma Apr 29 '16 at 09:07

10 Answers

612

Pickle save:

try:
    import cPickle as pickle
except ImportError:  # Python 3.x
    import pickle

with open('data.p', 'wb') as fp:
    pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)

See the pickle module documentation for additional information regarding the protocol argument.

Pickle load:

with open('data.p', 'rb') as fp:
    data = pickle.load(fp)

JSON save:

import json

with open('data.json', 'w') as fp:
    json.dump(data, fp)

Supply extra arguments, like sort_keys or indent, to get a pretty result. The argument sort_keys sorts the keys alphabetically, and indent=N pretty-prints the data structure with N spaces per nesting level.

json.dump(data, fp, sort_keys=True, indent=4)

JSON load:

with open('data.json', 'r') as fp:
    data = json.load(fp)
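
As a quick check of the snippets above, here is a minimal round-trip sketch. The temporary directory and the variable names from_pickle and from_json are illustrative assumptions, not part of the original answer:

```python
import json
import os
import pickle
import tempfile

data = {'key1': 'keyinfo', 'key2': 'keyinfo2'}

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmpdir:
    pickle_path = os.path.join(tmpdir, 'data.p')
    json_path = os.path.join(tmpdir, 'data.json')

    # Pickle is a binary format: open the file in 'wb' / 'rb' mode.
    with open(pickle_path, 'wb') as fp:
        pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)
    with open(pickle_path, 'rb') as fp:
        from_pickle = pickle.load(fp)

    # JSON is a text format: open the file in 'w' / 'r' mode.
    with open(json_path, 'w') as fp:
        json.dump(data, fp, sort_keys=True, indent=4)
    with open(json_path, 'r') as fp:
        from_json = json.load(fp)

print(from_pickle == data)
print(from_json == data)
```

Keep in mind that pickle.load can execute arbitrary code, so only unpickle files that come from a source you trust.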
Peter Mortensen
Marty
  • 8
    JSON does dictionaries natively (though they obviously don't behave exactly as a python dictionary does while in memory, for persistence purposes, they are identical). In fact, the foundational unit in json is the "Object", which is defined as { : }. Look familiar? The json module in the standard library supports every Python native type and can easily be extended with a minimal knowledge of json to support user-defined classes. [The JSON homepage](http://www.json.org/) completely defines the language in just over 3 printed pages, so it's easy to absorb/digest quickly. – Jonathanb Aug 17 '11 at 23:47
  • 2
    It's worth knowing about the third argument to `pickle.dump`, too. If the file doesn't need to be human-readable then it can speed things up a lot. – Steve Jessop Aug 18 '11 at 00:11
  • Seems like this doesn't work out-of-the-box with 3.x. Which part has to be casted? – Dror Feb 16 '15 at 12:23
  • Why are you using binary mode for the JSON files here? JSON is a text format. – Martijn Pieters Feb 16 '15 at 12:51
  • @Dror: The `json` library produces text (type `str`) data, so the files should not be opened in binary mode. – Martijn Pieters Feb 16 '15 at 12:52
  • @MartijnPieters, and Dror: I'm fairly certain that was a poorly executed copy and paste. Fixed in the most recent edit, thanks for pointing out the problem. – Marty May 23 '15 at 06:16
  • 16
    If you add **sort_keys** and **indent** arguments to the dump call you get a much prettier result. e.g.: `json.dump(data, fp, sort_keys=True, indent=4)`. More info can be found [here](https://docs.python.org/2/library/json.html) – juliusmh Mar 10 '16 at 13:31
  • 1
    You should probably use `pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)` – Martin Thoma Apr 29 '16 at 08:59
  • @Marty - Is there any advantage of using `pickle` vs `json`? `json` is a readable file instead of a byte stream; size of resulting file, etc? – Matteo Sep 09 '16 at 17:10
  • @Matteo JSON won't work with dicts that have tuples as keys, e.g. { (1,2): "abc" }, but JSON files are readable by people. Pickle will save dicts that have tuples as keys, but the files are not as readable by people as JSON. – dvdhns Oct 27 '16 at 21:13
  • 2
    For python 3, use `import pickle` – Melroy van den Berg Aug 15 '17 at 19:21
  • Is it only me or is saving the json really slow? Trying to store approx 100mb json. – WJA Jul 23 '19 at 12:49
  • There are multiple ways of importing a dictionary. Importing variable from different file, using json, and using pickle. What are the differences? – haneulkim Dec 17 '19 at 01:59
  • I've tried to check which is better in memory usage, for me pickle was 5 kb and json was 15 kb. If it helps anyone. – Alon Samuel Jan 03 '20 at 11:44
  • Wrt to pickle, it's a pretty sad state of affairs that the word `security` does not appear anywhere in this page. https://security.stackexchange.com/questions/183966/safely-load-a-pickle-file – JL Peyret Mar 24 '20 at 21:11
  • Note: You cannot use the json method to save a dictionary object multiple times, since the file is opened with `w`, which overwrites the contents with new content every time those lines are executed (the same goes for pickle, I guess; I didn't really try that one) – ansh sachdeva Jun 12 '20 at 19:09
  • @VineeshTP maybe [this](https://stackoverflow.com/a/62351352/7500651) will help – ansh sachdeva Jun 12 '20 at 19:44
  • @anshsachdeva JSON.stringify(result_dict, null, 2); '2' worked for me. – Vineesh TP Jun 14 '20 at 04:02
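
The comment above about tuple keys can be illustrated with a short sketch. The repr-based workaround at the end is only one possible approach and an assumption of this example, not something proposed in the thread:

```python
import json
import pickle

data = {(1, 2): "abc"}

# json.dumps only accepts keys that are str, int, float, bool or None,
# so a tuple key raises TypeError.
try:
    json.dumps(data)
    json_ok = True
except TypeError:
    json_ok = False

# pickle round-trips the same dictionary unchanged.
restored = pickle.loads(pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL))

# Hypothetical workaround: turn each tuple key into a string first.
# The keys then come back as strings, not tuples.
as_text = json.dumps({repr(key): value for key, value in data.items()})

print(json_ok)
print(restored == data)
print(as_text)
```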
50

Minimal example, writing directly to a file:

import json
json.dump(data, open(filename, 'w'))  # text mode: json writes str, not bytes
data = json.load(open(filename))

or safely opening / closing:

import json
with open(filename, 'w') as outfile:  # text mode: json writes str, not bytes
    json.dump(data, outfile)
with open(filename) as infile:
    data = json.load(infile)

If you want to save it in a string instead of a file:

import json
json_str = json.dumps(data)
data = json.loads(json_str)
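
One thing to watch for with a JSON round trip: JSON object keys are always strings, so non-string keys do not come back with their original type. A minimal sketch (the sample dictionary is an assumption chosen for illustration):

```python
import json

data = {1: 'one', 'key1': 'keyinfo'}

json_str = json.dumps(data)      # the int key 1 is written as the string "1"
restored = json.loads(json_str)  # ...and stays a string when loaded

print(restored)
print(restored == data)
```

If the key types matter, convert them back after loading or use pickle instead.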
Alexander
agf
9

Also see ujson, a faster third-party JSON package:

import ujson

with open('data.json', 'w') as fp:  # text mode, as with the json module
    ujson.dump(data, fp)
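
Since ujson is a third-party package and may not be installed, one option is to fall back to the standard library. A minimal sketch, assuming the alias json_impl (a name made up for this example) and an in-memory buffer in place of a real file:

```python
import io

try:
    import ujson as json_impl  # optional third-party package
except ImportError:
    import json as json_impl   # standard-library fallback

data = {'key1': 'keyinfo', 'key2': 'keyinfo2'}

# An in-memory text buffer stands in for a file opened in text mode.
buffer = io.StringIO()
json_impl.dump(data, buffer)

buffer.seek(0)
loaded = json_impl.load(buffer)

print(loaded == data)
```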
Peter Mortensen
Elliott
7

To write to a file:

import json
myfile.write(json.dumps(mydict))

To read from a file:

import json
mydict = json.loads(myfile.read())

Here myfile is an open file object for the file that stores the dict: opened in write mode for saving and in read mode for loading.

Rafe Kettler
5

If you want an alternative to pickle or json, you can use klepto.

>>> init = {'y': 2, 'x': 1, 'z': 3}
>>> import klepto
>>> cache = klepto.archives.file_archive('memo', init, serialized=False)
>>> cache        
{'y': 2, 'x': 1, 'z': 3}
>>>
>>> # dump dictionary to the file 'memo.py'
>>> cache.dump() 
>>> 
>>> # import from 'memo.py'
>>> from memo import memo
>>> print(memo)
{'y': 2, 'x': 1, 'z': 3}

With klepto, if you had used serialized=True, the dictionary would have been written to memo.pkl as a pickled dictionary instead of with clear text.

You can get klepto here: https://github.com/uqfoundation/klepto

dill is probably a better choice for pickling than pickle itself, as dill can serialize almost anything in Python. klepto can also use dill.

You can get dill here: https://github.com/uqfoundation/dill

The additional mumbo-jumbo on the first few lines is there because klepto can be configured to store dictionaries to a file, to a directory context, or to a SQL database. The API is the same whichever backend archive you choose. It gives you an "archivable" dictionary with which you can use load and dump to interact with the archive.

Mike McKerns
5

If you're after serialization, but won't need the data in other programs, I strongly recommend the shelve module. Think of it as a persistent dictionary.

import shelve

myData = shelve.open('/path/to/file')

# Check for values.
keyVar in myData

# Set values
myData[anotherKey] = someValue

# Save the data for future use.
myData.close()
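
A runnable version of the same idea might look like the sketch below. The temporary directory, the path name and the stored values are assumptions for illustration; note that shelf keys must be strings, while values can be any picklable object:

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'mydata')

    # First session: store values. Leaving the with block closes the
    # shelf, which writes the data to disk.
    with shelve.open(path) as db:
        db['key1'] = 'keyinfo'
        db['nested'] = {'x': 1, 'y': [1, 2, 3]}

    # Second session: reopen the shelf and read the values back.
    with shelve.open(path) as db:
        has_key = 'key1' in db
        restored = dict(db)

print(has_key)
print(restored)
```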
Peter Mortensen
g.d.d.c
  • 2
    If you want to store a whole dict, or load a whole dict, `json` is more convenient. `shelve` is only better for accessing one key at a time. – agf Aug 17 '11 at 22:15
4

For completeness, we should include ConfigParser and configparser which are part of the standard library in Python 2 and 3, respectively. This module reads and writes to a config/ini file and (at least in Python 3) behaves in a lot of ways like a dictionary. It has the added benefit that you can store multiple dictionaries into separate sections of your config/ini file and recall them. Sweet!

Python 2.7.x example.

import ConfigParser

config = ConfigParser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# Make each dictionary a separate section in the configuration
config.add_section('dict1')
for key in dict1.keys():
    config.set('dict1', key, dict1[key])
   
config.add_section('dict2')
for key in dict2.keys():
    config.set('dict2', key, dict2[key])

config.add_section('dict3')
for key in dict3.keys():
    config.set('dict3', key, dict3[key])

# Save the configuration to a file
with open('config.ini', 'w') as f:
    config.write(f)

# Read the configuration from a file
config2 = ConfigParser.ConfigParser()
config2.read('config.ini')

dictA = {}
for item in config2.items('dict1'):
    dictA[item[0]] = item[1]

dictB = {}
for item in config2.items('dict2'):
    dictB[item[0]] = item[1]

dictC = {}
for item in config2.items('dict3'):
    dictC[item[0]] = item[1]

print(dictA)
print(dictB)
print(dictC)

Python 3.X example.

import configparser

config = configparser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# Make each dictionary a separate section in the configuration
config['dict1'] = dict1
config['dict2'] = dict2
config['dict3'] = dict3

# Save the configuration to a file
with open('config.ini', 'w') as f:
    config.write(f)

# Read the configuration from a file
config2 = configparser.ConfigParser()
config2.read('config.ini')

# ConfigParser objects are a lot like dictionaries, but if you really
# want a dictionary you can ask it to convert a section to a dictionary
dictA = dict(config2['dict1'])
dictB = dict(config2['dict2'])
dictC = dict(config2['dict3'])

print(dictA)
print(dictB)
print(dictC)

Console output

{'key2': 'keyinfo2', 'key1': 'keyinfo'}
{'k1': 'hot', 'k2': 'cross', 'k3': 'buns'}
{'z': '3', 'y': '2', 'x': '1'}

Contents of config.ini

[dict1]
key2 = keyinfo2
key1 = keyinfo

[dict2]
k1 = hot
k2 = cross
k3 = buns

[dict3]
z = 3
y = 2
x = 1
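
As the console output shows, the integers in dict3 come back as strings, because a config/ini file stores everything as text. A minimal Python 3 sketch of converting them back with getint, using an in-memory buffer instead of config.ini (the buffer and the names as_strings and as_ints are assumptions for illustration):

```python
import configparser
import io

config = configparser.ConfigParser()
config['dict3'] = {'x': 1, 'y': 2, 'z': 3}  # values are stored as strings

# Write the configuration to an in-memory buffer instead of a real file.
buffer = io.StringIO()
config.write(buffer)

config2 = configparser.ConfigParser()
config2.read_string(buffer.getvalue())

# A plain dict() conversion keeps the values as strings...
as_strings = dict(config2['dict3'])

# ...while getint converts each value back to an integer.
as_ints = {key: config2.getint('dict3', key) for key in config2['dict3']}

print(as_strings)
print(as_ints)
```

getfloat and getboolean work the same way for other value types.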
Peter Mortensen
bfris
3

If you want to save to a JSON file, an easy way of doing this is (data is the dictionary to save):

import json
with open("file.json", "wb") as f:
    f.write(json.dumps(data).encode("utf-8"))
Peter Mortensen
Adam Liu
  • why is this easier than `json.dump( )` as outlined in the other answer? – baxx Apr 20 '20 at 10:14
  • @baxx The answer for your question probably is file writing mode (ref: https://stackoverflow.com/a/33860227/5595995) – Cloud Cho Apr 22 '23 at 00:10
0

My use case was to save multiple JSON objects to a file, and Marty's answer helped me somewhat. But it did not fully cover my use case, as it would overwrite the old data every time a new entry was saved.

To save multiple entries in a file, one must check for the old content (i.e., read before write). A typical file holding JSON data will have either a list or an object as its root. So I decided that my JSON file always holds a list of objects, and every time I add data to it, I simply load the list first, append my new data to it, and dump it back to the file opened in write mode (w):

import json

def saveJson(url, sc):  # Append the two values to the JSON list in the file
    newdata = {'url': url, 'sc': sc}
    json_path = "db/file.json"

    with open(json_path) as myfile:  # Read the existing list first
        old_list = json.load(myfile)
    old_list.append(newdata)

    with open(json_path, "w") as myfile:  # Overwrite the whole content
        json.dump(old_list, myfile, sort_keys=True, indent=4)

    return "success"

The new JSON file will look something like this:

[
    {
        "sc": "a11",
        "url": "www.google.com"
    },
    {
        "sc": "a12",
        "url": "www.google.com"
    },
    {
        "sc": "a13",
        "url": "www.google.com"
    }
]

NOTE: This approach requires an existing file named file.json that contains [] as its initial data.

PS: Not related to the original question, but this approach could be improved further by first checking whether the entry already exists (based on one or more keys) and appending and saving only if it does not.
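
A sketch of the same read-append-write idea that also tolerates a missing file, so no initial file with [] is needed. The function name append_json and the temporary path are assumptions for illustration, not part of the answer above:

```python
import json
import os
import tempfile

def append_json(json_path, entry):
    """Append entry to the JSON list stored at json_path and return the list."""
    try:
        with open(json_path) as infile:  # Read the existing list first
            entries = json.load(infile)
    except FileNotFoundError:
        entries = []  # No file yet: start with an empty list

    entries.append(entry)

    with open(json_path, 'w') as outfile:  # Overwrite the whole content
        json.dump(entries, outfile, sort_keys=True, indent=4)
    return entries

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'file.json')
    append_json(path, {'url': 'www.google.com', 'sc': 'a11'})
    result = append_json(path, {'url': 'www.google.com', 'sc': 'a12'})

print(len(result))
```

Because the whole list is rewritten on every call, this only suits small files.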

Peter Mortensen
ansh sachdeva
0

Shorter code

Saving and loading most Python objects (including dictionaries) with one line of code each.

import pickle

data = {'key1': 'keyinfo', 'key2': 'keyinfo2'}

saving:

pickle.dump(data, open('path/to/file/data.pickle', 'wb'))

loading:

data_loaded = pickle.load(open('path/to/file/data.pickle', 'rb'))

Maybe it's obvious, but I used the two-line solution in the top answer for quite a while before I tried to make it shorter. Note that these one-liners rely on the interpreter to close the files.

Thomas R