I've parsed a big corpus and saved the data I needed in a dictionary. At the end of my code I wrote it to a .txt file because I needed to check something manually. Now, in another part of my work, I need that dictionary as input. Is there a way other than opening the text file and rebuilding the dictionary from it? Ideally I would change my earlier code so that it saves the dictionary as it is. Is Pickle the right thing for my case, or am I totally on the wrong track? Sorry if my question is naive; I'm really new to Python and still learning it.
-
pickle is the way to go – gkusner Jul 31 '14 at 12:44
-
`json` may be more appropriate if human readability is important. You can't "manually check something" by opening a pickle file in Notepad. – Kevin Jul 31 '14 at 12:45
-
@json no no, I did it. I mean I've parsed the corpus and now I have my dictionary as a text file; now that I have it, that's enough. I can keep it, and maybe I can rewrite my other code to parse the corpus once more and keep the dictionary as it is for other parts of my work. I don't know if I'm being clear: the manual check was a one-time thing and I've got what I wanted. – Pari Jul 31 '14 at 12:48
-
I just don't want to read a text file as input again and convert it to a dictionary, 'cause it seems complicated to me. – Pari Jul 31 '14 at 12:51
-
What do you mean by "now I have my dictionary as text file"? How did you save the dictionary? – Jul 31 '14 at 12:55
-
@Tichodroma as a .txt file. Now I want to change my code and make it save the dictionary as it is. – Pari Jul 31 '14 at 13:00
-
There is no standard way to save a dictionary as a text file in Python. So *how* did you do it? – Jul 31 '14 at 13:06
1 Answer
Copied from the question "Pickle or json?" for ease of reading:
If you do not have any interoperability requirements (i.e. you're just going to use the data with Python), and a binary format is fine, go with cPickle, which gives you really fast Python object serialization.
If you want interoperability, or you want a text format to store your data, go with JSON (or some other appropriate format depending on your constraints).
According to the above, I guess you would like `cPickle` over `json`.
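To make the two options concrete, here is a minimal sketch of saving a dictionary and loading it back with both modules. The dictionary contents and file names are made up for illustration, and the code assumes Python 3, where the fast C implementation is used automatically by `pickle` (there is no separate `cPickle` module).

```python
import json
import os
import pickle
import tempfile

# Hypothetical stand-in for the dictionary built from the corpus.
word_counts = {"the": 120, "corpus": 7, "python": 3}

# Hypothetical file locations; a temporary directory keeps the sketch tidy.
workdir = tempfile.mkdtemp()
pickle_path = os.path.join(workdir, "word_counts.pkl")
json_path = os.path.join(workdir, "word_counts.json")

# Pickle: binary, Python-only, not readable in a text editor.
with open(pickle_path, "wb") as f:
    pickle.dump(word_counts, f, protocol=pickle.HIGHEST_PROTOCOL)
with open(pickle_path, "rb") as f:
    from_pickle = pickle.load(f)

# JSON: plain text, readable by people and by other languages.
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(word_counts, f, indent=2, sort_keys=True)
with open(json_path, "r", encoding="utf-8") as f:
    from_json = json.load(f)

print(from_pickle == word_counts, from_json == word_counts)
```

Either way the second script only needs the `load` half. One caveat with JSON: dictionary keys are always stored as strings, so a dictionary keyed by integers comes back with string keys, and tuple keys are rejected. Also, only unpickle files you created yourself, since loading a pickle can run arbitrary code.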
However, I found another interesting article, http://kovshenin.com/2010/pickle-vs-json-which-is-faster/, whose benchmark shows `json` to be a lot faster than `pickle` (the author states in the article that `cPickle` is faster than `pickle` but still slower than `json`).
The SO answer to "What is faster - Loading a pickled dictionary object or Loading a JSON file - to a dictionary?" compares 6 different libraries:
- pickle
- cPickle
- json
- simplejson
- ujson
- yajl
In addition, if you use pypy, `json` can be really fast.
Finally, some very recent profiling data: https://gist.github.com/schlamar/3134391.
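Since the benchmarks above disagree depending on library, interpreter and data, it may be simplest to time the two standard-library modules on your own data. The following is only a rough sketch using `timeit`; the sample dictionary and the helper name `time_round_trip` are made up for illustration, and the numbers will differ from machine to machine.

```python
import json
import pickle
import timeit

# Hypothetical sample data; substitute your real dictionary here.
sample = {"word%d" % i: i for i in range(1000)}


def time_round_trip(dumps, loads, repeats=200):
    """Return the seconds taken to serialize and deserialize `sample` `repeats` times."""
    return timeit.timeit(lambda: loads(dumps(sample)), number=repeats)


timings = {
    "pickle": time_round_trip(pickle.dumps, pickle.loads),
    "json": time_round_trip(json.dumps, json.loads),
}

# Print the fastest module first.
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print("%-6s %.4f s" % (name, seconds))
```

For a dictionary that is saved once and loaded occasionally, the difference is unlikely to matter in practice.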
-
Thanks for your answer. Actually I don't mind whether it's fast or not; I'm trying to find a clearer way that is easier for a beginner to work with. Now I'm going to review what you have suggested. – Pari Jul 31 '14 at 13:05
-
Wish you the best of luck @Pari! P.S. I would just go with json since it's generally available, human readable and quite performant. – pochen Jul 31 '14 at 13:07