
I want to write a dictionary to a text file. I've seen three methods, and all of them seem to work, but I'm interested in which one is most efficient for reading/writing, especially when the dictionary has many entries, and why.

    new_dict = {}

    new_dict["city"] = "Boston"

    # Writing to the file by string conversion
    with open(r'C:\Users\xy243\Documents\pop.txt', 'w') as new_file:
        new_file.write(str(new_dict))

    # Writing to the file using Pickle

    import pickle
    # pickle writes bytes, so the file must be opened in binary mode ('wb')
    with open(r'C:\Users\xy243\Documents\pop.txt', 'wb') as new_file:
        pickle.dump(new_dict, new_file, protocol=pickle.HIGHEST_PROTOCOL)

    # Writing to the file using JSON

    import json
    with open(r'C:\\Users\xy243\Documents\pop.txt', 'w') as new_file:
        json.dump(new_dict, new_file)




user6235442
  • method 1 is a bit lousy, you should really care about JSON vs pickle – Evgeny Jul 02 '19 at 21:44
  • Why don't you just generate a large file, and try each method with a stopwatch? – Blorgbeard Jul 02 '19 at 21:57
  • In cases like this, you can use the [timeit](https://docs.python.org/3/library/timeit.html) built-in (and, if using ipython or jupyter, the [prun](https://stackoverflow.com/questions/7069733/how-do-i-read-the-output-of-the-ipython-prun-profiler-command) profiling magic) to test and evaluate performance for yourself based on your actual data – G. Anderson Jul 02 '19 at 21:58
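Following the `timeit` suggestion above, here is a minimal benchmarking sketch; the dictionary size, shape, and output filenames are assumptions you would replace with your real data:

```python
import json
import pickle
import timeit

# Hypothetical large dictionary for benchmarking; adjust the size
# and shape to match your real data.
big_dict = {f"key_{i}": i for i in range(100_000)}

def write_str():
    with open("pop_str.txt", "w") as f:
        f.write(str(big_dict))

def write_pickle():
    # pickle writes bytes, so the file must be opened in binary mode
    with open("pop.pkl", "wb") as f:
        pickle.dump(big_dict, f, protocol=pickle.HIGHEST_PROTOCOL)

def write_json():
    with open("pop.json", "w") as f:
        json.dump(big_dict, f)

# Time each method over several runs and print the totals.
for fn in (write_str, write_pickle, write_json):
    print(fn.__name__, timeit.timeit(fn, number=5))
```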

2 Answers


The efficiency question has largely been covered in the comments. That said, if your dataset is large and you expect to reuse this approach, it is worth considering a SQL alternative, made easier in Python with SQLAlchemy. That way you can look up entries quickly while keeping the data neatly in a database.
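As a dependency-free sketch of the database idea, the stdlib `sqlite3` module shows the same pattern (SQLAlchemy layers a higher-level API over engines like this one); the filename, table name, and schema here are assumptions:

```python
import sqlite3

# Example dictionary to persist (assumed shape: string keys and values).
new_dict = {"city": "Boston", "state": "MA"}

conn = sqlite3.connect("pop.db")
conn.execute("CREATE TABLE IF NOT EXISTS pop (key TEXT PRIMARY KEY, value TEXT)")
# dict.items() yields (key, value) tuples, matching the two placeholders.
conn.executemany(
    "INSERT OR REPLACE INTO pop (key, value) VALUES (?, ?)",
    new_dict.items(),
)
conn.commit()

# Individual entries can be looked up without loading the whole file.
row = conn.execute("SELECT value FROM pop WHERE key = ?", ("city",)).fetchone()
print(row[0])  # Boston
conn.close()
```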

Preto

Objects of some Python classes are not JSON serializable. If your dictionary contains such objects as values, then you can't use json.

Likewise, objects of some Python classes cannot be pickled (for example, some keras/tensorflow objects). In that case, you can't use the pickle method either.

In my opinion, more classes fail JSON serialization than fail pickling.

In other words, the pickle method is applicable more widely than json.
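For instance, a Python `set` is a simple value that round-trips through pickle but is rejected by json; a minimal illustration:

```python
import json
import pickle

# A set as a dictionary value: picklable, but not JSON serializable.
data = {"cities": {"Boston", "Austin"}}

try:
    json.dumps(data)
except TypeError as e:
    print("json failed:", e)

# pickle handles the set without any special treatment.
restored = pickle.loads(pickle.dumps(data))
print(restored["cities"] == {"Boston", "Austin"})  # True
```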

Efficiency-wise (assuming your dictionary is both JSON serializable and picklable), pickle will generally win because no string conversion is involved (JSON converts numbers to strings while serializing and back while deserializing).

If you are trying to transport the object to another process or server (especially one written in another programming language, such as Java), then you have to live with json. This applies even if you write to a file and another process reads from that file.
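A quick way to see why json travels better: its output is plain, language-neutral text that any JSON parser can read, while pickle's output is Python-specific bytes. A minimal illustration:

```python
import json
import pickle

new_dict = {"city": "Boston"}

# JSON output is human-readable text.
json_bytes = json.dumps(new_dict).encode()
# pickle output is an opaque, Python-only byte stream.
pickle_bytes = pickle.dumps(new_dict)

print(json_bytes)
print(pickle_bytes)
```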

So ... it depends on your use-case.

Edward Aung