
This is not a duplicate question. I looked around a lot and found this question, but the savez and pickle utilities render the file unreadable by a human. I want to save it in a .txt file which can be loaded back into a python script. So I wanted to know whether there are some utilities in python which can facilitate this task and keep the written file readable by a human.

The dictionary of numpy arrays contains 2D arrays.

EDIT:
Following Craig's answer, I tried the following:

import numpy as np 

W = np.arange(10).reshape(2,5)
b = np.arange(12).reshape(3,4)
d = {'W':W, 'b':b}
with open('out.txt', 'w') as outfile:
    outfile.write(repr(d))

f = open('out.txt', 'r')
d = eval(f.readline())

print(d) 

This gave the following error: SyntaxError: unexpected EOF while parsing.
But the out.txt did contain the dictionary as expected. How can I load it correctly?

EDIT 2: I ran into a problem: Craig's answer truncates the array if it is large. The out.txt shows the first few elements, replaces the middle elements with ..., and shows the last few elements.

1 Answer


Convert the dict to a string using repr() and write that to the text file.

import numpy as np

d = {'a':np.zeros(10), 'b':np.ones(10)}
with open('out.txt', 'w') as outfile:
    outfile.write(repr(d))

You can read it back in and convert to a dictionary with eval():

import numpy as np

with open('out.txt', 'r') as f:
    data = f.read()
data = data.replace('array', 'np.array')
d = eval(data)

Or, you can import array directly from numpy and skip the replace step entirely:

from numpy import array

with open('out.txt', 'r') as f:
    data = f.read()
d = eval(data)
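For a quick sanity check, a minimal round trip with this variant might look like the following sketch (using a small array, so no summarization occurs):

```python
import numpy as np
from numpy import array  # lets eval() resolve the 'array(...)' calls in the repr

d = {'W': np.arange(10).reshape(2, 5)}

# Write the dict's repr to a human-readable text file.
with open('out.txt', 'w') as outfile:
    outfile.write(repr(d))

# Read it back and rebuild the dict with eval().
with open('out.txt') as f:
    loaded = eval(f.read())

print(np.array_equal(d['W'], loaded['W']))  # → True
```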

H/T: How can a string representation of a NumPy array be converted to a NumPy array?

Handling large arrays

By default, numpy summarizes arrays with more than 1000 elements. You can change this behavior by calling numpy.set_printoptions(threshold=S), where S is at least as large as the size of the largest array. For example:

import numpy as np 

W = np.arange(10).reshape(2,5)
b = np.arange(12).reshape(3,4)
d = {'W':W, 'b':b}

largest = max(a.size for a in d.values())  # size of the largest array
np.set_printoptions(threshold=largest)  # avoid summarizing

with open('out.txt', 'w') as outfile:
    outfile.write(repr(d))

np.set_printoptions(threshold=1000)  # restore the default (recommended, but not necessary)

H/T: Ellipses when converting list of numpy arrays to string in python 3
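If your NumPy version is 1.15 or newer, `np.printoptions` provides a context manager that restores the previous settings automatically on exit, so the manual reset at the end isn't needed. A minimal sketch, assuming the same dict as above:

```python
import numpy as np

W = np.arange(10).reshape(2, 5)
b = np.arange(12).reshape(3, 4)
d = {'W': W, 'b': b}

largest = max(a.size for a in d.values())  # size of the largest array

# The threshold change only applies inside the with-block;
# the previous print options are restored automatically on exit.
with np.printoptions(threshold=largest):
    with open('out.txt', 'w') as outfile:
        outfile.write(repr(d))
```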

Craig
  • How to load this back into another python script and retrieve the individual 2D numpy arrays? – Shraddheya Shendre Apr 14 '17 at 01:45
  • @Craig this could work, sure, and the solution with `eval` is neat. But this is exactly what you'll get with `json`, so why not just use that? – Aleksander Lidtke Apr 14 '17 at 01:51
  • I tried this, but this gave an error while loading the file. I have edited the question to show it. How to load the dictionary correctly? – Shraddheya Shendre Apr 14 '17 at 02:01
  • @ShraddheyaShendre I fixed the code to handle numpy arrays on loading. – Craig Apr 14 '17 at 02:02
  • @AleksanderLidtke I get `TypeError: Object of type 'ndarray' is not JSON serializable` when I try using JSON. – Craig Apr 14 '17 at 02:04
  • Another option, which may be even better is to use `from numpy import array` and then you can skip the `data.replace(...)` line entirely. – Craig Apr 14 '17 at 02:11
  • Sorry for withdrawing the 'accept' (didn't withdraw the upvote though) but I ran into a problem, check the latest edit. This method is truncating the array if size is large. – Shraddheya Shendre Apr 14 '17 at 03:25
  • @ShraddheyaShendre Are you using `readlines()` or `read()`? In my first post I had `readline()`, which was wrong because it terminates the input if there is a line-wrap. – Craig Apr 14 '17 at 03:29
  • No, the problem is not with reading. The `out.txt` file contains the truncated array. So, the problem is with writing. – Shraddheya Shendre Apr 14 '17 at 03:31