I am trying to merge many NumPy files into one big NumPy file. I tried to follow the question "Python append multiple files in given order to one big file", and this is what I did:

import matplotlib.pyplot as plt 
import numpy as np
import os, sys

#Read in list of files. You might want to look into os.listdir()

path= "/home/user/Desktop/ALLMyTraces.npy/test"
#Test folder contains all my numpy file traces
traces= os.listdir(path)

# Create new File
f = open("/home/user/Desktop/ALLMyTraces.npy", "w")

for j,trace in enumerate(traces):

    # Find the path of the file
    filepath = os.path.join(path, trace)

    # Load file
    dataArray= np.load(filepath)
    f.write(dataArray)

The file is created. To verify that it has the right contents, I used this code:

import numpy as np
dataArray= np.load(r'/home/user/Desktop/ALLMyTraces.npy')
print(dataArray)

This error is produced as a result:

 dataArray= np.load(r'/home/user/Desktop/ALLMyTraces.npy')
  File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 401, in load
    "Failed to interpret file %s as a pickle" % repr(file))
IOError: Failed to interpret file '/home/user/Desktop/ALLMyTraces.npy' as a pickle

I don't really know what the problem is. Any help would be appreciated.

nass9801

2 Answers

You should use numpy.save or numpy.savez to create .npy or .npz binary files. Only those files can be read by numpy.load(). The file you create with f.write(dataArray) is not in the .npy format (it has no header describing the shape and dtype), so np.load() fails with the error above.

Here is a sample

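import glob
import os

import numpy as np

fpath = "path to big file"
npyfilespath = "path to numpy files to be merged"

# List the input files before creating the output file, in a fixed order.
# Keep the big file outside this folder so it is not picked up as an input.
npyfiles = sorted(glob.glob(os.path.join(npyfilespath, "*.npy")))

with open(fpath, 'wb') as f_handle:
    for filepath in npyfiles:
        print(filepath)
        # Load file
        dataArray = np.load(filepath)
        print(dataArray)
        # Append this array to the big file as its own .npy record
        np.save(f_handle, dataArray)

# Note: this reads back only the first array that was saved
dataArray = np.load(fpath)
print(dataArray)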

I just found something interesting about numpy.load: it won't load all the appended arrays at once. Read this post for more info.

This means that if you want to read all the appended arrays, you need to call np.load once per array.

with open(fpath, 'rb') as f:
    dataArray = np.load(f)  # loads first array
    print(dataArray)
    dataArray = np.load(f)  # loads second array
    print(dataArray)
    dataArray = np.load(f)  # loads third array
    print(dataArray)
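
If the number of appended arrays isn't known in advance, the repeated np.load calls can be wrapped in a loop. This is only a sketch: load_all_arrays is a hypothetical helper name, and it assumes the file contains nothing but arrays written back to back with np.save.

```python
import os

import numpy as np


def load_all_arrays(path):
    """Return every array that was appended to `path` with np.save."""
    arrays = []
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # Each np.load call consumes one saved array and leaves the file
        # position at the start of the next one, so stop at end of file.
        while f.tell() < size:
            arrays.append(np.load(f))
    return arrays
```

Calling load_all_arrays(fpath) would then give a list with one array per input file, in the order they were saved.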
Shijo
  • It gives me this answer: ', mode '' at 0x7f3ed40e48a0> – nass9801 Feb 10 '17 at 14:18
  • I don't see any issue – Shijo Feb 10 '17 at 15:01
  • Thank you very much for your help, but in this case the file saves only the content of the last file. – nass9801 Feb 10 '17 at 15:25
  • What do you mean by that? It should be saving the data from all the npy files in that directory – Shijo Feb 10 '17 at 15:28
  • I don't understand these two lines: with open(fpath, 'wb') as f_handle: for npfile in glob.glob("*.npy"): By fpath, do you mean the path of the big file? – nass9801 Feb 10 '17 at 15:37
  • Dear @Shijo, after updating my code, the big file contains only the last trace. PS: I am sorry for the late answer, I was ill all weekend. – nass9801 Feb 13 '17 at 11:32
  • http://stackoverflow.com/questions/42204368/how-to-append-many-numpy-files-into-one-numpy-file-in-python, I had already put it in this question. I didn't think that you would answer me. Thank you very much, you are really very kind, thank you – nass9801 Feb 13 '17 at 13:59
  • Dear @Shijo, I had already put my code in this question: http://stackoverflow.com/questions/42204368/how-to-append-many-numpy-files-into-one-numpy-file-in-python. I didn't think that you would answer me. Thank you very much, you are really very kind, thank you – nass9801 Feb 13 '17 at 14:15

np.save writes a header block (with shape, dtype, and similar metadata) followed by a data block. np.load is its complement and reads that format.

dataArray= np.load(filepath)

dataArray is now an array in memory, just like the original array. It is not a direct image of the filepath contents.

f.write(dataArray)

I'm not even sure what this writes to the file; it certainly isn't save/load compatible. So the general Python file-appending link is sending you off in the wrong direction.
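
To make the difference concrete, here is a small sketch comparing what np.save produces with the array's bare data. It assumes that f.write ends up writing only the raw bytes of the array, which is approximated here with tobytes(); the array and variable names are made up for illustration.

```python
import io

import numpy as np

arr = np.arange(4, dtype=np.int32)

# What np.save produces: a header followed by the data
buf = io.BytesIO()
np.save(buf, arr)
saved = buf.getvalue()

# The bare data only, with no shape or dtype information
raw = arr.tobytes()

starts_with_magic = saved.startswith(b"\x93NUMPY")  # .npy magic string
header_size = len(saved) - len(raw)
print(starts_with_magic, len(raw), header_size)
```

Without that leading magic string and header, np.load cannot recognise the file as .npy data, so it falls back to trying to unpickle it, which is where the error message comes from.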

There are two straightforward ways of saving multiple arrays in one file:

  • concatenate the arrays into one larger array, and save that. This requires dimensional compatibility.

  • np.savez saves multiple arrays to a zip archive. That is, each array is saved to an npy file, and those are all collected into the archive. np.load is capable of loading such an npz archive (but read what it says about lazy loading).
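
Both options can be sketched as follows. The traces list is a hypothetical stand-in for the arrays loaded from the individual files, and the output file names are made up; adapt them to your own paths.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-ins for the traces loaded from the individual files
traces = [np.arange(5, dtype=float) + 10 * i for i in range(3)]

out_dir = tempfile.mkdtemp()

# Option 1: join compatible arrays into one larger array and save it
combined = np.concatenate(traces)  # 1-D result of length 15
npy_path = os.path.join(out_dir, "all_traces.npy")
np.save(npy_path, combined)
reloaded = np.load(npy_path)

# Option 2: keep the arrays separate inside one .npz archive
npz_path = os.path.join(out_dir, "all_traces.npz")
np.savez(npz_path, *traces)  # stored under the keys arr_0, arr_1, ...
with np.load(npz_path) as archive:
    names = sorted(archive.files)
    first = archive["arr_0"]  # arrays are read from the archive on access
```

With option 1 the boundaries between the original traces are lost unless you record their lengths somewhere; with option 2 each trace stays a separate array and can have its own shape.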

There are some variations on this. You could roll your own archive with outside tools. You could also create object-dtype arrays, which np.save/np.load can handle; it falls back to pickle for elements that can't be saved in the ordinary manner. In fact, you could use pickle directly to save multiple arrays.
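
A minimal sketch of the plain pickle route, with a made-up list of arrays and a made-up file name. Pickle files should only be loaded when they come from a source you trust.

```python
import os
import pickle
import tempfile

import numpy as np

# Hypothetical arrays; with pickle their shapes do not have to match
arrays = [np.arange(3), np.ones((2, 2))]

pkl_path = os.path.join(tempfile.mkdtemp(), "all_traces.pkl")

# Save the whole list in one go
with open(pkl_path, "wb") as f:
    pickle.dump(arrays, f, protocol=pickle.HIGHEST_PROTOCOL)

# Load it back as a list of arrays
with open(pkl_path, "rb") as f:
    restored = pickle.load(f)
```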

There have also been SO discussions about repeatedly using np.save to the same file. That's not hard to do, but doing the load is trickier. https://stackoverflow.com/a/35752728/901925

hpaulj