I have a 50,000 x 5,000 matrix of floats stored in a text file. When I use x = np.genfromtxt(readFrom, dtype=float) to load the file into memory, I get the following error message:
File "C:\Python27\lib\site-packages\numpy\lib\npyio.py", line 1583, in genfromtxt for (i, converter) in enumerate(converters)])
MemoryError
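For context, here is a quick back-of-the-envelope check of the final array's footprint (the shape is taken from the matrix size above; dtype=float means 64-bit values, which may already be close to what a 32-bit Python 2.7 process can address):

import numpy as np

# rough size of the parsed array alone, before any genfromtxt overhead
rows, cols = 50000, 5000
print(rows * cols * np.dtype(np.float64).itemsize / 1e9)  # ~2.0 GB as float64
print(rows * cols * np.dtype(np.float32).itemsize / 1e9)  # ~1.0 GB as float32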
I want to load the whole file into memory because I am calculating the Euclidean distance between each pair of row vectors using SciPy:

dis = scipy.spatial.distance.euclidean(x[row1], x[row2])
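The surrounding loop looks roughly like this (a sketch only, since the full loop is not shown above; any pair of rows may be requested, which is why I want the whole matrix resident):

from scipy.spatial import distance

# Sketch of the pair loop: only two rows are touched per call,
# but both rows have to come from the in-memory matrix x.
def all_pair_distances(x):
    n = x.shape[0]
    for row1 in xrange(n):                # xrange: Python 2.7
        for row2 in xrange(row1 + 1, n):
            yield row1, row2, distance.euclidean(x[row1], x[row2])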
Is there an efficient way to load such a huge matrix file into memory?
Thank you.
Update:
I managed to solve the problem. Here is my solution. I am not sure whether it is efficient or logically correct, but it works fine for me:
import numpy as np
import scipy.spatial.distance

x = open(readFrom, 'r').readlines()   # read all raw lines at once
# parse each whitespace-separated line into a float32 row
y = np.asarray([np.array(s.split()).astype('float32') for s in x], dtype=np.float32)
....
dis = scipy.spatial.distance.euclidean(y[row1], y[row2])
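One variant I am considering, to avoid readlines() holding every raw line in memory at the same time as the parsed rows, is to preallocate the float32 array and fill it line by line (an untested sketch; the 50,000 x 5,000 shape is hard-coded from the matrix size above):

import numpy as np

rows, cols = 50000, 5000
y = np.empty((rows, cols), dtype=np.float32)   # ~1 GB as float32

with open(readFrom, 'r') as f:                 # readFrom: path to the matrix file, as above
    for i, line in enumerate(f):               # stream one line at a time
        y[i] = np.fromstring(line, dtype=np.float32, sep=' ')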
Please help me to improve my solution.