I'm trying to read a numpy array from a big file of binary data. Each binary record contains 7330 floats, followed by a long I want to ignore, and then an int. I create a dtype as follows:
dt = [(str(n),'f4') for n in range(7330)]
dt += [('junk','i8'), ('label','i4')]
and then read the file via
d = np.fromfile(file_name,dtype=np.dtype(dt))
It works, but I get back a one-dimensional array of records instead of the 2-D array I want. More specifically, I get back an array with d.shape=(58134,), where d[0] is of type numpy.void and len(d[0])=7332 (7330 floats, the long I will ignore, and the int). I want an array of shape (58134,7332).
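For reference, here is a scaled-down sketch that reproduces the behaviour. It assumes 3 floats per record instead of 7330 and 4 records instead of 58134, and it builds the bytes in memory rather than reading my real file, so n_floats, n_records and raw are illustrative names only.

```python
import numpy as np

n_floats = 3    # scaled down from 7330 (assumption for illustration)
n_records = 4   # scaled down from 58134 (assumption for illustration)

# Same layout as above: n_floats float32 fields, an int64 to ignore, an int32 label.
dt = np.dtype([(str(n), 'f4') for n in range(n_floats)]
              + [('junk', 'i8'), ('label', 'i4')])

# Stand-in for the file contents: zero-filled records serialized to bytes.
raw = np.zeros(n_records, dtype=dt).tobytes()

# np.frombuffer interprets the bytes the same way np.fromfile would.
d = np.frombuffer(raw, dtype=dt)

print(d.shape)     # one-dimensional: one entry per record
print(type(d[0]))  # each entry is a structured scalar (numpy.void)
print(len(d[0]))   # number of fields per record
```

Each record comes back as a single element with n_floats + 2 named fields, which is why the array has only one axis.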
I can't d.reshape(-1,7332) because d is one dimensional, and I wind up converting it via the ugly and somewhat absurd
nparray = pd.DataFrame.from_records(d).to_numpy()
which seems just ridiculous. What am I doing wrong? Thanks!
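Two alternatives that might avoid the pandas round trip are sketched below. This is not a confirmed fix for my real file: it uses the same scaled-down layout (3 floats, 4 in-memory records), and the names n_floats, flat, dt2 and d2 are hypothetical. The first uses numpy.lib.recfunctions.structured_to_unstructured, which casts all fields to one common type. The second declares the floats as a single subarray field so that field is already 2-D.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

n_floats = 3  # scaled down from 7330 (assumption for illustration)

dt = np.dtype([(str(n), 'f4') for n in range(n_floats)]
              + [('junk', 'i8'), ('label', 'i4')])

# Small in-memory stand-in for the records read from the file.
d = np.zeros(4, dtype=dt)
d['0'] = 1.5
d['label'] = np.arange(4)

# Option 1: flatten every field into one 2-D array.
# The mixed f4/i8/i4 fields are cast to a common dtype (float64 here).
flat = rfn.structured_to_unstructured(d)
print(flat.shape, flat.dtype)

# Option 2: describe the floats as one subarray field with the same byte layout.
dt2 = np.dtype([('x', 'f4', (n_floats,)), ('junk', 'i8'), ('label', 'i4')])
d2 = np.frombuffer(d.tobytes(), dtype=dt2)  # with a file: np.fromfile(file_name, dtype=dt2)
print(d2['x'].shape)      # 2-D block of floats, one row per record
print(d2['label'].shape)  # labels as a separate 1-D array
```

Option 2 keeps the floats as float32 and leaves the label as an integer, at the cost of getting two arrays back instead of one (58134,7332) array.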