why is numpy returning a 1-d array here?

Question

I'm trying to read a numpy array from a big file of binary data. Each binary record contains 7330 floats, followed by a long I want to ignore, and then an int. I create a dtype as follows:

dt = [(str(n),'f4') for n in range(7330)]
dt += [('junk','i8'), ('label','i4')]

and then read the file via

d = np.fromfile(file_name,dtype=np.dtype(dt))

It works, but I get back a one-dimensional array of records instead of the 2-D array I want. More specifically, I get back an array with d.shape == (58134,), where d[0] is of type numpy.void and len(d[0]) == 7332 (7330 floats, the long I will ignore, and the int). I want an array of shape (58134, 7332).
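A small reproduction of that behavior may help. This is only a sketch: it uses 4 floats per record instead of 7330, and three zero-filled records built in memory with np.frombuffer stand in for the real file (n_floats and raw are made-up names for illustration).

```python
import numpy as np

n_floats = 4  # assumption: small stand-in for the 7330 floats per record

dt = np.dtype([(str(n), 'f4') for n in range(n_floats)]
              + [('junk', 'i8'), ('label', 'i4')])

# Three zero-filled records in memory, in place of np.fromfile(file_name, dtype=dt)
raw = bytes(3 * dt.itemsize)
d = np.frombuffer(raw, dtype=dt)

print(d.shape)     # (3,)  one element per record
print(type(d[0]))  # <class 'numpy.void'>
print(len(d[0]))   # 6     number of fields in one record
```

Each record is a single element of the array, so the result has one axis no matter how many fields the dtype has.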

I can't call d.reshape(-1, 7332) because d is one-dimensional and has only 58134 elements (one per record), so I wind up converting it via the ugly and somewhat absurd

nparray = pd.DataFrame.from_records(d).to_numpy()

which seems just ridiculous. What am I doing wrong? Thanks!

Please provide a sample of the 1D array you get and what it should look like instead so that we can better understand the problem — G. Anderson, Sep 03 '20 at 21:08
what is the difference between a 1-D structured array `(n, )` and a two-dimensional `(n, 1)` array of records except for the redundant axis? Seems like the sane thing to do, if you really want `(n, 1)` then just reshape it. — juanpa.arrivillaga, Sep 03 '20 at 21:11
`dtype` is the "type" of a single "element". Each "element" in your array has the information of 7330 floats, a long and an int, but it is still just an "element" of the resulting structured array. — darcamo, Sep 03 '20 at 21:24
@darcamo: That's exactly right, of course. I guess my question should have been "how do I make this a 2-D array" instead of "why is this a 1-D array". :) — Matt Ginsberg, Sep 03 '20 at 21:26
Maybe something like this question https://stackoverflow.com/questions/5957380/convert-structured-array-to-regular-numpy-array — darcamo, Sep 03 '20 at 21:31
`dt = [('data', 'f4', 7330), ('junk','i8'), ('label','i4')]` might also be useful. It will create 3 fields. `arr['data']` should then be the desired 2d array of floats. — hpaulj, Sep 03 '20 at 23:47
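The subarray dtype suggested in the last comment can be sketched as follows. This is a minimal illustration, not the asker's actual data: it assumes 4 floats per record instead of 7330, and it writes a throwaway file (the name records.bin is made up) so that np.fromfile has something to read.

```python
import os
import tempfile

import numpy as np

n_floats = 4  # assumption: small stand-in for the 7330 floats per record

# One subarray field holds all the floats of a record
dt = np.dtype([('data', 'f4', n_floats), ('junk', 'i8'), ('label', 'i4')])

# Build three fake records and write them to a temporary binary file
records = np.zeros(3, dtype=dt)
records['data'] = np.arange(3 * n_floats, dtype='f4').reshape(3, n_floats)
records['label'] = [7, 8, 9]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'records.bin')  # hypothetical file name
    records.tofile(path)
    d = np.fromfile(path, dtype=dt)

print(d.shape)          # (3,)   still one element per record
print(d['data'].shape)  # (3, 4) the floats as a regular 2-D array
print(d['label'])       # the int labels, one per record
```

With this layout the junk field is skipped simply by never indexing it, and the floats and labels keep their own dtypes.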

score 0 · Answer 1 · answered Sep 03 '20 at 21:48

Turns out that numpy.lib.recfunctions.structured_to_unstructured does exactly this. Thanks to darcamo for pointing me in that direction.
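A minimal sketch of that conversion, assuming 4 floats per record instead of 7330 and three hand-built records in place of the array returned by np.fromfile (the values are invented for illustration):

```python
import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

n_floats = 4  # assumption: small stand-in for the 7330 floats per record

dt = np.dtype([(str(n), 'f4') for n in range(n_floats)]
              + [('junk', 'i8'), ('label', 'i4')])

# Three fake records in place of np.fromfile(file_name, dtype=dt)
d = np.zeros(3, dtype=dt)
d['0'] = [1.5, 2.5, 3.5]
d['label'] = [7, 8, 9]

# All fields become columns; mixed field types are promoted to a common dtype
arr = structured_to_unstructured(d)
print(d.shape)    # (3,)
print(arr.shape)  # (3, 6)

# Optionally convert only the float fields, leaving junk and label out
float_names = [str(n) for n in range(n_floats)]
floats = structured_to_unstructured(d[float_names])
print(floats.shape)  # (3, 4)
```

Because a plain 2-D array has a single dtype, the int fields end up stored as floats in arr. If the labels need to stay integers, it may be simpler to keep d['label'] separately.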

Matt Ginsberg
