Cannot Flatten NumPy ndarray/ How to read binary file more intelligently

Question

I am writing a numpy based .PLY importer. I am only interested in binary files, and vertices, faces and vertex colors. My target data format is a flattened list of x,y,z floats for the vertex data and r,g,b,a integers for the color data.

 [x0,y0,z0,x1,y1,z1....xn,yn,zn] 
 [r0,g0,b0,a0,r1,g1,b1,a1....rn,gn,bn,an]

This allows me to use fast builtin C++ methods to construct the mesh in the target program (Blender).

I am using a modified version of this code to read the data into numpy arrays:

import numpy as np

valid_formats = {'binary_big_endian': '>', 'binary_little_endian': '<'}
ply = open(filename, 'rb')
ply.readline()  # skip the "ply" magic line that starts every PLY file
# get binary_little/big or ascii from the "format ..." line
fmt = ply.readline().split()[1].decode()
# get the byte-order prefix for building the numpy dtypes
ext = valid_formats[fmt]
# end_header = (byte offset just past the header, previously found)
ply.seek(end_header)
v_dtype = [('x', ext + 'f4'), ('y', ext + 'f4'), ('z', ext + 'f4'),
           ('red', 'u1'), ('green', 'u1'), ('blue', 'u1'), ('alpha', 'u1')]
# points_size = (previously read in from header)
points_np = np.fromfile(ply, dtype=v_dtype, count=points_size)

The results being

print(points_np.shape)
print(points_np[0:3])
print(points_np.ravel()[0:3])

>>>(158561,)
>>>[ (20.781816482543945, 11.767952919006348, 15.565438270568848, 206, 216, 186, 255)
     (20.679922103881836, 11.754084587097168, 15.560364723205566, 189, 196, 157, 255)
     (20.72969627380371, 11.823691368103027, 15.51106071472168, 192, 193, 157, 255)]
>>>[ (20.781816482543945, 11.767952919006348, 15.565438270568848, 206, 216, 186, 255)
     (20.679922103881836, 11.754084587097168, 15.560364723205566, 189, 196, 157, 255)
     (20.72969627380371, 11.823691368103027, 15.51106071472168, 192, 193, 157, 255)]

So ravel (I've also tried flatten, reshape, etc.) does not work, and I presume it is because each record mixes data types (float, float, float, int, int, int, int).
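To make the symptom concrete, here is a minimal sketch using a made-up three-point array with the same field layout as v_dtype. It suggests the cause is not the mixed types as such: a structured array of shape (n,) is already one-dimensional, and each element is a whole 16-byte record, so ravel has nothing left to flatten.

```python
import numpy as np

# Same field layout as v_dtype in the question; the values are invented.
v_dtype = [('x', '<f4'), ('y', '<f4'), ('z', '<f4'),
           ('red', '<u1'), ('green', '<u1'), ('blue', '<u1'), ('alpha', '<u1')]
demo = np.array([(1.0, 2.0, 3.0, 10, 20, 30, 255),
                 (4.0, 5.0, 6.0, 40, 50, 60, 255),
                 (7.0, 8.0, 9.0, 70, 80, 90, 255)], dtype=v_dtype)

print(demo.shape)           # (3,)
print(demo.ravel().shape)   # (3,)  ravel returns the same 1-D layout
print(demo.dtype.itemsize)  # 16    bytes per record: 3 * 4 + 4 * 1
print(demo['x'])            # a single field is an ordinary float32 array
```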

What I have tried

- I've tried vectorizing a function that just pulls out the xyz and rgb separately into a new array.
- I've tried stack, vstack, etc.
- List comprehension (yuck).
- Things like these take from one to tens of seconds to execute, compared to hundredths of a second to read in the data.
- I have tried using astype on the verts data, but that seems to return only the first element.

convert to structured array accessing first element of each element
Most efficient way to map function over numpy array

What I want to Try/Would Like to Know

Is there a better way to read the data in the first place so I don't lose all this time reshaping, flattening, etc.? Perhaps by telling np.fromfile to skip over the color data on one pass and then come back and read it again?

Is there a numpy trick I don't know for reshaping/flattening data of this kind?
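As far as I can tell, np.fromfile has no option to skip fields inside each record, so a two-pass read does not look possible. One alternative, sketched here under assumptions, is to read the vertex block once as raw bytes and split each 16-byte record by column. The three records and the names raw, xyz_flat and rgba_flat are made up for illustration, and the bytes come from tobytes() rather than a real file.

```python
import numpy as np

# Hypothetical stand-in for the vertex block of a binary PLY file:
# each record is 3 x float32 followed by 4 x uint8, i.e. 16 bytes.
v_dtype = np.dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4'),
                    ('red', 'u1'), ('green', 'u1'), ('blue', 'u1'), ('alpha', 'u1')])
records = np.array([(1.0, 2.0, 3.0, 10, 20, 30, 255),
                    (4.0, 5.0, 6.0, 40, 50, 60, 255),
                    (7.0, 8.0, 9.0, 70, 80, 90, 255)], dtype=v_dtype)
raw_bytes = records.tobytes()

# View the block as one row of bytes per record.
n = len(raw_bytes) // v_dtype.itemsize
raw = np.frombuffer(raw_bytes, dtype=np.uint8).reshape(n, v_dtype.itemsize)

# Bytes 0..11 of each row hold x, y, z. The copy makes them contiguous so
# they can be reinterpreted as little-endian float32 values.
xyz_flat = raw[:, :12].copy().view('<f4').ravel()

# Bytes 12..15 of each row hold r, g, b, a.
rgba_flat = raw[:, 12:].ravel()

print(xyz_flat)
print(rgba_flat)
```

With a real file, the same bytes could presumably be obtained after the seek with np.fromfile(ply, dtype=np.uint8, count=points_size * 16). I have not timed this against the structured read.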

With a shape of (158561,) your array is already "flat", that is, 1d. So it's a waste of your time to try to change that with ravel, reshape, etc. You haven't taken seriously the meaning of array `shape`. — hpaulj, May 13 '20 at 17:26
Given the file structure, using `fromfile` with that compound `dtype` is the only way. Now you have a 1d structured array. The next question is - what do you need to do with that. The `dtype` defines fields, which you can access individually or in subsets, `points_np['x']` or `points_np[['x','y']]`. — hpaulj, May 13 '20 at 17:27
"which you can access individually or in subsets" Thank you, this was the hole in my understanding of the data I was getting back from "fromfile" — patmo141, May 13 '20 at 17:55
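A sketch of the field access described in the comments, carried through to the two flat arrays the question asks for. It assumes NumPy 1.16 or later for numpy.lib.recfunctions.structured_to_unstructured, and the three-point points_np here is a made-up stand-in for the array returned by fromfile.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Stand-in for the structured array read by np.fromfile; values are invented.
v_dtype = [('x', '<f4'), ('y', '<f4'), ('z', '<f4'),
           ('red', '<u1'), ('green', '<u1'), ('blue', '<u1'), ('alpha', '<u1')]
points_np = np.array([(1.0, 2.0, 3.0, 10, 20, 30, 255),
                      (4.0, 5.0, 6.0, 40, 50, 60, 255),
                      (7.0, 8.0, 9.0, 70, 80, 90, 255)], dtype=v_dtype)

# Select a subset of fields, then convert it to a plain 2-D array.
xyz = rfn.structured_to_unstructured(points_np[['x', 'y', 'z']])       # shape (n, 3), float32
rgba = rfn.structured_to_unstructured(
    points_np[['red', 'green', 'blue', 'alpha']])                      # shape (n, 4), uint8

# Flatten to [x0, y0, z0, x1, ...] and [r0, g0, b0, a0, r1, ...].
xyz_flat = xyz.ravel()
rgba_flat = rgba.ravel()

print(xyz_flat)
print(rgba_flat)
```

Both conversions work on whole arrays, so they should avoid the per-element Python overhead of a vectorized function or a list comprehension.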
