I am writing a numpy based .PLY importer. I am only interested in binary files, and vertices, faces and vertex colors. My target data format is a flattened list of x,y,z floats for the vertex data and r,g,b,a integers for the color data.
[x0,y0,z0,x1,y1,z1....xn,yn,zn]
[r0,g0,b0,a0,r1,g1,b1,a1....rn,gn,bn,an]
This allows me to use fast built-in C++ methods to construct the mesh in the target program (Blender).
I am using a modified version of this code to read the data into NumPy arrays. Example:
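import numpy as np

valid_formats = {'binary_big_endian': '>', 'binary_little_endian': '<'}
ply = open(filename, 'rb')
# the first line is the "ply" magic string; the format line follows it
ply.readline()
# get binary_little/big or ascii
fmt = ply.readline().split()[1].decode()
# get the byte-order prefix for building the numpy dtypes
ext = valid_formats[fmt]
# end_header is the byte offset just past the header (found while parsing it)
ply.seek(end_header)
# v_dtype and points_size are also built from the header, e.g.
#v_dtype = [('x','<f4'),('y','<f4'), ('z','<f4'), ('red','<u1'), ('green','<u1'), ('blue','<u1'),('alpha','<u1')]
#points_size = (previously read in from header)
points_np = np.fromfile(ply, dtype=v_dtype, count=points_size)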
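The snippet above relies on `end_header` and `points_size`, which come from parsing the header. A minimal sketch of how those could be obtained is below. The helper name `read_ply_header` and the tiny in-memory file are hypothetical and only there for illustration; it assumes a standard PLY header with an `element vertex N` line and ignores everything else.

```python
import io
import struct

def read_ply_header(stream):
    """Scan header lines until 'end_header'.

    Returns (format_name, vertex_count, data_offset), where data_offset
    is the byte position of the first byte after the header.
    """
    fmt = None
    vertex_count = 0
    while True:
        line = stream.readline()
        if not line:
            raise ValueError("end_header not found")
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == b"format":
            fmt = tokens[1].decode()
        elif tokens[:2] == [b"element", b"vertex"]:
            vertex_count = int(tokens[2])
        elif tokens[0] == b"end_header":
            return fmt, vertex_count, stream.tell()

# Hypothetical two-vertex file, built in memory for the demo.
header = (b"ply\n"
          b"format binary_little_endian 1.0\n"
          b"element vertex 2\n"
          b"property float x\nproperty float y\nproperty float z\n"
          b"property uchar red\nproperty uchar green\n"
          b"property uchar blue\nproperty uchar alpha\n"
          b"end_header\n")
body = (struct.pack("<3f4B", 1.0, 2.0, 3.0, 10, 20, 30, 255) +
        struct.pack("<3f4B", 4.0, 5.0, 6.0, 40, 50, 60, 255))

fmt, points_size, end_header = read_ply_header(io.BytesIO(header + body))
```

With a real file, the same function can be called on the object returned by `open(filename, 'rb')`, after which `ply.seek(end_header)` positions the file at the vertex data.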
The results being
print(points_np.shape)
print(points_np[0:3])
print(points_np.ravel()[0:3])
>>>(158561,)
>>>[ (20.781816482543945, 11.767952919006348, 15.565438270568848, 206, 216, 186, 255)
(20.679922103881836, 11.754084587097168, 15.560364723205566, 189, 196, 157, 255)
(20.72969627380371, 11.823691368103027, 15.51106071472168, 192, 193, 157, 255)]
>>>[ (20.781816482543945, 11.767952919006348, 15.565438270568848, 206, 216, 186, 255)
(20.679922103881836, 11.754084587097168, 15.560364723205566, 189, 196, 157, 255)
(20.72969627380371, 11.823691368103027, 15.51106071472168, 192, 193, 157, 255)]
So ravel (I've also tried flatten, reshape, etc.) does not work, and I presume it is because each record mixes data types (float, float, float, int, int, int, int).
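To make the problem reproducible without a PLY file, here is a minimal sketch with two made-up vertices in the same dtype. It suggests that the array is already one-dimensional, with each element being a whole record, so `ravel` has nothing to flatten. The sample values are assumptions for illustration only.

```python
import numpy as np

v_dtype = np.dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4'),
                    ('red', 'u1'), ('green', 'u1'),
                    ('blue', 'u1'), ('alpha', 'u1')])

# Two made-up vertices standing in for the data read from the file.
points_np = np.array([(1.0, 2.0, 3.0, 10, 20, 30, 255),
                      (4.0, 5.0, 6.0, 40, 50, 60, 255)], dtype=v_dtype)

flat = points_np.ravel()
# The result is still a 1-D array of records: same shape, same dtype.
print(flat.shape, flat.dtype == v_dtype)

# Accessing a single field does give a plain float32 array.
xs = points_np['x']
print(xs.dtype, xs.shape)
```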
What I have tried
- I've tried doing things like vectorizing a function that just pulls out the xyz and rgb separately into a new array.
- I've tried stack, vstack, etc.
- List comprehension (yuck).
- Things like these take 1 to 10s of seconds to execute, compared to hundredths of a second to read in the data.
- I have tried using astype on the verts data, but that seems to return only the first element.
convert to structured array accessing first element of each element
Most efficient way to map function over numpy array
What I want to Try/Would Like to Know
Is there a better way to read the data in the first place so I don't lose all this time reshaping, flattening, etc.? Perhaps by telling np.fromfile to skip over the color data on one pass and then come back and read it again?
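One possible direction, sketched below under assumptions, is to read the vertex block once as raw bytes and then take strided views of it rather than making two passes over the file. This only works for the specific layout shown above: each record is 16 bytes (12 bytes of little-endian float32 xyz followed by 4 bytes of rgba), so the buffer can be viewed as 4 floats per record or as 16 bytes per record. The temporary file and sample values are hypothetical stand-ins for the real PLY data.

```python
import os
import tempfile
import numpy as np

v_dtype = np.dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4'),
                    ('red', 'u1'), ('green', 'u1'),
                    ('blue', 'u1'), ('alpha', 'u1')])
record_size = v_dtype.itemsize  # 16 bytes for this layout

# Write two made-up records to a temporary file (stand-in for the PLY body).
sample = np.array([(1.0, 2.0, 3.0, 10, 20, 30, 255),
                   (4.0, 5.0, 6.0, 40, 50, 60, 255)], dtype=v_dtype)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(sample.tobytes())
    path = tmp.name

points_size = len(sample)
with open(path, 'rb') as ply:
    # Single read of the whole vertex block as bytes.
    raw = np.fromfile(ply, dtype=np.uint8, count=points_size * record_size)
os.remove(path)

# View as 4 float32 per record and keep the first 3 (x, y, z).
# The 4th column is just the rgba bytes reinterpreted and is discarded.
verts = raw.view('<f4').reshape(points_size, 4)[:, :3].ravel()

# View as 16 bytes per record and keep the last 4 (r, g, b, a).
colors = raw.reshape(points_size, record_size)[:, 12:].ravel()
```

Here `verts` and `colors` are flat, contiguous arrays, since `ravel` has to copy when the slice is not contiguous. If the header declared a different property order or extra properties, the byte offsets would need to change accordingly.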
Is there a NumPy trick I don't know for reshaping/flattening data of this kind?
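One candidate I am aware of is `structured_to_unstructured` from `numpy.lib.recfunctions`, which as far as I know is available from NumPy 1.16 onward. The sketch below applies it to a group of same-typed fields and then flattens the result; the sample data is made up, and I have not measured how it performs on a large mesh.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

v_dtype = np.dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4'),
                    ('red', 'u1'), ('green', 'u1'),
                    ('blue', 'u1'), ('alpha', 'u1')])

# Two made-up vertices standing in for the array returned by np.fromfile.
points_np = np.array([(1.0, 2.0, 3.0, 10, 20, 30, 255),
                      (4.0, 5.0, 6.0, 40, 50, 60, 255)], dtype=v_dtype)

# Select the float fields, convert to a regular (n, 3) array, then flatten.
verts = rfn.structured_to_unstructured(points_np[['x', 'y', 'z']]).ravel()

# Same for the four color fields, giving a flat uint8 array.
colors = rfn.structured_to_unstructured(
    points_np[['red', 'green', 'blue', 'alpha']]).ravel()
```

Selecting only fields of the same type keeps the output dtype unchanged (float32 for positions, uint8 for colors) instead of promoting everything to a common type.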