Arrays of 3D and 2D as input of 3DCONV

Question

Sorry if this is a very basic question; I do not have much experience with neural networks. I have a voxel grid of size (10,10,10) that describes the space of a bin using the values 0, 1 and 2. In addition, I have 4 arrays of size (10,10): the height map of the bin, and the length, width and height of a box.

Following the suggestion of an article on the bin packing problem, I want to use a 3DConv network whose input is the voxel grid together with the height map array and the box dimension arrays.

The article does not specify the number of channels. I defined 14 channels: the voxel grid split along its height (10), the height map (1), and the box length (1), width (1) and height (1). To do this, I split the voxel grid into one slice per height level and reshaped the height map and box dimension arrays to match. This is the part of my Gymnasium environment code that does this:

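import numpy as np

def current_observation(self):
    # (L, W) maps holding the box length, width and height
    npl, npw, nph = self.matrix_dimension_box()
    length, width, height = self.bin_size

    # Give every 2D map the shape (1, L, W, 1) so it can be stacked as one channel
    npl = npl.reshape(1, length, width, 1)
    npw = npw.reshape(1, length, width, 1)
    nph = nph.reshape(1, length, width, 1)
    hm = self.state_layer.reshape(1, length, width, 1)

    # Split the voxel grid into `height` slices of shape (1, L, W, 1).
    # Note: reshape does not reorder axes, so this assumes the grid is stored height-first.
    vg = np.split(self.state_voxel_grid.reshape(height, length, width, 1), height, axis=0)

    # Stack the 10 + 4 channels and flatten them into a 1D observation vector
    return np.stack((*vg, hm, npl, npw, nph)).reshape(-1)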
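To make the shapes concrete, here is a minimal standalone sketch of the same packing with dummy data. The function name `pack_observation` and the arrays `voxel_grid`, `height_map`, `box_l`, `box_w` and `box_h` are placeholders made up for illustration, not the real environment state, and the sketch assumes the voxel grid is stored height-first, i.e. with shape (H, L, W).

```python
import numpy as np

def pack_observation(voxel_grid, height_map, box_l, box_w, box_h):
    """Stack H voxel slices plus four 2D maps, then flatten.

    Assumption: voxel_grid has shape (H, L, W) and every map has shape (L, W).
    """
    H, L, W = voxel_grid.shape
    # One (1, L, W, 1) slice per height level
    slices = np.split(voxel_grid.reshape(H, L, W, 1), H, axis=0)
    # Each 2D map becomes one extra (1, L, W, 1) channel
    maps = [m.reshape(1, L, W, 1) for m in (height_map, box_l, box_w, box_h)]
    stacked = np.stack((*slices, *maps))  # shape (H + 4, 1, L, W, 1)
    return stacked, stacked.reshape(-1)

# Dummy data standing in for the environment state (hypothetical values)
voxel_grid = np.random.randint(0, 3, size=(10, 10, 10))
height_map = np.zeros((10, 10))
box_l = np.full((10, 10), 3.0)
box_w = np.full((10, 10), 2.0)
box_h = np.full((10, 10), 4.0)

stacked, flat = pack_observation(voxel_grid, height_map, box_l, box_w, box_h)
print(stacked.shape)  # (14, 1, 10, 10, 1)
print(flat.shape)     # (1400,)
```

So each observation is a flat vector of 14 * 10 * 10 = 1400 values, ordered channel by channel.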

Then, in my neural network, I reshape the flat array back to the following shape and feed it to the 3DConv layers:

x = x.reshape((-1,channel, container_size[0], container_size[1],1))
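As a sanity check on this reshape, the sketch below builds a hypothetical flat batch in which channel c is filled with the value c, applies the same reshape, and confirms that every channel lands where expected. The batch size of 2 and the dummy values are assumptions for illustration only. The final comment assumes a channels-first convention (N, C, D, H, W), which is what PyTorch's `Conv3d` expects; the post does not state which framework is used.

```python
import numpy as np

channel = 14               # 10 voxel slices + 4 feature maps
container_size = (10, 10)  # (length, width)

# Hypothetical flat observation: channel c is filled with the value c,
# so each channel can be identified after the reshape.
per_channel = container_size[0] * container_size[1]
one_obs = np.repeat(np.arange(channel, dtype=np.float32), per_channel)
batch = np.stack([one_obs, one_obs])  # shape (2, 1400)

x = batch.reshape((-1, channel, container_size[0], container_size[1], 1))
print(x.shape)  # (2, 14, 10, 10, 1)

# Every channel should be constant and equal to its own index
for c in range(channel):
    assert np.all(x[:, c] == c)

# Under a channels-first reading (N, C, D, H, W), the three axes the
# convolution slides over are (10, 10, 1): the bin height lives in the
# channel axis, and the last spatial axis has size 1.
spatial_shape = x.shape[2:]
print(spatial_shape)  # (10, 10, 1)
```

With this layout the voxel height levels are treated as separate channels and not as a spatial axis, which is the design choice the question is asking about.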

In total I am using 14 channels. Could you please tell me whether what I did is correct, or how I should reshape and concatenate these arrays for the 3DConv input?
