
Say I have extracted a feature that is 10x10 data (maybe an image or a cepstrogram). Usually I would feed this into a 2D convolution and be on my way.

My question is: if I had to convert this into a 1D input of 100 values, what disadvantages would I get, besides the obvious one that my filter would no longer see all the surrounding neighbors but only the previous and next values, which might lead to worse performance?

And if I had to do this anyway, would I just reshape the data beforehand, use a Reshape layer, or use a Permute layer?
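To make the neighbor problem concrete, here is a minimal pure-Python sketch (the 10x10 index grid and the helper names are illustrative assumptions, not part of any framework). It flattens the grid in row-major order, which is what a plain reshape does, and also flattens a transposed copy, which is roughly what a permute followed by a reshape would give. It then measures how far apart 2D neighbors end up in the 1D sequence.

```python
H, W = 10, 10

# Each cell holds its own row-major index so positions are easy to track.
grid = [[r * W + c for c in range(W)] for r in range(H)]

def flatten_row_major(g):
    """Concatenate rows, like a plain reshape to 1D."""
    return [v for row in g for v in row]

def transpose(g):
    """Swap the two axes, like a permute of (H, W) to (W, H)."""
    return [list(col) for col in zip(*g)]

def flat_distance(seq, a, b):
    """Distance between two values inside the flattened sequence."""
    return abs(seq.index(a) - seq.index(b))

flat = flatten_row_major(grid)                       # reshape only
flat_permuted = flatten_row_major(transpose(grid))   # permute, then reshape

center, right, below = grid[4][5], grid[4][6], grid[5][5]

# Reshape only: horizontal neighbors stay adjacent, vertical ones are W apart.
print(flat_distance(flat, center, right))            # 1
print(flat_distance(flat, center, below))            # 10

# Permute first: the roles of the two axes are swapped.
print(flat_distance(flat_permuted, center, right))   # 10
print(flat_distance(flat_permuted, center, below))   # 1

# Row wrap-around: the end of row 0 and the start of row 1 become adjacent
# in 1D although they are far apart in 2D.
print(flat_distance(flat, grid[0][9], grid[1][0]))   # 1
```

So a short 1D kernel sees only one axis of the original neighborhood, and which axis that is depends on whether the data is permuted before it is flattened.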

Thanks

Evren Bingøl
  • The spatial information is lost in the 2D-to-1D transformation. If you don't need it, then it is fine to do so (you may want to look into the receptive field). The question is where you are using this: is it a specific step in your neural network, or are you planning a completely 1D network? – Nithin Varghese Mar 26 '21 at 04:23

2 Answers


For the advantages and disadvantages of 2D vs. 1D CNNs, you may refer to this detailed thread.

In TensorFlow, these are the steps to build a CNN architecture:

  1. Reshape the input if necessary using tf.reshape() to match the convolutional layer you intend to build (for example, if using a 2D convolution, reshape each sample into a three-dimensional [height, width, channels] format)

  2. Create a convolutional layer using tf.nn.conv1d(), tf.nn.conv2d(), or tf.nn.conv3d(), depending on the dimensionality of the input.

  3. Create a pooling layer using tf.nn.max_pool()

  4. Repeat steps 2 and 3 for additional convolution and pooling layers

  5. Reshape output of convolution and pooling layers, flattening it to prepare for the fully connected layer

  6. Create a fully connected layer using tf.matmul() function, add an activation using, for example, tf.nn.relu() and apply a dropout using tf.nn.dropout()

  7. Create a final layer for class prediction, again using tf.matmul()

  8. Store weights and biases using TensorFlow variables

These are just the basic steps to create the CNN model. There are additional steps to define training and evaluation, execute the model, and tune it.
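The steps above can be sketched roughly as follows. This is a minimal illustration assuming TensorFlow 2.x in eager mode and the 10x10 input from the question; the layer sizes, the `num_classes` value, and the `forward` helper are hypothetical choices, not something prescribed by the answer.

```python
import tensorflow as tf

num_classes = 10  # hypothetical number of output classes

# Step 8: weights and biases live in TensorFlow variables.
weights = {
    "conv": tf.Variable(tf.random.normal([3, 3, 1, 8], stddev=0.1)),
    "fc": tf.Variable(tf.random.normal([5 * 5 * 8, 32], stddev=0.1)),
    "out": tf.Variable(tf.random.normal([32, num_classes], stddev=0.1)),
}
biases = {
    "conv": tf.Variable(tf.zeros([8])),
    "fc": tf.Variable(tf.zeros([32])),
    "out": tf.Variable(tf.zeros([num_classes])),
}

def forward(x_flat, training=False):
    # Step 1: reshape flat vectors to [batch, height, width, channels].
    x = tf.reshape(x_flat, [-1, 10, 10, 1])
    # Step 2: 2D convolution plus bias and activation.
    x = tf.nn.conv2d(x, weights["conv"], strides=1, padding="SAME")
    x = tf.nn.relu(tf.nn.bias_add(x, biases["conv"]))
    # Step 3: 2x2 max pooling halves height and width (10 -> 5).
    x = tf.nn.max_pool2d(x, ksize=2, strides=2, padding="SAME")
    # Step 5: flatten for the fully connected layer.
    x = tf.reshape(x, [-1, 5 * 5 * 8])
    # Step 6: fully connected layer, activation, dropout during training only.
    x = tf.nn.relu(tf.matmul(x, weights["fc"]) + biases["fc"])
    if training:
        x = tf.nn.dropout(x, rate=0.5)
    # Step 7: final layer producing class scores.
    return tf.matmul(x, weights["out"]) + biases["out"]

logits = forward(tf.zeros([4, 100]))
print(logits.shape)  # (4, 10)
```

Step 4 (repeating convolution and pooling) is left out to keep the sketch short; adding it would mean another filter variable and adjusting the flattened size accordingly.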

In step 2, you create a 2D convolutional layer using tf.nn.conv2d(); this function computes a 2-D convolution given 4-D input and filter tensors.

So if you have a 1D vector, as in the MNIST dataset examples with 784 features, you can convert it to the 4-D input required by conv2d() using tf.reshape(). The reshape first matches the picture format [Height x Width x Channel]; with the batch dimension, the input tensor becomes 4-D: [Batch Size, Height, Width, Channel]:

x = tf.reshape(x, shape=[-1, 28, 28, 1])

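where x is the placeholder for the flattened input vectors (tf.placeholder is the TensorFlow 1.x API; in TensorFlow 2.x it is available as tf.compat.v1.placeholder):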

x = tf.placeholder(tf.float32, [None, num_input])
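For the 10x10 feature in the question, the same reshape idea applies to both layouts. The sketch below assumes TensorFlow 2.x in eager mode; the batch of random features and the kernel sizes are made up for illustration. It only compares the tensor shapes each convolution expects and produces, not model quality.

```python
import tensorflow as tf

# Hypothetical batch of four 10x10 feature maps.
features = tf.random.normal([4, 10, 10])

# 2D layout: [batch, height, width, channels] for tf.nn.conv2d.
x2d = tf.reshape(features, [-1, 10, 10, 1])
# 1D layout: [batch, width, channels] for tf.nn.conv1d.
x1d = tf.reshape(features, [-1, 100, 1])

# Filters: [kh, kw, in, out] for 2D and [k, in, out] for 1D.
k2d = tf.random.normal([3, 3, 1, 8])
k1d = tf.random.normal([9, 1, 8])

y2d = tf.nn.conv2d(x2d, k2d, strides=1, padding="SAME")
y1d = tf.nn.conv1d(x1d, k1d, stride=1, padding="SAME")

print(y2d.shape)  # (4, 10, 10, 8)
print(y1d.shape)  # (4, 100, 8)
```

Both kernels have nine weights per output channel, but the 3x3 kernel covers a square neighborhood while the width-9 kernel covers nine consecutive values of a single flattened row.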

You may refer to the official TensorFlow documentation.

Rommel_Intel
  • Thanks for replying to this. My issue is that I am starting to use GNA, which supports 1D CNNs, while most common sound features are 2D spectrograms or cepstrograms. When we use GNA we need to convert them to 1D, as 2D is experimental. First of all, should I stick to 2D, test it, and maybe use a Permute layer as specified in the docs, since GNA's input is different from OpenVINO's? Or should I convert my input from 2D to 1D and cut my losses, given that 1D inference scores worse for me on the same test on CPU? – Evren Bingøl Mar 26 '21 at 14:21

Yes, you are correct regarding GNA: Intel GNA hardware natively supports only 1D convolutions, and 2D convolution support is experimental.

This article (GNA Plugin - OpenVINO™ Toolkit) specifies the steps to add Permute layers before or after convolutions.

You could try both methods and see which one works for you.

Generally, a 1D convolution in TensorFlow is implemented as a 2D convolution wrapped in reshape layers, which add an H dimension before the 2D convolution and remove it afterwards.

At the same time, the Model Optimizer (MO) inserts permutes before and after the reshape layers, since they change the interpretation of the data.
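The reshape wrapping described above can be illustrated with a small sketch, assuming TensorFlow 2.x in eager mode and the default channels-last layout. The input and filter sizes are arbitrary. It shows only the TensorFlow side; the permutes that MO inserts during conversion are not reproduced here.

```python
import tensorflow as tf

x = tf.random.normal([2, 100, 1])   # [batch, width, channels]
k = tf.random.normal([5, 1, 4])     # [kernel_width, in_channels, out_channels]

# Direct 1D convolution.
y_1d = tf.nn.conv1d(x, k, stride=1, padding="SAME")

# Same computation written as a 2D convolution:
x_4d = tf.expand_dims(x, axis=1)    # add H = 1 -> [batch, 1, width, channels]
k_4d = tf.expand_dims(k, axis=0)    # add kernel height 1 -> [1, kw, in, out]
y_4d = tf.nn.conv2d(x_4d, k_4d, strides=1, padding="SAME")
y_wrapped = tf.squeeze(y_4d, axis=1)  # remove H again -> [batch, width, out]

max_diff = float(tf.reduce_max(tf.abs(y_1d - y_wrapped)))
print(y_wrapped.shape)  # (2, 100, 4)
print(max_diff < 1e-5)
```

Because the added H dimension has size 1 and the kernel height is 1, the 2D convolution never mixes values across rows, so the wrapped result matches the direct 1D convolution up to floating-point tolerance.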

Rommel_Intel