
My input is an array of 64 integers.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Conv1D

model = Sequential()
model.add(Input(shape=(68,), name="input"))
model.add(Conv1D(64, 2, activation="relu", padding="same", name="convLayer"))

I have 10,000 of these arrays in my training set. Am I supposed to be specifying this somewhere in order for Conv1D to work?

I am getting the dreaded

ValueError: Input 0 of layer convLayer is incompatible with the layer: : expected min_ndim=3, found ndim=2. Full shape received: [None, 68]

error and I really don't understand what I need to do.

– Tony Ennis

2 Answers


Don't let the name confuse you. The tf.keras.layers.Conv1D layer expects input of shape (time_steps, features) per sample. If your dataset consists of 10,000 samples, each with 64 values, then your data has the shape (10000, 64), which cannot be fed directly to a tf.keras.layers.Conv1D layer: you are missing the time_steps dimension. One thing you can do is use tf.keras.layers.RepeatVector, which repeats your input array n times (5 in the example below). This way your Conv1D layer receives an input of shape (5, 64). Check out the documentation for more information:


import tensorflow as tf

time_steps = 5
model = tf.keras.Sequential()
model.add(tf.keras.layers.Input(shape=(64,), name="input"))
model.add(tf.keras.layers.RepeatVector(time_steps))
model.add(tf.keras.layers.Conv1D(64, 2, activation="relu", padding="same", name="convLayer"))
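
To see the shapes concretely, here is a quick check with random data standing in for the 10,000 training arrays (the dummy values are only placeholders):

import numpy as np
import tensorflow as tf

time_steps = 5
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64,), name="input"),
    tf.keras.layers.RepeatVector(time_steps),
    tf.keras.layers.Conv1D(64, 2, activation="relu", padding="same", name="convLayer"),
])

# Dummy stand-in for the (10000, 64) training arrays.
x = np.random.rand(10000, 64).astype("float32")
print(model(x[:2]).shape)  # (2, 5, 64): batch, time_steps, filters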

As a side note, you should ask yourself if using a tf.keras.layers.Conv1D layer is the right option for your use case. This layer is usually used for NLP and other time series tasks. For example, in sentence classification, each word in a sentence is usually mapped to a high-dimensional word vector representation, as seen in the image. This results in data with the shape (time_steps, features).

(image: each word of a sentence mapped to a word-vector representation, giving a matrix of shape (time_steps, features))

If you want to use character-level one-hot encoded embeddings, it would look something like this:

(image: a character-level one-hot encoded sample of shape (10, 10))

This is a simple example of a single sample with the shape (10, 10) --> 10 characters along the time-series dimension and 10 features. It should help you understand the tutorial I mention in the comments a bit better.
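
As a rough sketch of that character-level one-hot encoding (the vocabulary below is just an assumption; use whatever characters you actually expect):

import numpy as np

# Assumed character vocabulary -- adjust it to the characters you actually expect.
vocab = "abcdefghijklmnopqrstuvwxyz0123456789@:/._- "
char_to_idx = {c: i for i, c in enumerate(vocab)}

def one_hot_encode(text, max_len=64):
    """Encode a string as a (max_len, len(vocab)) matrix: characters = time steps, one-hot = features."""
    out = np.zeros((max_len, len(vocab)), dtype="float32")
    for t, ch in enumerate(text[:max_len]):
        if ch in char_to_idx:
            out[t, char_to_idx[ch]] = 1.0
    return out

x = one_hot_encode("user@example.com")
print(x.shape)  # (64, 43) -> time_steps x features, directly usable by a Conv1D layer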

– AloneTogether
  • Thank you so much. I am struggling to understand the documentation. I'm a beginner and am experimenting. My particular project is to read a 64-character string and determine if it is an email address, URL, integer, float, word, or noise. Conv1D is good for this, right? In my experiment, I am hoping to discover important markers such as "@", "://", ".", etc. – Tony Ennis Oct 16 '21 at 14:47
  • One thing I cannot wrap my head around is why I should specify my number of training samples. I solved the problem above already using a big ol' stack of Dense layers and it didn't care a bit about the samples. Is the kernel moving down a single data vector (a single email address) or is it moving down the first character of each sample? – Tony Ennis Oct 16 '21 at 14:50
  • To your first question: Yes, it is definitely possible if you split each email address into characters (the time-series dimension) and use, for example, a one-hot encoding (the feature dimension). Please refer to this [tutorial](https://towardsdatascience.com/character-level-cnn-with-keras-50391c3adf33) – AloneTogether Oct 16 '21 at 15:04
  • To your second question: where exactly are you specifying the number of your training samples? – AloneTogether Oct 16 '21 at 15:07
  • Ok, I malfunctioned. Previous thing I tried seemed to want that info; not now. I misunderstood the "repeatVector" thing, evidently thinking that this was for the data set size I had previously specified. Only now do I see the 64, lol. I'll have to read up on that. – Tony Ennis Oct 16 '21 at 15:25
  • Thank you, my "syntax" errors are corrected, now I can begin understanding what this is doing. – Tony Ennis Oct 16 '21 at 15:33
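
Regarding the kernel question in the comments above: the kernel slides along the time dimension within each sample, never across the samples of a batch. A small check with random data makes that visible:

import numpy as np
import tensorflow as tf

conv = tf.keras.layers.Conv1D(1, 2, padding="same")

batch = np.random.rand(2, 5, 3).astype("float32")  # 2 samples, 5 time steps, 3 features
full = conv(batch).numpy()                          # convolve the whole batch at once
one_by_one = np.concatenate([conv(batch[i:i + 1]).numpy() for i in range(2)], axis=0)

# Identical results: each sample is convolved on its own.
print(np.allclose(full, one_by_one))  # True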

The Conv1D layer performs temporal convolution, that is, convolution along the first dimension (not the batch dimension, of course), so you could use something like this:

import tensorflow as tf

time_steps = 5
model = tf.keras.Sequential()
model.add(tf.keras.layers.Input(shape=(time_steps, 64), name="input"))
model.add(tf.keras.layers.Conv1D(64, 2, activation="relu", padding="same", name="convLayer"))

You will need to slice your data into windows of time_steps consecutive arrays to feed the network; one way to do that is sketched below.
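
Assuming the 64-value arrays really are consecutive steps of a longer sequence (random data stands in for the real arrays here):

import numpy as np

time_steps = 5

# Random stand-in for the (10000, 64) training data, assumed to be ordered in time.
data = np.random.rand(10000, 64).astype("float32")

# Group consecutive arrays into non-overlapping windows of `time_steps`,
# dropping any remainder that doesn't fill a whole window.
n_windows = data.shape[0] // time_steps
windows = data[: n_windows * time_steps].reshape(n_windows, time_steps, 64)
print(windows.shape)  # (2000, 5, 64) -> matches Input(shape=(time_steps, 64))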

However, if your arrays don't have a temporal structure, then Conv1D is not the layer you are looking for.

– elbe