I am trying to build a model to classify signals based on whether or not that signal has a specific pattern in it.

My signals are just arrays of 10000 floats each. I have 500 arrays that contain the pattern, and 500 that don't.

This is how I split my dataset:

X_train => Array of signals | shape => (800, 10000)

Y_train => Array of 1s and 0s | shape => (800,)

X_test => Array of signals | shape => (200, 10000)

Y_test => Array of 1s and 0s | shape => (200,)

(X holds the input signals, Y the corresponding labels; the train arrays are used for training and the test arrays for validation.)

The pattern is just a quick increase in values followed by a quick decrease, like so (highlighted in red): [image: signal containing the pattern]

Here is a signal without the pattern for reference: [image: signal without the pattern]

I'm having a lot of trouble building a model since I'm used to classifying images (so 2D or 3D) but not just a series of points.

I've tried a simple sequential model like so:

# create model
model = Sequential()
model.add(Dense(10000, input_dim=10000, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

But it's failing completely. I'd love to implement a CNN but I have no idea how to do so. When re-using some CNNs I used for image classification, I'm getting a lot of errors concerning the input, which I think is because it's a 1D signal instead of a 2D or 3D image.

This is what happens with a CNN I used for image classification in the past:

model_random = tf.keras.models.Sequential()

model_random = tf.keras.Sequential([
  tf.keras.layers.Conv2D(32, [3,3], activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(16, [3,3], activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(8, [3,3], activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(1)
])

model_random.compile(optimizer = 'adam',
              loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model_random.fit(X_train,Y_train, epochs=30)

    ValueError: Exception encountered when calling layer "sequential_8" (type Sequential).
    
    Input 0 of layer "conv2d_6" is incompatible with the layer: expected min_ndim=4, found ndim=2. Full shape received: (32, 10000)
    
    Call arguments received:
      • inputs=tf.Tensor(shape=(32, 10000), dtype=float32)
      • training=True
      • mask=None
desertnaut
Shiverz
  • Please, implement the input and show the error log. – Alexey Tochin Feb 04 '22 at 17:44
  • @AlexeyTochin I didn't add it since I knew trying to feed 1d arrays in a model used for image classification was wrong but here's what I got from trying : (added in post) – Shiverz Feb 04 '22 at 17:49

1 Answer

The layer tf.keras.layers.Conv1D expects input of shape (batch_size, time_steps, features), i.e. (time_steps, features) per sample, so you have to decide what your timesteps and features are. Here is a starting point / dummy model where I assume that each sample has 10000 timesteps and each timestep has one float feature:

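import tensorflow as tf

# Dummy data: 800 random signals with a trailing feature axis -> (800, 10000, 1)
X_train = tf.expand_dims(tf.random.normal((800, 10000)), axis=-1)
# Dummy binary labels (0 or 1), one per signal
Y_train = tf.random.uniform((800, 1), maxval=2, dtype=tf.int32)

# Per-sample input shape: 10000 timesteps, 1 feature per timestep
inputs = tf.keras.layers.Input((10000, 1))
x = tf.keras.layers.Conv1D(64, kernel_size=3, activation='relu')(inputs)
x = tf.keras.layers.Conv1D(32, kernel_size=3, activation='relu')(x)
# Keep the strongest response of each filter over the whole time axis
x = tf.keras.layers.GlobalMaxPool1D()(x)
x = tf.keras.layers.Dense(64, activation='relu')(x)
# Single sigmoid unit for binary classification
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(inputs, outputs)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])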
model.fit(X_train, Y_train, batch_size=8, epochs=5)
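
With real data instead of random tensors, the only change needed should be adding the trailing feature axis. A minimal sketch, assuming the signals are already in a NumPy array of shape (800, 10000) as described in the question (the random arrays below are just hypothetical stand-ins for the real data):

```python
import numpy as np

# Hypothetical stand-in for the real arrays from the question.
rng = np.random.default_rng(0)
X_train = rng.standard_normal(size=(800, 10000), dtype=np.float32)
Y_train = rng.integers(0, 2, size=(800,))

# Conv1D wants (batch, time_steps, features): add one feature per timestep.
X_train_3d = X_train[..., np.newaxis]

print(X_train_3d.shape)  # (800, 10000, 1)
print(Y_train.shape)     # (800,)
```

X_train_3d and Y_train could then be passed to model.fit in place of the dummy tensors; the same reshape would be needed for X_test.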

Also, check out this post. Unlike Conv2D layers, which slide a kernel along both the height and the width of the input, a Conv1D layer slides its kernel along a single axis (here, time): each filter covers kernel_size consecutive timesteps and spans all features at once. You should also consider downsampling your signals, because 10000 timesteps is a lot for one sample.
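
To make the sliding-window idea concrete, here is a rough NumPy sketch of what a single Conv1D filter with kernel_size=3, no padding, no bias and a ReLU activation computes on a one-feature signal. The signal and the kernel weights are made up for illustration; in a trained model the weights are learned.

```python
import numpy as np

# Toy signal with a short spike in the middle (hypothetical values).
signal = np.array([0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0])
# Hand-picked kernel that responds to a sharp peak (hypothetical weights).
kernel = np.array([-1.0, 2.0, -1.0])

# Every window of 3 consecutive timesteps: shape (5, 3).
windows = np.lib.stride_tricks.sliding_window_view(signal, kernel.size)

# One dot product per window, then ReLU.
pre_activation = windows @ kernel
response = np.maximum(pre_activation, 0.0)

print(response)        # [0. 0. 4. 0. 0.]
print(response.max())  # 4.0, what GlobalMaxPool1D would keep for this filter
```

The filter fires only where the window is centred on the spike, and the global max keeps that single strongest response regardless of where in the signal it occurred.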

AloneTogether
  • Thanks for your help. This compiles and trains without error, but I'm still getting poor accuracy results. Maybe because of the input size of 10000, as you suggested. Any tips on where to start looking for downsampling my signals? – Shiverz Feb 07 '22 at 08:47
  • 1
    Maybe here is a good starting point: https://stackoverflow.com/questions/34231244/downsampling-a-2d-numpy-array-in-python/34232507. Generally, google downsampling arrays – AloneTogether Feb 07 '22 at 08:51
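
One simple way to downsample along the time axis is block averaging: split each signal into consecutive blocks of `factor` samples and keep the mean of each block. This is only a sketch; `downsample_mean` is a hypothetical helper name, not a library function, and the factor has to be chosen with the width of the pattern in mind.

```python
import numpy as np

def downsample_mean(signals, factor):
    """Average consecutive blocks of `factor` samples along the time axis.

    signals: array of shape (n_signals, n_steps).
    Trailing samples that do not fill a whole block are dropped.
    """
    n_signals, n_steps = signals.shape
    usable = (n_steps // factor) * factor
    blocks = signals[:, :usable].reshape(n_signals, usable // factor, factor)
    return blocks.mean(axis=2)

# Tiny example: 2 signals of 10 samples, reduced by a factor of 5.
signals = np.arange(20, dtype=np.float32).reshape(2, 10)
small = downsample_mean(signals, factor=5)
print(small)  # [[ 2.  7.] [12. 17.]]
```

Applied to the data in the question, a factor of 10 would turn (800, 10000) into (800, 1000). Averaging can flatten a very short spike, so if the pattern is only a few samples wide, taking the block maximum instead of the mean, or using a smaller factor, may preserve it better.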