TensorFlow model for binary classification of 1D arrays

Question

I am trying to build a model to classify signals based on whether or not that signal has a specific pattern in it.

My signals are just arrays of 10000 floats each. I have 500 arrays that contain the pattern, and 500 that don't.

This is how I split my dataset :

X_train => Array of signals | shape => (800, 10000)

Y_train => Array of 1s and 0s | shape => (800,)

X_test => Array of signals | shape => (200, 10000)

Y_test => Array of 1s and 0s | shape => (200,)

(The X arrays hold the signals and the Y arrays hold the labels; the train split is for training, the test split for validating.)

The pattern is just a quick increase in values followed by a quick decrease, like so (highlighted in red):

[image: signal containing the pattern]

Here is a signal without the pattern for reference:

[image: signal without the pattern]

I'm having a lot of trouble building a model since I'm used to classifying images (so 2D or 3D) but not just a series of points.

I've tried a simple sequential model like so:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# create model: fully connected layers applied directly to the 10000 raw samples
model = Sequential()
model.add(Dense(10000, input_dim=10000, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

But it's failing completely. I'd love to implement a CNN, but I have no idea how to do so. When I reuse CNNs that I wrote for image classification, I get a lot of errors about the input, which I think is because it's a 1D signal instead of a 2D or 3D image.

This is what happens with a CNN I used for image classification in the past:

import tensorflow as tf

model_random = tf.keras.Sequential([
  tf.keras.layers.Conv2D(32, [3,3], activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(16, [3,3], activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(8, [3,3], activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(1)
])

model_random.compile(optimizer = 'adam',
              loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model_random.fit(X_train,Y_train, epochs=30)

    ValueError: Exception encountered when calling layer "sequential_8" (type Sequential).
    
    Input 0 of layer "conv2d_6" is incompatible with the layer: expected min_ndim=4, found ndim=2. Full shape received: (32, 10000)
    
    Call arguments received:
      • inputs=tf.Tensor(shape=(32, 10000), dtype=float32)
      • training=True
      • mask=None

@AlexeyTochin I didn't add it since I knew trying to feed 1d arrays in a model used for image classification was wrong but here's what I got from trying : (added in post) — Shiverz, Feb 04 '22 at 17:49

AloneTogether · Accepted Answer · 2022-02-04T18:12:40.553

The layer tf.keras.layers.Conv1D expects each sample to have the shape (time_steps, features), so a full batch has the shape (batch_size, time_steps, features). You therefore have to decide what your timesteps and features are. Here is a starting point / dummy model where I assume that each sample has 10000 timesteps and each timestep has one float feature:

import tensorflow as tf

X_train = tf.expand_dims(tf.random.normal((800, 10000)), axis=-1)
Y_train = tf.random.uniform((800, 1), maxval=2, dtype=tf.int32)

inputs = tf.keras.layers.Input((10000, 1))
x = tf.keras.layers.Conv1D(64, kernel_size=3, activation='relu')(inputs)
x = tf.keras.layers.Conv1D(32, kernel_size=3, activation='relu')(x)
x = tf.keras.layers.GlobalMaxPool1D()(x)
x = tf.keras.layers.Dense(64, activation='relu')(x)
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(inputs, outputs)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=8, epochs=5)
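
The dummy data above is already created with a trailing feature axis. As a minimal sketch of how the arrays from the question could be brought into the same layout, assuming they are NumPy arrays with the shapes (800, 10000) and (800,), the zero-filled arrays below are placeholders standing in for the real signals and labels, and the names X_train_cnn and Y_train_cnn are made up for illustration:

```python
import numpy as np

# Placeholder arrays with the shapes given in the question (assumption:
# the real signals and labels are loaded elsewhere as NumPy arrays).
X_train = np.zeros((800, 10000), dtype=np.float32)
Y_train = np.zeros((800,), dtype=np.int32)

# Add a trailing feature axis: (samples, timesteps) -> (samples, timesteps, 1)
X_train_cnn = np.expand_dims(X_train, axis=-1)

# Give the labels the same (samples, 1) layout as the dummy labels above
Y_train_cnn = Y_train.reshape(-1, 1)

print(X_train_cnn.shape)  # (800, 10000, 1)
print(Y_train_cnn.shape)  # (800, 1)
```

The reshaped arrays could then be passed to model.fit in place of the random tensors. X_test would need the same extra axis before calling model.evaluate or model.predict.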

Also, check out this post. In contrast to Conv2D layers, which slide a filter along both the height and the width of an image, Conv1D layers slide each filter along a single axis (here, time): the filter covers a window of kernel_size consecutive timesteps and always spans all features. You should also consider downsampling your signals, because 10000 timesteps is a lot for one sample.
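
To make the window idea concrete, here is a small NumPy sketch of what a single Conv1D filter computes on a one-feature signal. It is an illustration only, not how Keras implements the layer: the toy signal and the kernel values are hand-picked assumptions, whereas a real Conv1D layer learns its kernel weights during training (and also adds a bias and an activation).

```python
import numpy as np

# Toy signal with a short bump in the middle (made-up values)
signal = np.array([0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0])

# Hand-picked kernel of size 3 that responds to a sharp peak
# (hypothetical weights; Conv1D would learn these)
kernel = np.array([-1.0, 2.0, -1.0])

# Every window of 3 consecutive timesteps: shape (7 - 3 + 1, 3) = (5, 3)
windows = np.lib.stride_tricks.sliding_window_view(signal, kernel.size)

# One output value per window: the dot product of window and kernel
response = windows @ kernel

print(windows.shape)  # (5, 3)
print(response)       # [-1. -1.  4. -1. -1.]
```

The response is largest for the window centred on the bump. With the default padding ('valid') and a stride of 1, the output has time_steps - kernel_size + 1 entries, so a kernel of size 3 on 10000 timesteps yields 9998 positions per filter.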

Thanks for your help. This compiles and trains without error, but I'm still getting poor accuracy results. Maybe because of the input size of 10000, as you suggested. Any tips on where to start looking for downsampling my signals? — Shiverz, Feb 07 '22 at 08:47
Maybe here is a good starting point: https://stackoverflow.com/questions/34231244/downsampling-a-2d-numpy-array-in-python/34232507. Generally, google downsampling arrays — AloneTogether, Feb 07 '22 at 08:51
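
As one possible starting point for the downsampling suggested above, here is a minimal NumPy sketch that shrinks the time axis by reducing fixed-size blocks. The helper name downsample, the factor of 10, and the random input are all assumptions for illustration; whether averaging or taking the block maximum preserves the pattern better would have to be checked on the real signals.

```python
import numpy as np

def downsample(signals, factor, reduce="mean"):
    """Shrink the time axis of a (samples, timesteps) array by `factor`."""
    n_samples, n_steps = signals.shape
    usable = (n_steps // factor) * factor  # drop any remainder at the end
    # Group consecutive timesteps into blocks of length `factor`
    blocks = signals[:, :usable].reshape(n_samples, usable // factor, factor)
    if reduce == "mean":
        return blocks.mean(axis=-1)  # smooths each block
    if reduce == "max":
        return blocks.max(axis=-1)   # keeps the highest value of each block
    raise ValueError("reduce must be 'mean' or 'max'")

# Random stand-in for a few signals of length 10000
X = np.random.default_rng(0).normal(size=(4, 10000))

X_mean = downsample(X, factor=10)
X_max = downsample(X, factor=10, reduce="max")

print(X_mean.shape)  # (4, 1000)
print(X_max.shape)   # (4, 1000)
```

Averaging reduces noise but can flatten a short spike, while the block maximum keeps peaks at the cost of discarding dips. The model's Input shape would have to change to match the new length, for example (1000, 1) for a factor of 10.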
