
I'm trying to build a "multi-headed" CNN model in which each head is a branch taking in an individual multivariate time series.

What is not clear to me is how to handle the `fit` method, or in other words how to properly prepare `y_train`. There are two classes, 0 and 1, for the label.

The current architecture is shown here: [image: neural net architecture]

The goal is to predict one time step ahead.

Input shapes:

A Training Data (1, 903155, 5)
B Training Data (1, 903116, 5)
C Training Data (1, 902996, 5)

Label shape:

y_train (903155, 1)

When doing:
history = model.fit(x=[A, B, C], y=y_in)

then I get: `Input arrays should have the same number of samples as target arrays. Found 1 input samples and 903155 target samples.`

Reshaping `y_in` to (1, 903155) results in:

 expected dense_5 (see image) to have shape (1,) but got array with shape (903155,)

Strangely, `model.predict([A, B, C])` yields results.

Sentinan

2 Answers


The problem's rooted in mishandling the batch dimension; the first dimension of all your layers, and of your data, is the batch dimension. Error explanations below.


Solution:

y_train (903155, 1) is correct, but A, B, and C probably aren't: they each specify ONE sample with dimensions (903155, 5), (903116, 5), and (902996, 5), respectively. I doubt this is desired - more likely, the 903,xxx dimensions are the batch dimensions, and the arrays should be reshaped to (903155, 1, 5), etc.
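Below is a minimal sketch of that fix, assuming a stand-in three-headed Conv1D model in place of the architecture from the image; `make_head`, the layer sizes, and the compile settings are made-up placeholders - only the reshape and the common-length trim are the point:

```python
import numpy as np
from tensorflow.keras.layers import Input, Conv1D, Flatten, Dense, concatenate
from tensorflow.keras.models import Model

# Placeholder arrays with the shapes from the question.
A = np.zeros((1, 903155, 5))
B = np.zeros((1, 903116, 5))
C = np.zeros((1, 902996, 5))
y_train = np.zeros((903155, 1))

# Move the large dimension to the front so it becomes the batch dimension:
# each sample is then a single timestep with 5 features.
A = A.reshape(-1, 1, 5)   # (903155, 1, 5)
B = B.reshape(-1, 1, 5)   # (903116, 1, 5)
C = C.reshape(-1, 1, 5)   # (902996, 1, 5)

# Stand-in three-headed model: one branch per input series.
def make_head():
    inp = Input(shape=(1, 5))
    x = Conv1D(16, kernel_size=1, activation='relu')(inp)
    x = Flatten()(x)
    return inp, x

(in_a, out_a), (in_b, out_b), (in_c, out_c) = make_head(), make_head(), make_head()
merged = concatenate([out_a, out_b, out_c])
out = Dense(1, activation='sigmoid')(merged)

model = Model(inputs=[in_a, in_b, in_c], outputs=out)
model.compile(optimizer='adam', loss='binary_crossentropy')

# With multiple inputs, Keras requires every input (and the labels) to have
# the same number of samples, so trim the three series to a common length.
n = min(len(A), len(B), len(C), len(y_train))
history = model.fit(x=[A[:n], B[:n], C[:n]], y=y_train[:n], batch_size=256, epochs=1)
```

The output layer stays at shape `(None, 1)`: one binary prediction per sample, which is exactly what a `y_train` of shape `(903155, 1)` matches.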


Error explanations:

  • Error 1 is saying that you fed 1 sample (training example) as an input, but 903155 labels
  • Error 2 is saying that dense_5 output is shaped (None, 1), but is expected to be compared against (None, 903155) when computing loss

Input arrays should have the same number of samples as target arrays. Found 1 input samples and 903155 target samples. # Error 1

expected dense_5 to have shape (1,) but got array with shape (903155,) # Error 2

OverLordGoldDragon
  • Thanks for the detailed answer! But this is actually intended: the input in this case is one sample each, so A, B, C are each of shape (1 sample, X time steps, 5 features). Since the `model.predict` method returns a result, the issue must lie in `y_in`, but I can't figure out what it is – Sentinan Oct 14 '19 at 16:01
  • @Sentinan In that case, I don't know what `y_train` even is; for one-step prediction, you have only _one label_ per batch, shaped `(batch_size, 1, features)` - for you, `(1, 1, 5)`. Double-check your problem definition - let me know if still unsure – OverLordGoldDragon Oct 14 '19 at 16:13
  • OK, so when I change the last fully connected layer to output (None, 903155), then indeed I don't get an error and the model starts training. This means right now the model will output 903155 predictions based on one sample, is that correct? Desired is: `take a batch of x time steps and make a prediction for t+1` – Sentinan Oct 14 '19 at 16:27
  • @Sentinan That's very far from it; again, your output shape must be `(batch_size, 1, features)` - you are predicting _one_ timestep given `x` timesteps. See [this answer](https://stackoverflow.com/questions/58276337/proper-way-to-feed-time-series-data-to-stateful-lstm/58277760#58277760) for reference – OverLordGoldDragon Oct 14 '19 at 16:42
  • @Sentinan Nope, one output will do; in general, for a model with X `Input`s and `Y` distinct outputs, you feed X and Y input data and labels, respectively - in your case, 3 and 1. – OverLordGoldDragon Oct 14 '19 at 18:49
  • I can accept your answer; you brought me back on the right track. My input series is now something like `(903146, 10, 5) -- (n_sequences, timesteps_per_sequence, n_features)` and the input is of shape (903146, 1). This should yield the expected behavior, which is to make a prediction for every sequence of 10 time steps!? (see the windowing sketch after these comments) – Sentinan Oct 14 '19 at 19:38
  • 1
    @Sentinan Do you mean the output shape is `(903146, 1)`? And input looks fine right now; the way it should work is, `fit` defaults `batch_size=32` if you don't specify, and will fit slices shaped `(32, 10, 5)` at a time - for a total of `903146//32 = 28200` iterations. That's fairly large, consider setting `batch_size` to something larger, e.g. `128, 256`. As for prediction, it'll predict on as many samples as you feed (unless you specify `batch_shape` for `Input`, then you must feed `batch_size` number of samples) – OverLordGoldDragon Oct 14 '19 at 20:10
  • Yes, correct: the output shape is (903146, 1), and great suggestion for the batch size, I'm testing it. The model now looks as follows: https://i.imgur.com/IYLVtY8.png The model is now "training" but doesn't make any progress: it starts with a loss of 0.48740 and stays there no matter how many epochs. Do you have any suggestion on that? – Sentinan Oct 15 '19 at 11:07
  • @Sentinan That's a separate question, with scope substantially beyond this one. If you open it as a new question, you can reply here to notify me - I should be able to assist. But briefly: I'd mainly question how you acquired the new input shapes, `(60, 4)`, etc - if via a simple reshape, that could be the root of the problem. – OverLordGoldDragon Oct 15 '19 at 13:38
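A sketch of the windowing the comments converge on, assuming the raw data is a single (timesteps, features) array with one binary label per timestep; `make_windows`, the placeholder sizes, and the window length of 10 are illustrative:

```python
import numpy as np

def make_windows(series, labels, window=10):
    """Slice a (timesteps, features) series into overlapping windows of
    `window` steps, each paired with the label of the step right after it."""
    X, y = [], []
    for t in range(len(series) - window):
        X.append(series[t:t + window])   # (window, features)
        y.append(labels[t + window])     # label of the timestep being predicted
    return np.asarray(X), np.asarray(y)

# Small placeholder series; with the asker's 903156-step series this
# would produce X of shape (903146, 10, 5) and y of shape (903146, 1).
raw_series = np.zeros((1000, 5))
raw_labels = np.zeros((1000, 1))

X, y = make_windows(raw_series, raw_labels, window=10)
print(X.shape, y.shape)   # (990, 10, 5) (990, 1)
```

Each window's label is the class of the timestep that immediately follows it, which matches "take a batch of x time steps and make a prediction for t+1".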

After some digging I found the answer: as OverLordGoldDragon mentioned, the input and output shapes were not correct.

A 1D Conv always expects the input data as: (samples, timesteps_per_sample, features_per_sample)

So in my case: (1, 903155, 5) --> (903155, 1, 5)
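A quick way to confirm that layout, using an arbitrary Conv1D layer (the filter count and kernel size here are made up): the samples axis is left out of `Input(shape=...)` and passes through untouched.

```python
import numpy as np
from tensorflow.keras.layers import Input, Conv1D
from tensorflow.keras.models import Model

# Per-sample shape is (timesteps_per_sample, features_per_sample);
# the samples axis is implicit and comes first in the data itself.
inp = Input(shape=(10, 5))
out = Conv1D(filters=8, kernel_size=3)(inp)
m = Model(inp, out)

x = np.zeros((4, 10, 5))      # 4 samples, 10 timesteps, 5 features
print(m.predict(x).shape)     # (4, 8, 8) - samples stay in the first axis
```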

Sentinan
  • What we do if an answer can be better framed is ask the author to edit it, not copy its part and deprive it of credit - especially if author provided further assistance through comments. If you solved the problem largely without my input, it's a different story, but not the case here. – OverLordGoldDragon Oct 17 '19 at 19:13