I have a simple CNN model for my audio data. The final layer needs to have shape (None, 100) because I need to concatenate it with my text features (current shape (None, 128)). Here is my code:

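from keras.layers import Input, Conv1D, MaxPooling1D, Reshape
from keras.models import Model

# Audio input: 637 time steps, 20 features per step
inputs_au = Input(shape=(637, 20))

# Three Conv1D + MaxPooling1D stages, each with 100 filters and kernel size 3
audio_model = Conv1D(100, 3, activation='relu')(inputs_au)
audio_model = MaxPooling1D()(audio_model)
audio_model = Conv1D(100, 3, activation='relu')(audio_model)
audio_model = MaxPooling1D()(audio_model)
audio_model = Conv1D(100, 3, activation='relu')(audio_model)
audio_model = MaxPooling1D()(audio_model)

# Fails if enabled: the tensor here has shape (None, 77, 100), not 100 values
#audio_model = Reshape((100,))(audio_model)

model_audio = Model(inputs=inputs_au, outputs=audio_model)
model_audio.summary()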

The output is:

[image: output of model_audio.summary()]

I can't apply Reshape because the output currently has shape (None, 77, 100). How can I make the final layer output (None, 100)? Please advise.
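For reference, the 77 in (None, 77, 100) can be checked by hand. The sketch below assumes the Keras defaults used in the code above (Conv1D with padding='valid' and stride 1, MaxPooling1D with pool_size=2 and 'valid' padding); the helper names are made up for illustration and are not part of Keras.

```python
# Hypothetical helpers (not Keras functions) that reproduce the output-length
# arithmetic for 'valid' padding.
def conv1d_valid_len(n, kernel_size, stride=1):
    """Length after a 1D convolution with 'valid' padding."""
    return (n - kernel_size) // stride + 1


def maxpool1d_valid_len(n, pool_size=2, stride=None):
    """Length after 1D max pooling with 'valid' padding (stride defaults to pool_size)."""
    stride = pool_size if stride is None else stride
    return (n - pool_size) // stride + 1


length = 637
for _ in range(3):
    length = conv1d_valid_len(length, 3)   # 637 -> 635, 317 -> 315, 157 -> 155
    length = maxpool1d_valid_len(length)   # 635 -> 317, 315 -> 157, 155 -> 77

print(length)        # 77 time steps remain
print(length * 100)  # 7700 values per sample, so Reshape((100,)) cannot work
```

Reshape only rearranges elements and never drops any, so 7700 values per sample cannot be reshaped into 100; the time axis has to be reduced by a pooling or dense layer instead.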

  • Your current final output has shape (77, 100). If you want the final output to have shape (100), you can use either an FC layer or another CNN layer – Hrushi Jul 13 '22 at 08:19
  • I added another pooling layer, "audio_model = GlobalMaxPooling1D()(audio_model)", and it gets me what I want. However, I'm not sure whether the features will be affected somehow... – Serena Taylor Jul 13 '22 at 09:28
  • I found a discussion that helps a bit with my question. Just to share: https://stackoverflow.com/questions/49295311/what-is-the-difference-between-flatten-and-globalaveragepooling2d-in-keras – Serena Taylor Jul 14 '22 at 07:46
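To make the options in the comments concrete, here is a rough NumPy sketch of what global pooling and flattening do to a tensor of shape (batch, 77, 100). It only imitates the arithmetic of Keras's GlobalMaxPooling1D, GlobalAveragePooling1D and Flatten layers in the default channels-last layout; the random arrays and the batch size of 4 are placeholders, not real audio or text features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder stand-ins for the CNN output and the text branch (assumed shapes).
audio_features = rng.standard_normal((4, 77, 100))  # (batch, time steps, channels)
text_features = rng.standard_normal((4, 128))       # (batch, text dims)

# Global max pooling: keep the largest activation of each channel over time.
global_max = audio_features.max(axis=1)             # shape (4, 100)

# Global average pooling: average each channel over time.
global_avg = audio_features.mean(axis=1)            # shape (4, 100)

# Flattening keeps every value, so it yields 77 * 100 = 7700 per sample
# and would still need a dense layer with 100 units to reach (batch, 100).
flattened = audio_features.reshape(audio_features.shape[0], -1)  # shape (4, 7700)

# After global pooling, the audio vector can be concatenated with the text vector.
combined = np.concatenate([global_max, text_features], axis=1)   # shape (4, 228)

print(global_max.shape, global_avg.shape, flattened.shape, combined.shape)
```

Global pooling does change the features in one respect: it discards where in time an activation occurred and keeps one summary value per channel. Flatten followed by a dense layer keeps the positional information but adds 7700 × 100 weights plus biases.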
