I am using the Mycroft AI wake word detection and I am trying to understand the dimensions of the network. The following lines show the model in Keras:
model = Sequential()
model.add(GRU(
    params.recurrent_units, activation='linear',
    input_shape=(pr.n_features, pr.feature_size), dropout=params.dropout, name='net'))
model.add(Dense(1, activation='sigmoid'))
My features have shape 29 x 13, i.e. 29 time steps with 13 values each, and the GRU layer has 20 units. My question is: how can the GRU layer have 2040 learnable parameters? How are the units connected? Maybe my overall understanding of a GRU network is wrong, but I can only find explanations of a single cell, never of the full network. Is the GRU network fully connected? Thank you!
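
For reference, here is a rough sketch of the arithmetic that seems to reproduce the 2040 figure. It assumes the classic GRU formulation: three weight sets (update gate, reset gate, candidate state), each with an input kernel, a recurrent kernel, and one bias vector, which is what Keras uses when `reset_after=False`. The helper name `gru_param_count` is made up for illustration and is not part of Keras or Mycroft. The same weights are reused at every time step, so the sequence length of 29 does not enter the count.

```python
def gru_param_count(input_dim, units, reset_after=False):
    """Count trainable parameters of a single GRU layer (hypothetical helper).

    Assumes three weight sets (update gate, reset gate, candidate state).
    Each set has an input kernel, a recurrent kernel, and a bias.
    """
    input_kernel = input_dim * units      # every input value feeds every unit
    recurrent_kernel = units * units      # every unit's state feeds every unit
    # With reset_after=True, Keras keeps separate input and recurrent biases.
    bias = 2 * units if reset_after else units
    return 3 * (input_kernel + recurrent_kernel + bias)


# 13 values per time step, 20 recurrent units, as in the question.
print(gru_param_count(13, 20))                    # 2040
print(gru_param_count(13, 20, reset_after=True))  # 2100
```

Under these assumptions the layer is fully connected in two senses: each of the 13 input values is connected to each of the 20 units, and each unit's previous state is connected to each of the 20 units at the next time step.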