I am trying to build a wakeword model for my AI assistant, but I don't know which LSTM output I should pass to my linear layer: the output or the hidden state. What is the difference between them, and why should I use the one you recommend?

deep freeze
    Does this answer your question? [What's the difference between "hidden" and "output" in PyTorch LSTM?](https://stackoverflow.com/questions/48302810/whats-the-difference-between-hidden-and-output-in-pytorch-lstm) – ndrwnaguib Mar 20 '22 at 06:27

1 Answer

You should pass `output` (the first value the LSTM returns) to the linear layer, rather than the returned hidden state. Like this (time series prediction):

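import torch

def forward(self, input_seq):
    # input_seq: (batch, seq_len). Assumes self.lstm uses batch_first=True,
    # input_size=1, and is unidirectional (num_directions == 1).
    batch_size, seq_len = input_seq.shape[0], input_seq.shape[1]
    device = input_seq.device
    # Random initial states, kept from the original; omitting them defaults to zeros.
    h_0 = torch.randn(self.num_directions * self.num_layers, batch_size, self.hidden_size, device=device)
    c_0 = torch.randn(self.num_directions * self.num_layers, batch_size, self.hidden_size, device=device)
    input_seq = input_seq.view(batch_size, seq_len, 1)
    # output: the last layer's hidden state at every time step.
    output, _ = self.lstm(input_seq, (h_0, c_0))
    output = output.contiguous().view(batch_size * seq_len, self.hidden_size)
    pred = self.linear(output)
    pred = pred.view(batch_size, seq_len, -1)
    # Keep only the prediction made at the last time step.
    pred = pred[:, -1, :]
    return pred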

`output` contains the hidden state of the last layer at every time step, whereas the returned hidden state (`h_n`) contains the hidden state of every layer at the last time step only.
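To make the difference concrete, here is a minimal sketch that prints the shapes of both return values. The sizes are hypothetical and chosen only for illustration, and it assumes a unidirectional `nn.LSTM` with `batch_first=True`. In that case the last time step of `output` is the same tensor as the last layer's entry in `h_n`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, for illustration only.
batch, seq_len, n_features, hidden, layers = 4, 10, 3, 8, 2

lstm = nn.LSTM(input_size=n_features, hidden_size=hidden,
               num_layers=layers, batch_first=True)
x = torch.randn(batch, seq_len, n_features)

# With no initial state passed, h_0 and c_0 default to zeros.
output, (h_n, c_n) = lstm(x)

print(output.shape)  # (batch, seq_len, hidden): last layer, every time step
print(h_n.shape)     # (layers, batch, hidden): every layer, last time step only

# For a unidirectional LSTM, the last time step of `output`
# equals the final hidden state of the last layer.
same = torch.allclose(output[:, -1, :], h_n[-1])
print(same)
```

So if you only need the final step, `output[:, -1, :]` and `h_n[-1]` are interchangeable here. `output` is the one to use when you want to look at more than one time step.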

ki-ljl
  • What if the word is very short in that audio? Shouldn't I just collect a couple of time steps of the hidden state output and predict with that? – deep freeze Mar 20 '22 at 21:46
  • In language modeling you need the hidden state output at all time steps. – ki-ljl Mar 21 '22 at 02:05