
I followed this great answer for a sequence autoencoder,

LSTM autoencoder always returns the average of the input sequence.

but I ran into some problems when I tried to change the code:

  1. Question one: your explanation is very professional, but my problem is a little bit different from yours. I attached some code that I changed from your example. My input features are 2-dimensional, and my output is the same as the input. For example:
input_x = torch.Tensor([[0.0,0.0], [0.1,0.1], [0.2,0.2], [0.3,0.3], [0.4,0.4]])
output_y = torch.Tensor([[0.0,0.0], [0.1,0.1], [0.2,0.2], [0.3,0.3], [0.4,0.4]])
the input_x and output_y are the same: 5 timesteps, 2-dimensional features.

        import torch
        import torch.nn as nn
        import torch.optim as optim

        class LSTM(nn.Module):
            def __init__(self, input_dim, latent_dim, num_layers):
                super(LSTM, self).__init__()
                self.input_dim = input_dim
                self.latent_dim = latent_dim
                self.num_layers = num_layers
                self.encoder = nn.LSTM(self.input_dim, self.latent_dim, self.num_layers)

                # I changed this to 40 dimensions; I think there is some problem here
                # self.decoder = nn.LSTM(self.latent_dim, self.input_dim, self.num_layers)
                self.decoder = nn.LSTM(40, self.input_dim, self.num_layers)

            def forward(self, input):
                # Encode
                _, (last_hidden, _) = self.encoder(input)
                # It is way more general that way
                encoded = last_hidden.repeat(input.shape)
                # Decode
                y, _ = self.decoder(encoded)
                return torch.squeeze(y)

        model = LSTM(input_dim=2, latent_dim=20, num_layers=1)
        loss_function = nn.MSELoss()
        optimizer = optim.Adam(model.parameters())
        y = torch.Tensor([[0.0,0.0], [0.1,0.1], [0.2,0.2], [0.3,0.3], [0.4,0.4]])
        x = y.view(len(y), -1, 2)   # I changed here: reshape to (seq_len=5, batch=1, features=2)

        while True:
            y_pred = model(x)
            optimizer.zero_grad()
            loss = loss_function(y_pred, y)
            loss.backward()
            optimizer.step()
            print(y_pred)

The above code can learn very well. Can you help review the code and give some guidance?

However, when I feed 2 examples into the model as a batch, the model does not work.

For example, if I change the line:

y = torch.Tensor([[0.0,0.0], [0.1,0.1], [0.2,0.2], [0.3,0.3], [0.4,0.4]])

to:

y = torch.Tensor([[[0.0,0.0],[0.5,0.5]], [[0.1,0.1], [0.6,0.6]], [[0.2,0.2],[0.7,0.7]], [[0.3,0.3],[0.8,0.8]], [[0.4,0.4],[0.9,0.9]]])

then when I compute the loss function, it raises an error. Can anyone help take a look?

  2. Question two: my training samples have different lengths. For example:
x1 = [[0.0,0.0], [0.1,0.1], [0.2,0.2], [0.3,0.3], [0.4,0.4]]   #with 5 timesteps
x2 = [[0.5,0.5], [0.6,0.6], [0.7,0.7]] #with only 3 timesteps

How can I feed these two training samples into the model at the same time for batch training?

Jun Ren
1 Answer


Recurrent N-dimensional autoencoder

First of all, LSTMs work on 1D samples (a single feature vector per timestep); yours are 2D, as LSTMs are usually used for words encoded with a single vector.

No worries though, one can flatten such a 2D sample to 1D; an example for your case would be:

import torch

var = torch.randn(10, 32, 100, 100)   # (seq_len, batch, 100, 100)
var = var.reshape((10, 32, -1))       # shape: [10, 32, 100 * 100]

Please notice this is really not general; what if you were to have 3D input? The snippet below generalizes this notion to any dimensionality of your samples, provided the preceding dimensions are seq_len and batch_size:

import torch

input_dimensionality = 3  # number of trailing dims that make up one sample (here 100 x 100 x 35)

var = torch.randn(10, 32, 100, 100, 35)
var = var.reshape(var.shape[:-input_dimensionality] + (-1,))  # shape: [10, 32, 100 * 100 * 35]

Finally, you can employ it inside a neural network as follows. Look especially at the forward method and the constructor arguments:

import torch
import torch.nn as nn


class LSTM(nn.Module):
    # input_dim has to be size after flattening
    # For 20x20 single input it would be 400
    def __init__(
        self,
        input_dimensionality: int,
        input_dim: int,
        latent_dim: int,
        num_layers: int,
    ):
        super(LSTM, self).__init__()
        self.input_dimensionality: int = input_dimensionality
        self.input_dim: int = input_dim  # It is 1d, remember
        self.latent_dim: int = latent_dim
        self.num_layers: int = num_layers
        self.encoder = torch.nn.LSTM(self.input_dim, self.latent_dim, self.num_layers)
        # You can have any latent dim you want, just output has to be exact same size as input
        # In this case, only encoder and decoder, it has to be input_dim though
        self.decoder = torch.nn.LSTM(self.latent_dim, self.input_dim, self.num_layers)

    def forward(self, input):
        # Save original size first:
        original_shape = input.shape
        # Flatten 2d (or 3d or however many you specified in constructor)
        input = input.reshape(input.shape[: -self.input_dimensionality] + (-1,))

        # Rest goes as in my previous answer
        _, (last_hidden, _) = self.encoder(input)
        # Repeat the last hidden state across timesteps so the decoder receives
        # one latent vector per timestep (assumes num_layers == 1)
        encoded = last_hidden.repeat(input.shape[0], 1, 1)
        y, _ = self.decoder(encoded)

        # You have to reshape output to what the original was
        reshaped_y = y.reshape(original_shape)
        return torch.squeeze(reshaped_y)

Remember that you have to reshape your output in this case; it should work for any number of dimensions.
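For reference, here is a minimal sketch (not part of the original answer) of how the class above could be trained on the 5-timestep, 2-feature sequences from the question, with a batch of two sequences. It assumes PyTorch's default (seq_len, batch, features) LSTM layout; the optimizer, loss and iteration count are arbitrary choices:

import torch
import torch.nn as nn
import torch.optim as optim

# Uses the LSTM class defined above; each timestep is already a flat
# 2-element vector, so input_dimensionality=1 and flattening is a no-op
model = LSTM(input_dimensionality=1, input_dim=2, latent_dim=20, num_layers=1)
loss_function = nn.MSELoss()
optimizer = optim.Adam(model.parameters())

# Two sequences of 5 timesteps each -> shape (seq_len=5, batch=2, features=2)
y = torch.Tensor([[[0.0, 0.0], [0.5, 0.5]],
                  [[0.1, 0.1], [0.6, 0.6]],
                  [[0.2, 0.2], [0.7, 0.7]],
                  [[0.3, 0.3], [0.8, 0.8]],
                  [[0.4, 0.4], [0.9, 0.9]]])

for _ in range(500):
    y_pred = model(y)
    optimizer.zero_grad()
    loss = loss_function(y_pred, y)
    loss.backward()
    optimizer.step()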

Batching

When it comes to batching and sequences of different lengths, it is a little more complicated.

You have to pad each sequence in the batch before pushing it through the network. Usually the padding values are zeros, though you can configure the padding value when you pad.

You may check this link for an example. You will have to use functions like torch.nn.utils.rnn.pack_padded_sequence and others to make it work; you may check this answer.

Oh, and since PyTorch 1.1 you don't have to sort your sequences by length in order to pack them. But when it comes to this topic, grab some tutorials; they should make things clearer. A rough sketch follows below.
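As a rough illustration (not part of the original answer), padding and packing the two variable-length sequences from the question could look like the snippet below; pad_sequence pads with zeros by default, and enforce_sorted=False requires PyTorch >= 1.1:

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

# The two sequences from the question: 5 and 3 timesteps, 2 features each
x1 = torch.Tensor([[0.0, 0.0], [0.1, 0.1], [0.2, 0.2], [0.3, 0.3], [0.4, 0.4]])
x2 = torch.Tensor([[0.5, 0.5], [0.6, 0.6], [0.7, 0.7]])
lengths = [len(x1), len(x2)]

# Pad to a common length -> shape (max_seq_len=5, batch=2, features=2)
padded = pad_sequence([x1, x2])

# Pack so the LSTM skips the padded timesteps
packed = pack_padded_sequence(padded, lengths, enforce_sorted=False)

encoder = nn.LSTM(input_size=2, hidden_size=20, num_layers=1)
packed_out, (last_hidden, _) = encoder(packed)

# Per-timestep outputs back as a padded tensor, plus the true lengths
unpacked, out_lengths = pad_packed_sequence(packed_out)
print(unpacked.shape)  # torch.Size([5, 2, 20])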

Lastly: please separate your questions. Once the autoencoding works with a single example, move on to batching, and if you have issues there, post a new question on StackOverflow. Thanks.

Szymon Maszke