TL;DR: I want to predict whether a machine will fail based on the most recent set of measurements taken by its on-board sensor. I'm having trouble understanding what shape my input tensors should have, and how to properly flatten my data between the final convolutional layer and the first fully connected layer.

The Data

I have several dozen machines, each with a sensor that takes a measurement at regular intervals. Some machines have already failed, but most have not. The resulting dataset looks something like the example data below, with one row for each machine, showing the 30 most recent sensor measurements as well as a "failure" designation, where 0 indicates that the machine is still operational, and 1 indicates that the machine failed after the measurement taken at time30.

    ID  time1   time2   time3   time4   time5   time6   time7   time8   time9   time10  time11  time12  time13  time14  time15  time16  time17  time18  time19  time20  time21  time22  time23  time24  time25  time26  time27  time28  time29  time30  failure
0   1   3.085   1.360   2.351   3.858   5.562   3.709   6.423   9.706   5.521   0.045   5.676   6.045   5.540   8.404   2.701   7.969   2.535   5.096   7.949   5.888   9.250   6.608   1.441   2.066   8.885   6.985   1.310   4.245   9.068   3.283   0
1   2   7.938   9.833   5.776   3.218   0.978   4.164   8.079   7.425   5.554   0.259   5.927   5.168   8.751   8.713   5.651   9.342   0.385   6.623   4.348   9.113   9.230   7.134   4.316   4.725   9.258   4.248   6.497   7.354   7.707   2.527   0
2   3   5.946   0.096   1.972   6.362   9.990   6.702   9.683   5.111   2.273   7.581   0.379   5.571   0.274   9.429   3.572   2.032   0.543   0.467   3.028   1.095   0.529   8.780   4.375   7.544   0.754   5.400   4.943   1.821   1.486   2.492   1
3   4   6.793   9.299   1.522   9.307   0.438   9.999   0.481   6.420   3.881   4.933   7.185   4.176   4.224   7.403   9.101   3.300   3.273   0.556   6.421   5.528   9.262   6.160   1.573   9.299   4.307   0.808   4.270   6.886   3.548   4.889   0
4   5   8.470   5.503   7.420   8.363   3.316   1.047   9.695   3.884   2.010   8.353   1.308   7.733   7.898   3.327   2.737   2.858   2.002   5.483   7.750   4.952   2.435   5.980   6.403   0.985   1.591   8.886   7.586   0.062   6.002   1.144   1

The Problem(s)

I want to use PyTorch to create a 1D convolutional neural network that will predict whether a machine is about to fail based on the 30 most recent sensor measurements. While building my CNN, I've run into a couple of points of confusion:

  • Understanding the necessary dimensionality of the input tensors: Should my input tensor have the shape of (num_of_samples, 1, 30), or (num_of_samples, 30, 1)?
  • Understanding the flattening step after the final convolutional layer and before the first fully connected layer: Many PyTorch CNN examples I've seen include a flattening/reshaping step where the view function is used. I've included such a step in my code below, but I don't know if I'm using it correctly since I often get RuntimeError: mat1 and mat2 shapes cannot be multiplied after this step.
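One way to check the first point directly: nn.Conv1d reads its input as (batch, channels, length), so with a single sensor per machine the two candidate layouts can be tried against a layer. This is a minimal sketch, assuming random stand-in data and the first layer's settings from the code further down (1 input channel, 17 output channels, kernel size 2).

```python
import torch
import torch.nn as nn

# Stand-in batch: 38 machines, 1 sensor channel, 30 time steps.
x_ncl = torch.randn(38, 1, 30)

conv = nn.Conv1d(in_channels=1, out_channels=17, kernel_size=2)
out = conv(x_ncl)
print(out.shape)  # torch.Size([38, 17, 29])

# The (38, 30, 1) layout is read as 30 channels of length 1,
# so a layer built with in_channels=1 rejects it.
x_nlc = torch.randn(38, 30, 1)
try:
    conv(x_nlc)
    rejected = False
except RuntimeError as err:
    rejected = True
    print("rejected:", err)
```

Under these assumptions, (num_of_samples, 1, 30) is the layout that matches a Conv1d whose first argument is 1.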

What I Have So Far

After splitting the dataframe data into train, validation, and test sets, I can check the dimensionality:

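import math as m

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

np.shape(X_train)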
# out: (38, 30)

I then add another dimension and convert to arrays:

def extra_layer(arr):
    arr = np.reshape(arr, (len(arr), len(arr[0]), 1))
    return arr

X_train = extra_layer(X_train)
X_validate = extra_layer(X_validate)
X_test = extra_layer(X_test)

y_train = y_train.astype(int)
y_validate = y_validate.astype(int)
y_test = y_test.astype(int)

np.shape(X_train)
# out: (38, 30, 1)

I then reshape the data so that the time axis (the 30 measurements) is the final dimension:

'''
Reshaping data
'''
reshape = 30

X_train = X_train.reshape(len(X_train), 1, reshape)
X_validate = X_validate.reshape(len(X_validate), 1, reshape)
X_test = X_test.reshape(len(X_test), 1, reshape)

np.shape(X_train)
# out: (38, 1, 30)
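The two reshapes above can probably be collapsed into one indexing step. The sketch below uses a stand-in array in place of X_train (an assumption, since the real data is not shown) and checks that adding the channel axis directly gives the same result as going through (38, 30, 1) first.

```python
import numpy as np

# Stand-in for X_train with the shape reported in the post.
X = np.arange(38 * 30, dtype=float).reshape(38, 30)

# Two-step version, as in the post: (38, 30) -> (38, 30, 1) -> (38, 1, 30).
two_step = X.reshape(38, 30, 1).reshape(38, 1, 30)

# One-step version: insert a length-1 channel axis in the middle.
one_step = X[:, np.newaxis, :]

print(one_step.shape)                       # (38, 1, 30)
print(np.array_equal(two_step, one_step))   # True
```

Both versions keep each machine's 30 measurements in their original order along the last axis.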

Then I zip my arrays together and build my CNN:

'''
Zipping data together and storing in trainloader objects
'''
train_set = list(zip(X_train, y_train))
val_set = list(zip(X_validate, y_validate))
test_set = list(zip(X_test, y_test))
print('Done zipping and converting')

'''
Creating the neural net
'''


class CNN(nn.Module):

    def __init__(self):
        # Outputs
        self.O_1 = 17
        self.O_2 = 18
        self.O_3 = 32
        self.O_4 = 37
        
        # Kernels
        self.K_1 = 2
        self.K_2 = 1
        self.K_3 = 3
        self.K_4 = 2
        
        # Pooling
        self.KP_1 = 1
        self.KP_2 = 1
        self.KP_3 = 1
        self.KP_4 = 2
        
        # Padding
        self.P_1 = 0
        self.P_2 = 0
        self.P_3 = 0
        self.P_4 = 0

        # Each block is Conv1d (stride 1) followed by MaxPool1d (stride = pool size):
        # the conv leaves L + 2*P - K + 1 steps, and pooling floor-divides that by KP.
        self.conv_1_out = (reshape + 2*self.P_1 - self.K_1 + 1) // self.KP_1
        self.conv_2_out = (self.conv_1_out + 2*self.P_2 - self.K_2 + 1) // self.KP_2
        self.conv_3_out = (self.conv_2_out + 2*self.P_3 - self.K_3 + 1) // self.KP_3
        self.conv_4_out = (self.conv_3_out + 2*self.P_4 - self.K_4 + 1) // self.KP_4
        # Flattened size fed to fc1: final length times final channel count
        self.conv_linear_out = self.conv_4_out * self.O_4

        self.FN_1 = 50

        super(CNN, self).__init__()

        self.conv1 = nn.Sequential(nn.Conv1d(1, self.O_1, self.K_1), nn.ReLU(),
                                   nn.MaxPool1d(self.KP_1))
        self.conv2 = nn.Sequential(nn.Conv1d(self.O_1, self.O_2, self.K_2), nn.ReLU(),
                                   nn.MaxPool1d(self.KP_2))
        self.conv3 = nn.Sequential(nn.Conv1d(self.O_2, self.O_3, self.K_3), nn.ReLU(),
                                   nn.MaxPool1d(self.KP_3))
        self.conv4 = nn.Sequential(nn.Conv1d(self.O_3, self.O_4, self.K_4), nn.ReLU(),
                                   nn.MaxPool1d(self.KP_4))
        # nn.Linear's third positional argument is `bias`, so dropout must be its own layer
        self.fc1 = nn.Sequential(nn.Linear(self.conv_linear_out, self.FN_1),
                                 nn.Dropout(0.2))
        self.fc2 = nn.Linear(self.FN_1, 2)

    def forward(self, x):
        x = x.float()
        # Each conv block already ends in ReLU + max pooling, so its output is
        # non-negative and an extra leaky_relu here would be a no-op.
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = x.view(len(x), -1) # SHOULD THE -1 BE THE FIRST VALUE?
        x = F.logsigmoid(self.fc1(x))
        x = self.fc2(x)

        return x
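For the flattening question, one option is to compare the hand-computed size with what the layers actually produce on a dummy input. The sketch below is only an illustration: `conv_pool_len` is a hypothetical helper, and the `features` stack copies the channel, kernel, and pooling values from the class above, assuming stride-1 convolutions with no padding and pooling whose stride equals its kernel size (the PyTorch defaults).

```python
import torch
import torch.nn as nn


def conv_pool_len(length, kernel, pool, padding=0):
    """Length after Conv1d (stride 1) followed by MaxPool1d(pool)."""
    conv_len = length + 2 * padding - kernel + 1
    return conv_len // pool


# Same channel / kernel / pool settings as the CNN class in the post.
features = nn.Sequential(
    nn.Conv1d(1, 17, 2), nn.ReLU(), nn.MaxPool1d(1),
    nn.Conv1d(17, 18, 1), nn.ReLU(), nn.MaxPool1d(1),
    nn.Conv1d(18, 32, 3), nn.ReLU(), nn.MaxPool1d(1),
    nn.Conv1d(32, 37, 2), nn.ReLU(), nn.MaxPool1d(2),
)

# Arithmetic route: push the sequence length through each block.
length = 30
for kernel, pool in [(2, 1), (1, 1), (3, 1), (2, 2)]:
    length = conv_pool_len(length, kernel, pool)
print(length, length * 37)  # 13 481

# Empirical route: run one dummy sample and flatten it.
with torch.no_grad():
    out = features(torch.zeros(1, 1, 30))
flat = out.view(out.size(0), -1)  # batch size first, -1 infers channels * length
print(out.shape, flat.shape)  # torch.Size([1, 37, 13]) torch.Size([1, 481])
```

With the batch size as the first argument to view and -1 as the second, each sample stays in its own row. The second dimension is then what the first nn.Linear layer's in_features has to match. If the two routes disagree, the mat1 and mat2 error is likely to come from the arithmetic rather than from view itself.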

Finally, I train the CNN:

'''
If gpu is available we will use it
'''
use_cuda = True

'''
Train
'''

best_accuracy = -float('Inf')
best_params = []

batch_size = 5

trainloader = torch.utils.data.DataLoader(
    train_set, batch_size=batch_size, shuffle=True, num_workers=2)
vldloader = torch.utils.data.DataLoader(
    val_set, batch_size=batch_size, shuffle=True, num_workers=2)
testloader = torch.utils.data.DataLoader(
    test_set, batch_size=batch_size, shuffle=True, num_workers=2)

lr = 0.001
epochs = 250
momentum = 0.7557312793639288

net = CNN()

if use_cuda and torch.cuda.is_available():
    net.cuda()

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=lr, momentum=momentum)

for epoch in range(epochs):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        if use_cuda and torch.cuda.is_available():
            inputs = inputs.cuda()
            labels = labels.cuda()

        optimizer.zero_grad()
        outputs = net(inputs)
        outputs = outputs.to(dtype=torch.float64)
        labels = labels.to(dtype=torch.long)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

    print('Finished epoch number ' + str(epoch))

    correct = 0
    total = 0
    with torch.no_grad():
        for data in vldloader:
            inputs, labels = data
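            # Only move to the GPU when one is in use, mirroring the training loop
            if use_cuda and torch.cuda.is_available():
                inputs = inputs.cuda()
                labels = labels.cuda()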
            outputs = net(inputs)
            _, predicted = torch.max(outputs.data, 1)
            total += len(labels)
            correct += (predicted == labels).sum().item()

    print('Accuracy of the network on the validation set: %d %%'
          % (100 * correct / total))
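The repeated CUDA checks in the loops above could also be expressed with a single device object. This is a sketch only, with a tiny stand-in model and random data in place of CNN() and the real loaders (both are assumptions made so the snippet runs on its own).

```python
import torch
import torch.nn as nn

# Pick the device once; everything else is moved with .to(device).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for CNN(): flatten (N, 1, 30) and map to 2 class scores.
model = nn.Sequential(nn.Flatten(), nn.Linear(30, 2)).to(device)

# Stand-in validation batch of 5 machines with 0/1 labels.
inputs = torch.randn(5, 1, 30).to(device)
labels = torch.randint(0, 2, (5,)).to(device)

model.eval()  # disables dropout during evaluation
with torch.no_grad():
    outputs = model(inputs)
    predicted = outputs.argmax(dim=1)  # index of the larger class score
    correct = (predicted == labels).sum().item()

print(outputs.shape, correct)
```

The same code path then runs with or without a GPU. Calling model.train() again before the next training epoch re-enables dropout.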

Does the structure of my CNN make sense? Should I alter the dimensions of my input tensors? Am I using the view function correctly or can I alter my code to something that makes more sense?

Rory McGuire