
I would like to predict a multi-dimensional array using Long Short-Term Memory (LSTM) networks while imposing restrictions on the shape of the surface of interest.

I thought I could accomplish this by putting some elements of the output (regions of the surface) in a functional relationship to others (simple scaling conditions), e.g. forcing one output element to be a fixed multiple of another.

Is it possible in Keras to define such custom activation functions for the output layer, whose arguments are other output nodes? If not, is there another framework that allows this? Can you point me to any documentation or manual?

Mr Frog

1 Answer


The keras-team answered a question on GitHub about how to make a custom activation function.

There is also a question with code for a custom activation function.

These pages may help you!
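
As a rough, untested sketch of the idea (not taken from those pages; the layer sizes are arbitrary assumptions): Keras/TensorFlow tensors cannot be assigned to in place, so one way to impose the scaling condition is to rebuild the output with tf.concat inside a Lambda layer.

import tensorflow as tf

# force column 1 to be twice column 0 and keep the remaining columns as-is
def scaled_output(t):
    return tf.concat([t[:, 0:1], 2.0 * t[:, 0:1], t[:, 2:]], axis=-1)

inputs = tf.keras.Input(shape=(3,))
hidden = tf.keras.layers.Dense(3)(inputs)
outputs = tf.keras.layers.Lambda(scaled_output)(hidden)
model = tf.keras.Model(inputs, outputs)

Gradients flow through the concatenation, so column 0 of the Dense output still receives a learning signal from both constrained columns.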


Additional comment

Those pages were not enough for this question, so I am adding the comment below.

Maybe PyTorch is better suited for this kind of customization than Keras. I tried to write such a network, though a very simple one, based on the PyTorch tutorials and "Extending PyTorch with Custom Activation Functions".

I made a custom activation function in which element 1 (counting from 0) of the output vector is forced to equal twice element 0. A very simple network with one layer was used for the training, and after training I checked that the condition was satisfied.


import torch
import matplotlib.pyplot as plt

# Define the custom activation function
# (the silu/SiLU names are kept from the template in the reference below;
#  the function simply forces output[:, 1] to equal 2 * output[:, 0])
# reference: https://towardsdatascience.com/extending-pytorch-with-custom-activation-functions-2d8b065ef2fa
def silu(input):
    output = input.clone()          # avoid modifying the layer output in place
    output[:, 1] = input[:, 0] * 2  # element 1 is twice element 0
    return output

class SiLU(torch.nn.Module):
    def __init__(self):
        super().__init__()  # init the base class

    def forward(self, input):
        return silu(input)  # apply the custom constraint defined above


# Training
# reference: https://pytorch.org/tutorials/beginner/pytorch_with_examples.html
k = 10
x = torch.rand([k,3])
y = x * 2
model = torch.nn.Sequential(
    torch.nn.Linear(3, 3),
    SiLU()  # custom activation function
)

loss_fn = torch.nn.MSELoss(reduction='sum')
learning_rate = 1e-3
for t in range(2000):
    y_pred = model(x)
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    model.zero_grad()
    loss.backward()

    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

# check the behaviour
yy = model(x)  # predicted
print('ground truth')
print(y)
print('predicted')
print(yy)


# plot the first five examples
colorlist = ['#e41a1c', '#377eb8', '#4daf4a', '#984ea3', '#ff7f00']
plt.figure()
for i in range(5):
    plt.plot(y[i, :].detach().numpy(), linestyle="solid", label="ground truth_" + str(i), color=colorlist[i])
    plt.plot(yy[i, :].detach().numpy(), linestyle="dotted", label="predicted_" + str(i), color=colorlist[i])
plt.legend()

# check if the custom activation works correctly
plt.figure()
plt.plot(yy[:,0].detach().numpy()*2, label = '0th * 2')
plt.plot(yy[:,1].detach().numpy(), label = '1th')
plt.legend()

print(yy[:, 0] * 2)
print(yy[:, 1])

plt.show()
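
Since the original question is about LSTM networks, here is an additional, untested sketch of how the same constraint module might be attached after an LSTM in PyTorch. The feature count, hidden size, and the final Linear head are assumptions for illustration only.

# Untested sketch: the custom constraint module placed after an LSTM.
# n_features, hidden_size and the Linear head are illustrative assumptions.
class ConstrainedLSTM(torch.nn.Module):
    def __init__(self, n_features=3, hidden_size=8):
        super().__init__()
        self.lstm = torch.nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = torch.nn.Linear(hidden_size, n_features)
        self.act = SiLU()  # the custom constraint module defined above

    def forward(self, x):
        # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        out = self.head(out[:, -1, :])  # predict from the last time step
        return self.act(out)            # enforce element 1 == 2 * element 0

seq = torch.rand(10, 5, 3)              # 10 sequences of length 5
pred = ConstrainedLSTM()(seq)
print(pred[:, 1] - 2 * pred[:, 0])      # should be (numerically) zero
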
hac81acnh
  • I copied the code in the second link and changed the custom function – hac81acnh Jul 11 '22 at 14:15
  • That custom function does not use other output nodes as input – Mr Frog Jul 11 '22 at 19:30
  • I tried to write it in Keras, but currently the backpropagation does not seem to work well. I guess you mean that you want to use the k-th element (before the activation) of the output vector at the L-th layer when calculating (i.e. activating) the j-th element of the same vector. Is that right? – hac81acnh Jul 19 '22 at 06:15
  • This may not be the exact answer you wanted, but PyTorch is good for customization. I tried to write such a network, though it is a very simple one, based on (1) https://pytorch.org/tutorials/beginner/pytorch_with_examples.html and (2) https://towardsdatascience.com/extending-pytorch-with-custom-activation-functions-2d8b065ef2fa – hac81acnh Jul 19 '22 at 06:16
  • I edited my original answer to add PyTorch code. It is not the Keras implementation you wanted most, but since you wrote "If not, is there any other interface that allows this?", I added it anyway. I hope I understood your requirement correctly and that this answer helps you. – hac81acnh Jul 19 '22 at 06:48
  • If the L-th layer means the output one, then yes. – Mr Frog Jul 19 '22 at 08:42
  • Does it allow the use of LSTM networks? – Mr Frog Jul 19 '22 at 08:45
  • I think so. It is used as an activation function. – hac81acnh Jul 19 '22 at 23:38