
I am trying to design a neural network that predicts the smooth underlying function from a dataset of arrays with Gaussian noise added. I have created a combined training and test set of 10000 arrays. When I try to predict the array values of the actual function, the model seems to fail, and the accuracy isn't good either. Can someone guide me on how to improve my model so it achieves better accuracy and produces good predictions? My code is below:

For generating the test and training data:

import numpy as np
from tqdm import tqdm

noisy_data = []
pure_data = []
time = np.arange(1, 100)
for i in tqdm(range(10000)):
    array = []
    noise = np.random.normal(0, 1/10, 99)
    for j in range(1, 100):
        array.append(np.log(j))
    array = np.array(array)
    pure_data.append(array)
    noisy_data.append(array + noise)
    

pure_data = np.array(pure_data)
noisy_data = np.array(noisy_data)
    
print(noisy_data.shape)
print(pure_data.shape)

training_size = 6000


x_train = noisy_data[:training_size]
y_train = pure_data[:training_size]
x_test = noisy_data[training_size:]
y_test = pure_data[training_size:]
print(x_train.shape)

My model:

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(99,)))
model.add(tf.keras.layers.Dense(768, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(768, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(99, activation=tf.nn.softmax))

model.compile(optimizer = 'adam',
              loss = 'categorical_crossentropy',
              metrics = ['accuracy'])

model.fit(x_train, y_train, epochs = 20)

The output, showing the bad accuracy:

Epoch 1/20
125/125 [==============================] - 2s 16ms/step - loss: 947533.1875 - accuracy: 0.0000e+00
Epoch 2/20
125/125 [==============================] - 2s 15ms/step - loss: 9756863.0000 - accuracy: 0.0000e+00
Epoch 3/20
125/125 [==============================] - 2s 16ms/step - loss: 30837548.0000 - accuracy: 0.0000e+00
Epoch 4/20
125/125 [==============================] - 2s 15ms/step - loss: 63707028.0000 - accuracy: 0.0000e+00
Epoch 5/20
125/125 [==============================] - 2s 16ms/step - loss: 107545128.0000 - accuracy: 0.0000e+00
Epoch 6/20
125/125 [==============================] - 1s 12ms/step - loss: 161612192.0000 - accuracy: 0.0000e+00
Epoch 7/20
125/125 [==============================] - 1s 12ms/step - loss: 225245360.0000 - accuracy: 0.0000e+00
Epoch 8/20
125/125 [==============================] - 1s 12ms/step - loss: 297850816.0000 - accuracy: 0.0000e+00
Epoch 9/20
125/125 [==============================] - 1s 12ms/step - loss: 378894176.0000 - accuracy: 0.0000e+00
Epoch 10/20
125/125 [==============================] - 1s 12ms/step - loss: 467893216.0000 - accuracy: 0.0000e+00
Epoch 11/20
125/125 [==============================] - 2s 17ms/step - loss: 564412672.0000 - accuracy: 0.0000e+00
Epoch 12/20
125/125 [==============================] - 2s 15ms/step - loss: 668056384.0000 - accuracy: 0.0000e+00
Epoch 13/20
125/125 [==============================] - 2s 13ms/step - loss: 778468480.0000 - accuracy: 0.0000e+00
Epoch 14/20
125/125 [==============================] - 2s 18ms/step - loss: 895323840.0000 - accuracy: 0.0000e+00
Epoch 15/20
125/125 [==============================] - 2s 13ms/step - loss: 1018332672.0000 - accuracy: 0.0000e+00
Epoch 16/20
125/125 [==============================] - 1s 11ms/step - loss: 1147227136.0000 - accuracy: 0.0000e+00
Epoch 17/20
125/125 [==============================] - 2s 12ms/step - loss: 1281768448.0000 - accuracy: 0.0000e+00
Epoch 18/20
125/125 [==============================] - 2s 14ms/step - loss: 1421732608.0000 - accuracy: 0.0000e+00
Epoch 19/20
125/125 [==============================] - 1s 11ms/step - loss: 1566927744.0000 - accuracy: 0.0000e+00
Epoch 20/20
125/125 [==============================] - 1s 10ms/step - loss: 1717172480.0000 - accuracy: 0.0000e+00

and the prediction code I use:

model.predict([noisy_data[0]])

This throws the error:

WARNING:tensorflow:Model was constructed with shape (None, 99) for input Tensor("flatten_5_input:0", shape=(None, 99), dtype=float32), but it was called on an input with incompatible shape (None, 1).


ValueError: Input 0 of layer dense_15 is incompatible with the layer: expected axis -1 of input shape to have value 99 but received input with shape [None, 1]
  • Thanks for the comments. I have implemented the solutions; however, before deciding on the best one I will wait a bit, since some solutions worked on specific further example problems of a similar nature, whereas others performed stably and well in all cases. So I am testing a bit before choosing a solution. – user142756 Dec 17 '20 at 18:13

3 Answers


What you are trying to build is called a denoising autoencoder. The goal is to reconstruct a noise-free sample: noise is artificially introduced into a dataset, the noisy version is fed to an encoder, and a decoder then tries to regenerate the sample without the noise.


This can be done with any form of data, including images and text.

I would recommend reading more about this. There are various concepts that ensure proper training of the model, including understanding the requirement of a bottleneck in the middle: without enough compression and information loss, the model just learns to multiply by 1 and returns its input as the output.

Here is a sample piece of code. You can read more about this type of architecture here, written by the author of Keras himself.

from tensorflow.keras import layers, Model, utils, optimizers

# Encoder
enc = layers.Input((99,))
x = layers.Dense(128, activation='relu')(enc)
x = layers.Dense(56, activation='relu')(x)
x = layers.Dense(8, activation='relu')(x)  # compression happens here

# Decoder
x = layers.Dense(8, activation='relu')(x)
x = layers.Dense(56, activation='relu')(x)
x = layers.Dense(28, activation='relu')(x)
dec = layers.Dense(99)(x)

model = Model(enc, dec)

opt = optimizers.Adam(learning_rate=0.01)

model.compile(optimizer=opt, loss='MSE')

model.fit(x_train, y_train, epochs = 20)
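
Once trained, denoising a sample is just a forward pass. A minimal usage sketch, reusing the noisy_data/pure_data arrays from the question (the error check is only illustrative):

# predict() expects a batch, hence the reshape to (1, 99)
sample = noisy_data[0].reshape(1, -1)
denoised = model.predict(sample)[0]          # back to shape (99,)

# compare against the known clean curve
print(abs(denoised - pure_data[0]).max())    # worst-case reconstruction error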

Please be aware that autoencoders assume the input data has some underlying structure and can therefore be compressed into a lower-dimensional space, which the decoder uses to regenerate the data. Using randomly generated sequences as data may not show good results, because compressing them entails massive information loss when there is no structure to exploit.

As most of the other answers suggest, you are not using the activations properly. Since the goal is to regenerate a 99-dimensional vector with continuous values, it makes sense NOT to use sigmoid, which gates values into (0, 1); instead, work with tanh, which compresses into (-1, 1), or use no final-layer activation at all.
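
Concretely, for the final layer that means one of the following (a sketch; note that the log targets here exceed 1, so the tanh option would also require rescaling the targets into (-1, 1)):

# option 1: linear (no activation) - outputs any real value, fits the log targets directly
dec = layers.Dense(99)(x)

# option 2: tanh - outputs in (-1, 1); only valid if the targets are rescaled to that range
dec = layers.Dense(99, activation='tanh')(x)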


Here is a denoising autoencoder with Conv1D and Conv1DTranspose layers. The issue here is that the input is too simple; see if you can generate more complex parametric functions for the input data (a sketch of one way to do that follows at the end of this answer).

from tensorflow.keras import layers, Model, utils, optimizers

# Encoder with Conv1D
inp = layers.Input((99,))
x = layers.Reshape((99, 1))(inp)
x = layers.Conv1D(5, 10)(x)
x = layers.MaxPool1D(10)(x)
x = layers.Flatten()(x)
x = layers.Dense(4, activation='relu')(x)  # <- bottleneck!

# Decoder with Conv1DTranspose
x = layers.Reshape((-1, 1))(x)
x = layers.Conv1DTranspose(5, 10)(x)
x = layers.Conv1DTranspose(2, 10)(x)
x = layers.Flatten()(x)
out = layers.Dense(99)(x)

model = Model(inp, out)

opt = optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt, loss='MSE')
model.fit(x_train, y_train, epochs = 10, validation_data=(x_test, y_test))
Epoch 1/10
188/188 [==============================] - 1s 7ms/step - loss: 2.1205 - val_loss: 0.0031
Epoch 2/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0032 - val_loss: 0.0032
Epoch 3/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0032 - val_loss: 0.0030
Epoch 4/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0031 - val_loss: 0.0029
Epoch 5/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0030 - val_loss: 0.0030
Epoch 6/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0029 - val_loss: 0.0027
Epoch 7/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0028 - val_loss: 0.0029
Epoch 8/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0028 - val_loss: 0.0025
Epoch 9/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0028 - val_loss: 0.0025
Epoch 10/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0026 - val_loss: 0.0024
utils.plot_model(model, show_layer_names=False, show_shapes=True)

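As promised above, here is a minimal sketch of richer input data; the a*log(t) + b*sin(c*t) family and its parameter ranges are just an illustrative choice, not something from the original post:

import numpy as np

rng = np.random.default_rng(0)
t = np.arange(1, 100)

pure_data, noisy_data = [], []
for _ in range(10000):
    # random parameters give every sample a different underlying curve
    a, b, c = rng.uniform(0.5, 2.0), rng.uniform(0.0, 1.0), rng.uniform(0.1, 0.5)
    curve = a * np.log(t) + b * np.sin(c * t)
    pure_data.append(curve)
    noisy_data.append(curve + rng.normal(0, 0.1, t.size))

pure_data = np.array(pure_data)    # shape (10000, 99)
noisy_data = np.array(noisy_data)  # shape (10000, 99)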

Akshay Sehgal

To make the model work, you need to make some changes:

  1. First of all, your problem is a regression problem, not a classification problem, so you need to change the loss from cross-entropy to mean squared error (MSE).

  2. Then you need to change your last layer to output raw values.

EDIT: On second thought, because I misread the kind of input: as suggested by @desertnaut, it is better to use the raw output of the model.

Anyway, it is better to use an autoencoder, as suggested by @AkshaySehgal, because that way you force the denoising, making the net learn the underlying function in the compressed space.


model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(99,)))
model.add(tf.keras.layers.Dense(768, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(768, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(99))

model.compile(optimizer = 'adam',
              loss = 'mean_squared_error',
              metrics = ['mse'])

model.fit(x_train, y_train, epochs = 20)

output:

Epoch 1/20
188/188 [==============================] - 2s 9ms/step - loss: 28.7281 - mse: 28.7281
Epoch 2/20
188/188 [==============================] - 2s 9ms/step - loss: 1.6866 - mse: 1.6866
Epoch 3/20
188/188 [==============================] - 2s 9ms/step - loss: 0.5031 - mse: 0.5031
Epoch 4/20
188/188 [==============================] - 2s 9ms/step - loss: 0.3126 - mse: 0.3126
Epoch 5/20
188/188 [==============================] - 2s 9ms/step - loss: 0.2186 - mse: 0.2186
Epoch 6/20
188/188 [==============================] - 2s 9ms/step - loss: 0.1420 - mse: 0.1420
Epoch 7/20
188/188 [==============================] - 2s 9ms/step - loss: 0.1334 - mse: 0.1334
Epoch 8/20
188/188 [==============================] - 2s 9ms/step - loss: 0.1193 - mse: 0.1193
Epoch 9/20
188/188 [==============================] - 2s 9ms/step - loss: 0.1174 - mse: 0.1174
Epoch 10/20
188/188 [==============================] - 2s 9ms/step - loss: 0.0813 - mse: 0.0813
Epoch 11/20
188/188 [==============================] - 2s 9ms/step - loss: 0.0334 - mse: 0.0334
Epoch 12/20
188/188 [==============================] - 2s 9ms/step - loss: 0.0592 - mse: 0.0592
Epoch 13/20
188/188 [==============================] - 2s 9ms/step - loss: 0.0162 - mse: 0.0162
Epoch 14/20
188/188 [==============================] - 2s 9ms/step - loss: 0.0255 - mse: 0.0255
Epoch 15/20
188/188 [==============================] - 2s 9ms/step - loss: 0.0208 - mse: 0.0208
Epoch 16/20
188/188 [==============================] - 2s 9ms/step - loss: 0.0365 - mse: 0.0365
Epoch 17/20
188/188 [==============================] - 2s 9ms/step - loss: 0.0236 - mse: 0.0236
Epoch 18/20
188/188 [==============================] - 2s 9ms/step - loss: 0.0155 - mse: 0.0155
Epoch 19/20
188/188 [==============================] - 2s 9ms/step - loss: 0.0204 - mse: 0.0204
Epoch 20/20
188/188 [==============================] - 2s 9ms/step - loss: 0.0145 - mse: 0.0145

<tensorflow.python.keras.callbacks.History at 0x7f60d19256d8>
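
To check that the network generalizes beyond the training half, it can be evaluated on the held-out arrays; a sketch reusing the x_test/y_test split from the question:

# returns [loss, mse] because both were specified at compile time
test_loss, test_mse = model.evaluate(x_test, y_test, verbose=0)
print(f"test MSE: {test_mse:.4f}")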

If you need it, I also built the model in Keras using Colab; you can check the results directly in my notebook.

Nikaido
  • Including `metrics = ['mse']` in the model compilation is redundant; and why the sigmoid in the last layer? – desertnaut Dec 09 '20 at 16:42
  • Also, apart from the sigmoid, the x*100 doesn't make sense either. – Akshay Sehgal Dec 09 '20 at 16:44
  • Here is the problem with your architecture. If you are trying to regenerate the input like this, the model will just learn to `multiply by 1`: `input*1=output`. You NEED to introduce a bottleneck where the input is compressed enough, and then regenerate it to its original shape; that is what will make the model actually learn to regenerate the sequence without noise. – Akshay Sehgal Dec 09 '20 at 16:47
  • @AkshaySehgal I guess you are right. It is true that this way the model is more likely not to learn the underlying function well, but it is a rough start, which is what the OP was asking for. Forcing the model to compress the data will force the net to learn the underlying function better. – Nikaido Dec 09 '20 at 16:54
  • Exactly; also, since the OP is using randomly generated inputs, I don't think there would be an underlying function to begin with. So I guess this is just a toy example the OP is working with. – Akshay Sehgal Dec 09 '20 at 16:55
  • @AkshaySehgal Yes, I agree completely. I am going to upvote your answer, which is accurate and makes more sense to me :) – Nikaido Dec 09 '20 at 16:58
  • Thanks for the credit regarding the output values; I am puzzled as to why, while we propose practically the same things, your answer is at +2 while mine is at -2... – desertnaut Jan 06 '21 at 12:01
  • @desertnaut I don't use downvotes, practically never! You can check my stats :) anyway, I'll upvote it, because I don't think it deserves downvotes – Nikaido Jan 06 '21 at 16:05
  • @Nikaido thank you, but I meant it in general; in no way was I implying that you downvoted my post, and sincere apologies if I gave that impression. – desertnaut Jan 06 '21 at 16:11
  • @desertnaut no need for apologies, there was a misunderstanding on my side :) keep up the good work – Nikaido Jan 06 '21 at 16:59

Looking at your y data:

y_train[0]

array([0.        , 0.69314718, 1.09861229, 1.38629436, 1.60943791,
       1.79175947, 1.94591015, 2.07944154, 2.19722458, 2.30258509,
       2.39789527, 2.48490665, 2.56494936, 2.63905733, 2.7080502 ,
       2.77258872, 2.83321334, 2.89037176, 2.94443898, 2.99573227,
       3.04452244, 3.09104245, 3.13549422, 3.17805383, 3.21887582,
       3.25809654, 3.29583687, 3.33220451, 3.36729583, 3.40119738,
       3.4339872 , 3.4657359 , 3.49650756, 3.52636052, 3.55534806,
       3.58351894, 3.61091791, 3.63758616, 3.66356165, 3.68887945,
       3.71357207, 3.73766962, 3.76120012, 3.78418963, 3.80666249,
       3.8286414 , 3.8501476 , 3.87120101, 3.8918203 , 3.91202301,
       3.93182563, 3.95124372, 3.97029191, 3.98898405, 4.00733319,
       4.02535169, 4.04305127, 4.06044301, 4.07753744, 4.09434456,
       4.11087386, 4.12713439, 4.14313473, 4.15888308, 4.17438727,
       4.18965474, 4.20469262, 4.21950771, 4.2341065 , 4.24849524,
       4.26267988, 4.27666612, 4.29045944, 4.30406509, 4.31748811,
       4.33073334, 4.34380542, 4.35670883, 4.36944785, 4.38202663,
       4.39444915, 4.40671925, 4.41884061, 4.4308168 , 4.44265126,
       4.4543473 , 4.46590812, 4.47733681, 4.48863637, 4.49980967,
       4.51085951, 4.52178858, 4.53259949, 4.54329478, 4.55387689,
       4.56434819, 4.57471098, 4.58496748, 4.59511985])

it would seem that you are in a regression setting, and not a classification one.

So, you need to change the last layer of your model to

model.add(tf.keras.layers.Dense(99)) # default linear activation

and compile it as

model.compile(optimizer = 'adam', loss = 'mse') 

(notice that accuracy is meaningless in regression problems).

With these changes, fitting your model for 5 epochs now gives reasonable loss values:

model.fit(x_train, y_train, epochs = 5)

Epoch 1/5
188/188 [==============================] - 0s 2ms/step - loss: 0.2120
Epoch 2/5
188/188 [==============================] - 0s 2ms/step - loss: 4.0999e-04
Epoch 3/5
188/188 [==============================] - 0s 2ms/step - loss: 4.1783e-04
Epoch 4/5
188/188 [==============================] - 0s 2ms/step - loss: 4.2255e-04
Epoch 5/5
188/188 [==============================] - 0s 2ms/step - loss: 4.9760e-04

and it certainly seems you don't need 20 epochs.
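
Rather than hand-tuning the epoch count, one option is to let Keras stop automatically once the loss stops improving; a minimal sketch with the standard EarlyStopping callback (assuming the same `import tensorflow as tf` as the question):

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='loss',              # no validation split here, so monitor the training loss
    patience=2,                  # stop after 2 epochs without improvement
    restore_best_weights=True)

model.fit(x_train, y_train, epochs=20, callbacks=[early_stop])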

For predicting single values, you need to reshape them as follows:

model.predict(np.array(noisy_data[0]).reshape(1,-1))

# result:

array([[-0.02887887,  0.67635924,  1.1042297 ,  1.4030693 ,  1.5970025 ,
         1.8026372 ,  1.9588575 ,  2.0648997 ,  2.202754  ,  2.3088624 ,
         2.400107  ,  2.4935524 ,  2.560785  ,  2.658005  ,  2.714249  ,
         2.7735658 ,  2.8429594 ,  2.8860366 ,  2.9135942 ,  2.991392  ,
         3.0119512 ,  3.1059306 ,  3.1467025 ,  3.1484323 ,  3.2273414 ,
         3.2722526 ,  3.2814353 ,  3.3600745 ,  3.3591018 ,  3.3908122 ,
         3.4431438 ,  3.4897916 ,  3.5229044 ,  3.542718  ,  3.5617661 ,
         3.5660467 ,  3.622283  ,  3.614976  ,  3.6565022 ,  3.6963918 ,
         3.7061958 ,  3.7615037 ,  3.7564514 ,  3.7682133 ,  3.8250954 ,
         3.831929  ,  3.86098   ,  3.8959084 ,  3.8967183 ,  3.9016035 ,
         3.9568343 ,  3.9597993 ,  4.0028276 ,  3.9931173 ,  3.9887471 ,
         4.0221996 ,  4.021959  ,  4.048805  ,  4.069759  ,  4.104507  ,
         4.1473804 ,  4.167117  ,  4.1388593 ,  4.148655  ,  4.175832  ,
         4.1865892 ,  4.2039223 ,  4.2558513 ,  4.237947  ,  4.257041  ,
         4.2507076 ,  4.2826586 ,  4.2916007 ,  4.2920256 ,  4.304987  ,
         4.3153067 ,  4.3575797 ,  4.347109  ,  4.3662906 ,  4.396843  ,
         4.36556   ,  4.3965526 ,  4.421436  ,  4.433974  ,  4.424191  ,
         4.4379086 ,  4.442377  ,  4.4937015 ,  4.468969  ,  4.506153  ,
         4.515915  ,  4.524729  ,  4.53225   ,  4.5434146 ,  4.561402  ,
         4.582401  ,  4.5856013 ,  4.544302  ,  4.6128435 ]],
      dtype=float32)
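
Since the clean target for this sample is known (pure_data[0] is just np.log of 1..99), the reconstruction can also be checked visually; a quick illustrative sketch:

import matplotlib.pyplot as plt

pred = model.predict(np.array(noisy_data[0]).reshape(1, -1))[0]

plt.plot(noisy_data[0], alpha=0.4, label='noisy input')
plt.plot(pure_data[0], label='true log curve')
plt.plot(pred, '--', label='model prediction')
plt.legend()
plt.show()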
desertnaut
  • I believe it just needs a single epoch because the model is learning to multiply the input by 1, not to regenerate the inputs without noise. Anyway, your suggestions are on point. – Akshay Sehgal Dec 09 '20 at 17:07