I have programmed a straightforward neural network with TensorFlow. The goal is to take two integers `a` and `b` and learn the function `f(a, b) = (a + b) % 67`. The problem is that the model achieves an accuracy of 1%, which is incredibly low. Code:
```python
# Imports.
from random import randint
import numpy as np
import tensorflow as tf
# Modulus function.
f = lambda a, b: (a + b) % 67
# Build two separate random lists of 150K integers between 0 and 1,000, both inclusive.
a = [randint(0, 1000) for _ in range(150_000)]
b = [randint(0, 1000) for _ in range(150_000)]
# Split the (a, b) pairs into 70% for training...
X_train = np.array(list(zip(a, b))[:int(len(a) * 0.7)])
y_train = np.array([f(a, b) for a, b in list(zip(a, b))[:int(len(a) * 0.7)]]).reshape(-1, 1)
# ... and 30% for testing purposes.
X_test = np.array(list(zip(a, b))[int(len(a) * 0.7):])
y_test = np.array([f(a, b) for a, b in list(zip(a, b))[int(len(a) * 0.7):]]).reshape(-1, 1)
# Build a 3-layer model.
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(128, activation='relu', input_shape=(2,)))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(1, activation='linear'))
# Compile it.
model.compile(optimizer='adam', loss='mae', metrics=['accuracy'])
# Fit the data as usual.
model.fit(X_train, y_train, epochs=10, batch_size=32)
# Try to predict the case f(13209248, 2319209312) = 21. The result is [[-3916619.]].
print(model.predict(np.array([[13209248, 2319209312]])))
# Show model's accuracy.
model.evaluate(X_test, y_test, verbose=1)
```
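For reference, this is how I read off the 1% figure; with the compile settings above, `evaluate` returns the loss followed by the metric:

```python
# evaluate() returns [loss, accuracy] given the compile settings above.
loss, acc = model.evaluate(X_test, y_test, verbose=1)
print(f'MAE: {loss:.2f}, accuracy: {acc:.2%}')  # accuracy comes out around 1%
```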
Question 1: From what I've seen of TensorFlow's definition of accuracy, a prediction only counts as correct if it matches the label exactly, so the accuracy drops even when predictions are quite close. Should this metric then be discarded for numerical, non-categorical outputs, or is there a way to adjust it?
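To make the question concrete, this is the kind of adjustment I have in mind, continuing from the script above (a hypothetical tolerance-based metric I wrote myself, not something from the TF docs):

```python
def within_one(y_true, y_pred):
    # Hypothetical metric: fraction of predictions within +/- 1 of the target.
    y_true = tf.cast(y_true, y_pred.dtype)
    return tf.reduce_mean(tf.cast(tf.abs(y_true - y_pred) <= 1.0, y_pred.dtype))

# Keras accepts custom metric functions of the form f(y_true, y_pred).
model.compile(optimizer='adam', loss='mae', metrics=[within_one])
```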
Question 2: What is going on here that keeps the model from fitting correctly? Are more layers/neurons needed? More epochs? I'm trying to replicate results obtained by Google's AI team: https://pair.withgoogle.com/explorables/grokking/. I tried their 24-neuron model, but it didn't work much better than this 128-128-1-neuron one.
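For context, below is roughly what I assume a closer replication of their setup would look like, treating the task as 67-way classification over one-hot inputs (the encoding details are my assumption, not taken from their page; it reuses `X_train` and `y_train` from the script above):

```python
num_classes = 67

# One-hot encode a and b modulo 67, then flatten the two encodings into one vector.
X_train_oh = tf.reshape(tf.one_hot(X_train % num_classes, depth=num_classes),
                        (-1, 2 * num_classes))

clf = tf.keras.models.Sequential([
    tf.keras.Input(shape=(2 * num_classes,)),
    tf.keras.layers.Dense(24, activation='relu'),       # their 24-neuron hidden layer
    tf.keras.layers.Dense(num_classes, activation='softmax'),
])
# Integer labels in [0, 67), so sparse categorical cross-entropy applies,
# and 'accuracy' is then a genuine classification accuracy.
clf.compile(optimizer='adam',
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy'])
clf.fit(X_train_oh, y_train, epochs=10, batch_size=32)
```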