I was curious and did some experiments based on Daniel Möller's comment in this thread, using TensorFlow 2.0 with Keras:
**Update: Make the order not matter anymore**
To make the order not matter anymore, we need to remove the order information from our dataset. To do this, we first convert it to a one-hot encoding and then take the max() along the position axis, which squashes it back down to two dimensions:
x_no_order = tf.keras.utils.to_categorical(x)
This gives us a one-hot encoded array looking like this:
array([[[0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 1., 0., 0., 0.]],

       [[0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0.]],

       [[0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0.]],

       [[0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0.]],

       [[0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 0., 0., 1.]]], dtype=float32)
Taking the max() along axis 1 of that array then gives us, for each row, a vector that only knows which numbers occur, without any information about their position:
x_no_order.max(axis=1)
array([[0., 1., 1., 1., 0., 0., 0.],
       [0., 1., 1., 0., 1., 0., 0.],
       [0., 1., 0., 0., 1., 1., 0.],
       [0., 0., 0., 1., 1., 1., 0.],
       [0., 1., 0., 0., 0., 1., 1.]], dtype=float32)
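To actually train on this order-free representation, you can reuse the model from further down with an input shape of 7 instead of 3. A minimal sketch (the name model_no_order is mine; x and y_train are built as in the code below):

x_no_order = tf.keras.utils.to_categorical(x).max(axis=1)  # shape (5, 7): which numbers occur, order-free

input_layer = tf.keras.layers.Input(shape=(7,))  # 7 = one-hot depth, instead of 3 positions
dense_layer = tf.keras.layers.Dense(6)(input_layer)
dense_layer2 = tf.keras.layers.Dense(20)(dense_layer)
out_layer = tf.keras.layers.Dense(3, activation="softmax")(dense_layer2)
model_no_order = tf.keras.Model(inputs=[input_layer], outputs=[out_layer])
model_no_order.compile(optimizer="Nadam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model_no_order.fit(x_no_order, y_train, epochs=100)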
**First, create the dataframe and the training data**
That's a multiclass classification task, so I use the Tokenizer to encode the labels (there are surely better approaches, since it is really meant for text):
import tensorflow as tf
import numpy as np
import pandas as pd
df = pd.DataFrame({
"problems": [[1,2,3], [1,2,4], [1,4,5], [3,4,5], [1,5,6]],
"results": ["A", "A", "C", "C", "A"]
})
x = df['problems']
y = df['results']
tokenizer = tf.keras.preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(y)
y_train = tokenizer.texts_to_sequences(y)
x = np.array([np.array(i, dtype=np.int32) for i in x])
y_train = np.array(y_train, dtype=np.int32)
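For reference, the Tokenizer lowercases the labels and indexes them by frequency, so "A" becomes 1 and "C" becomes 2, with index 0 reserved as a placeholder; you can verify this like so:

print(tokenizer.word_index)  # {'a': 1, 'c': 2}
print(y_train.ravel())       # [1 1 2 2 1]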
**Then create the model**
input_layer = tf.keras.layers.Input(shape=(3,))
dense_layer = tf.keras.layers.Dense(6)(input_layer)
dense_layer2 = tf.keras.layers.Dense(20)(dense_layer)
out_layer = tf.keras.layers.Dense(3, activation="softmax")(dense_layer2)  # 3 outputs: placeholder index 0 plus "A" and "C"
model = tf.keras.Model(inputs=[input_layer], outputs=[out_layer])
model.compile(optimizer="Nadam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
**Train the model by fitting it**
hist = model.fit(x, y_train, epochs=100)
Then, based on Daniel's comment, you take the sequence you want to test and mask out certain values (by setting them to 0) to test their influence:
arr = np.reshape(np.array([1, 2, 3]), (1, 3))
print(model.predict(arr))
arr = np.reshape(np.array([0, 2, 3]), (1, 3))
print(model.predict(arr))
arr = np.reshape(np.array([1, 0, 3]), (1, 3))
print(model.predict(arr))
arr = np.reshape(np.array([1, 2, 0]), (1, 3))
print(model.predict(arr))
This will print the following result. Keep in mind that since the labels start at 1, the first value in each row is the placeholder for index 0, so the second value stands for "A" and the third for "C":
[[0.00441748 0.7981055 0.19747704]]
[[0.00103579 0.9863035 0.01266076]]
[[0.0031549 0.9953074 0.00153765]]
[[0.01631758 0.00633342 0.977349 ]]
There we can see that for the full sequence [1,2,3], "A" is correctly predicted with 0.7981. When we change the 3 in [1,2,3] to 0, giving [1,2,0], the model suddenly predicts "C", so the 3 in the third position has the biggest influence. Putting that into a function, you can run it over all the training data you have and build statistical metrics to analyze it further.
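A minimal sketch of such a function (the name mask_influence and the scoring are my own: it masks one position at a time and records how much the probability of the originally predicted class drops):

def mask_influence(model, sample):
    # Predict the unmasked sample and remember its winning class.
    base = model.predict(np.reshape(np.array(sample), (1, 3)))[0]
    base_class = base.argmax()
    drops = []
    # Mask each position with 0 and measure the probability drop.
    for pos in range(len(sample)):
        masked = np.array(sample, dtype=np.int32)
        masked[pos] = 0
        pred = model.predict(np.reshape(masked, (1, 3)))[0]
        drops.append(base[base_class] - pred[base_class])
    return base_class, drops

print(mask_influence(model, [1, 2, 3]))  # the biggest drop marks the most influential position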
This is just a very simple approach, but keep in mind that there is a big research field around this called sensitivity analysis. You might want to take a deeper look at that topic if you are interested.