
I'm training a supervised Keras model to categorize data into one of three categories. After training, I run this:

import numpy as np
import pandas
import sklearn.metrics
import tensorflow as tf
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.utils import to_categorical

dataset = pandas.read_csv(filename, header=[0], encoding='utf-8-sig', sep=',')

# split X and Y (last column)
array = dataset.values
columns = array.shape[1] - 1
np.random.shuffle(array)
x_orig = array[:, 1:columns]
testy = array[:, columns]
columns -= 1

# normalize data
scaler = StandardScaler()
testx = scaler.fit_transform(x_orig)

# onehot
testy = to_categorical(testy)

# load weights
save_path = "[filepath]"
model = tf.keras.models.load_model(save_path)

# gets class breakdown
y_pred = model.predict(testx, verbose=1)
y_pred_bool = np.argmax(y_pred, axis=1)
y_true = np.argmax(testy, axis=1)
print(sklearn.metrics.precision_recall_fscore_support(y_true, y_pred_bool))

sklearn.metrics.precision_recall_fscore_support prints, among other metrics, the support for each class. Per the documentation (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_fscore_support.html), support is the number of occurrences of each class in y_true, i.e. the true labels.
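
To illustrate that definition, here is a tiny sketch with made-up labels (the y_pred values are irrelevant to support):

import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# toy 3-class example; labels are invented purely for illustration
y_true = np.array([0, 0, 1, 1, 1, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2])

precision, recall, fscore, support = precision_recall_fscore_support(y_true, y_pred)
print(support)  # [2 3 1] -- the count of each class in y_true, independent of y_pred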

My problem: the support values are different on every run. I'm using the same data, and the support across the classes always adds up to the same total (though that total differs from the total in the file, which I also don't understand), but the count per class changes.

As an example, one run might say [16870, 16299, 7807] and the next might say [17169, 15923, 7884]. They add up the same, but each class differs.

Since my data isn't changing between runs, I'd expect support to be identical every time. Am I wrong? If not, what's going on? I've tried googling, but didn't get any useful results.
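
As a sanity check of that expectation, shuffling on its own should not change per-class counts; here is a quick sketch with a stand-in label array rather than my real data:

import numpy as np

# shuffling reorders elements but cannot change how many of each label there are
labels = np.array([0, 1, 1, 2, 0, 1, 2, 2, 2])
print(np.unique(labels, return_counts=True))   # (array([0, 1, 2]), array([2, 3, 4]))

np.random.shuffle(labels)
print(np.unique(labels, return_counts=True))   # same counts, different order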

Potentially useful information: when I run sklearn.metrics.classification_report, I have the same issue, and the numbers from that match the numbers from precision_recall_fscore_support.

Sidenote: this is unrelated to the question above, but I couldn't google-fu an answer to it either, so I hope it's OK to include here. When I run model.evaluate, part of the printout is e.g. 74us/sample. What does us/sample mean?

tiffanie
  • To answer your sidenote: 74us/sample means that it takes 74 microseconds (us) to process one sample of your data – BStadlbauer Feb 12 '20 at 20:52
  • 1
    Does this answer your question? [Random state (Pseudo-random number) in Scikit learn](https://stackoverflow.com/questions/28064634/random-state-pseudo-random-number-in-scikit-learn) – Edeki Okoh Feb 12 '20 at 20:53
  • @BStadlbauer Thanks! – tiffanie Feb 12 '20 at 20:58
  • @Edeki Okoh I don't think so. I use neither train_test_split nor random_state, and random_state isn't an option in precision_recall_fscore_support. – tiffanie Feb 12 '20 at 20:58
  • Well, are you splitting your data the same way each time? If you are getting different PRFS scores each time, you must be training on the data differently each time, which random state solves – Edeki Okoh Feb 12 '20 at 21:00
  • 1
    Also, for the question you ask, could you provide a minimum example? Because it seems a bit strange for the support to be different if `y_true` is the same every time – BStadlbauer Feb 12 '20 at 21:01
  • @EdekiOkoh I'm training using code in one file then saving the model, then in another file I'm loading the saved model and testing. The only data processing happening in the 2nd file is splitting data and labels (X and Y), shuffling, and normalizing. The process is the same every time, and this problem happens even when I don't touch the code at all between runs. – tiffanie Feb 12 '20 at 21:04
  • What do you mean by minimum example? Some of my test data? – tiffanie Feb 12 '20 at 21:04
  • 1
    And obviously it is not working as you are saying or it would be giving you the same answer. However no one can help you if we cannot see your code. Please read [mcve]. For example you say "shuffling" which means that you are changing the data that gets read into the model each time. This is most likely why you are getting different answers. – Edeki Okoh Feb 12 '20 at 21:08
  • @EdekiOkoh Thanks for clarifying minimum example. I've included my total code. The shuffling happens after I read all the data in. Removing it seems to have removed the problem. Why would shuffling cause it when it should be the same data just in a different order? – tiffanie Feb 12 '20 at 21:14

1 Answer


Add:

np.random.seed(42)

before you shuffle the array at

np.random.shuffle(array)

The reason for this is that, without seeding, np.random.shuffle produces a different ordering each time, so the array you feed into the model differs from run to run. Seeding makes the shuffle come out the same on every run, which gives you reproducible results.

Alternatively, you can skip the shuffle and feed the same, unshuffled array into the model each time. Either approach will make the results reproducible.
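
A minimal sketch of the idea, using a stand-in array rather than your data:

import numpy as np

# seeding NumPy's global RNG makes the shuffle order repeatable across runs
data = np.arange(10)

np.random.seed(42)
np.random.shuffle(data)
print(data)  # the same permutation every time the script runs

If you prefer not to touch the global state, NumPy's Generator API does the same thing with a local, seeded generator (e.g. rng = np.random.default_rng(42); rng.shuffle(data)).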

Edeki Okoh
  • Could you explain please why what I have doesn't work? It's the exact same data, just shuffled, so I would think the count for each class would be the same each time. I understand (now) how seed will make the shuffle the same for each run, thanks. – tiffanie Feb 12 '20 at 21:21
  • 1
    The short answer is because shuffle randomly chooses 2 elements to exchange, so you feed different data to the model each time so the model learns differently. The long answer is ["Black Box."](https://www.thelancet.com/journals/lanres/article/PIIS2213-2600(18)30425-9/fulltext) – Edeki Okoh Feb 12 '20 at 21:25