Python - What value should we use for random_state in train_test_split() and in which scenario?

Question

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)

In the code above, random_state is set to 0. Why aren't we using 1?

possible duplicate of https://stackoverflow.com/questions/42191717/python-random-state-in-splitting-dataset/42197534 and https://stackoverflow.com/questions/28064634/random-state-pseudo-random-numberin-scikit-learn — gireesh4manu, Jan 19 '19 at 05:56
The value of random_state does not significantly affect the predictions (the difference is negligible). It is provided only so that the results can be reproduced later, or on a different system or environment. It is just a seed: if you use random_state=50 and then use random_state=50 again seven days later, you will get exactly the same split (even on a different environment or system). — Ashu Grover, Jan 19 '19 at 05:59
Possible duplicate of [Python random state in splitting dataset](https://stackoverflow.com/questions/42191717/python-random-state-in-splitting-dataset) — desertnaut, Jan 19 '19 at 15:12

score 2 · Answer 1 · answered Jan 19 '19 at 10:27

Neither 0 nor 1 has any special meaning as a value for random_state. This parameter sets the seed used by the random number generator, so with any fixed integer the split is still random, but it is exactly the same on every call.
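To make that concrete, here is a minimal sketch. The toy arrays X and y are made up for illustration; only train_test_split and its test_size and random_state parameters come from the question.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data, assumed for illustration: 50 samples, 2 features.
X = np.arange(100).reshape(50, 2)
y = np.arange(50)

# Same seed, two separate calls: the split is identical.
_, _, _, y_test_a = train_test_split(X, y, test_size=0.20, random_state=0)
_, _, _, y_test_b = train_test_split(X, y, test_size=0.20, random_state=0)
same_seed_matches = np.array_equal(y_test_a, y_test_b)

# A different seed gives a different, equally valid split.
_, _, _, y_test_c = train_test_split(X, y, test_size=0.20, random_state=1)
different_seed_matches = np.array_equal(y_test_a, y_test_c)

print("seed 0 vs seed 0 identical:", same_seed_matches)
print("seed 0 vs seed 1 identical:", different_seed_matches)
```

Both 0 and 1 produce a shuffled 80/20 split; the integer only decides which shuffle you get.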

This is generally used for reproducibility, but you shouldn't rely on random_state being a particular value.
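One way to check that your results do not hinge on a single seed is to repeat the split with a few different seeds and look at the spread. The sketch below is only an illustration: the iris dataset, the LogisticRegression model and the choice of five seeds are my assumptions, not part of the question.

```python
from statistics import mean, stdev

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Example dataset bundled with scikit-learn (an assumption for this sketch).
X, y = load_iris(return_X_y=True)

scores = []
for seed in range(5):  # five arbitrary seeds
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.20, random_state=seed
    )
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    scores.append(model.score(X_test, y_test))  # accuracy on the held-out 20%

print(f"accuracy: mean={mean(scores):.3f}, stdev={stdev(scores):.3f}")
```

If the score moves a lot from seed to seed, that says more about the size of the test set than about any one seed being "better".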

If you set random_state to None, you get a different random split each time you call train_test_split.
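A quick sketch of the None case, again with made-up toy data. As far as I know, None makes scikit-learn fall back to NumPy's global random state, so this assumes nothing else in the program has fixed that state with np.random.seed.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data, assumed for illustration: 1000 samples, 2 features.
X = np.arange(2000).reshape(1000, 2)
y = np.arange(1000)

# No fixed seed: each call draws a fresh shuffle.
_, _, _, y_test_first = train_test_split(X, y, test_size=0.20, random_state=None)
_, _, _, y_test_second = train_test_split(X, y, test_size=0.20, random_state=None)

splits_differ = not np.array_equal(y_test_first, y_test_second)
print("two unseeded calls differ:", splits_differ)
```

In principle two unseeded calls could coincide, but with 1000 samples the chance is negligible.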
