Is there a way to set the seed for train_test_split in Python's scikit-learn? I have set the random_state parameter to an integer, but I still cannot reproduce the result.

Thanks in advance.

Bernando Purba

2 Answers

from sklearn.model_selection import train_test_split

x = list(range(10))
y = list(range(10))

# Split the same data three times with the same random_state.
for _ in range(3):
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4, random_state=11)
    print(x_train)  # identical output on every iteration

The above code produces the same x_train every time the data is split. The randomness may be coming from your DataFrame (for example, its row order changing between runs), not from train_test_split.
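To illustrate how the input itself can change the result, here is a minimal sketch. It assumes the data reaches train_test_split in a different order on a second run; reversing a NumPy array stands in for whatever reordering might happen upstream, and the variable names are only illustrative. With a fixed random_state the same positions are selected each time, so a reordered input yields different values.

```python
import numpy as np
from sklearn.model_selection import train_test_split

data = np.arange(10)

# Same input order and same random_state: the splits are identical.
train_a, test_a = train_test_split(data, test_size=0.4, random_state=11)
train_b, test_b = train_test_split(data, test_size=0.4, random_state=11)
same_order_matches = np.array_equal(train_a, train_b)

# Hypothetical upstream reordering: the same rows, but reversed.
reversed_data = data[::-1]
train_c, test_c = train_test_split(reversed_data, test_size=0.4, random_state=11)

# The same positions are picked, but they now hold different values.
reordered_matches = np.array_equal(train_a, train_c)

print(same_order_matches, reordered_matches)
```

If the data comes from a query, a shuffle, or a file listing without a guaranteed order, fixing that order (or seeding that step as well) is probably needed before the split becomes reproducible.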

secretive

Simply specify the random_state parameter in train_test_split with any integer you want to use, for example random_state=42.
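One related case that may be worth checking, sketched here under the assumption that a shared np.random.RandomState instance is passed instead of an integer: an integer seed gives the same split on every call, while a reused generator object is advanced by each call. The variable names are illustrative only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

data = np.arange(1000)

# An integer seed creates a fresh generator on every call: repeatable.
int_a, _ = train_test_split(data, test_size=0.4, random_state=42)
int_b, _ = train_test_split(data, test_size=0.4, random_state=42)
int_seed_repeats = np.array_equal(int_a, int_b)

# A shared RandomState instance is advanced by each call, so the
# second split generally differs from the first.
rng = np.random.RandomState(42)
rng_a, _ = train_test_split(data, test_size=0.4, random_state=rng)
rng_b, _ = train_test_split(data, test_size=0.4, random_state=rng)
shared_rng_repeats = np.array_equal(rng_a, rng_b)

print(int_seed_repeats, shared_rng_repeats)
```

Passing a plain integer is the simpler choice when the goal is the same split on every call and every run.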

Antoni