How do I best make 80% train, 10% validation, and 10% test splits using train_test_split in Python? Is there a common way to visualize this split once created?

from sklearn.model_selection import train_test_split

# Splitting the data by a percentage
train_data, test_data = train_test_split(mid_prices, train_size=0.8, test_size=0.2, shuffle=False)
iceAtNight7
  • Does this answer your question? [How to split data into 3 sets (train, validation and test)?](https://stackoverflow.com/questions/38250710/how-to-split-data-into-3-sets-train-validation-and-test) – enzo Jun 20 '21 at 21:52
  • Thank you @enzo, it sort of answers my question, but I am still unsure. Using it, I came up with the following solution. Do you have any thoughts? `train_data, test_data = train_test_split(mid_prices, test_size=0.1, shuffle=False, random_state=42)` followed by `train_data, validation_data = train_test_split(train_data, test_size=0.111, shuffle=False, random_state=42)` # 0.111 x 0.9 = 0.0999 or 9.99% – iceAtNight7 Jun 20 '21 at 22:00

1 Answer

First, split the data 80/20: 80% for training, and the remaining 20% to be shared between validation and test.

train_data, rest_data = train_test_split(mid_prices, train_size=0.8, shuffle=False)

Now split the remaining data in half, which gives 10% of the original data for validation and 10% for test.

validation_data, test_data = train_test_split(rest_data, test_size=0.5, shuffle=False)
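
Putting the two steps together, a minimal end-to-end sketch might look like the following. The `mid_prices` array here is a synthetic stand-in (the question does not show the real data), and `plot_splits` is a hypothetical helper, not part of scikit-learn, that illustrates one possible way to visualize the split, assuming matplotlib is installed.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the question's mid_prices (an assumption, not real data).
mid_prices = np.arange(100, dtype=float)

# Step 1: the first 80% is used for training, the last 20% is held back.
# shuffle=False keeps the original (chronological) order.
train_data, rest_data = train_test_split(mid_prices, train_size=0.8, shuffle=False)

# Step 2: halve the held-back 20% into validation and test (10% each).
validation_data, test_data = train_test_split(rest_data, test_size=0.5, shuffle=False)

print(len(train_data), len(validation_data), len(test_data))


def plot_splits(train, validation, test):
    """Hypothetical helper: draw the three consecutive segments as separate lines."""
    # Imported lazily so the split itself only needs NumPy and scikit-learn.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    start = 0
    for name, part in (("train", train), ("validation", validation), ("test", test)):
        # Place each segment at its original position along the x-axis.
        x = np.arange(start, start + len(part))
        ax.plot(x, part, label=name)
        start += len(part)
    ax.set_xlabel("sample index")
    ax.set_ylabel("mid price")
    ax.legend()
    plt.show()
```

Calling `plot_splits(train_data, validation_data, test_data)` would then draw the three segments end to end, which makes it easy to check that validation and test come after the training range.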

Yogesh Bhandari