I have a large CSV dataset and need to split it into a training set and a testing set, 77 % and 23 % respectively. Finally, I want to access each resulting file on my local machine.

1 Answer

Importing the required libraries

import math
import pandas as pd

The whole dataset

df = pd.read_csv('CTU.csv')
total_size = len(df)
train_size = math.floor(0.77 * total_size)

Training dataset and test dataset (the first 77 % of the rows and the remaining rows, in file order)

train = df.head(train_size)
test = df.tail(len(df) - train_size)
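
Note that head and tail keep the rows in file order, so if the CSV is sorted (for example by time or by label) the two parts may not look alike. If a random split is acceptable, the rows can be shuffled first. The following is only a minimal sketch: the function name shuffled_split, the seed value, and the small stand-in frame demo are illustrative assumptions, not part of the original code.

```python
import math
import pandas as pd


def shuffled_split(df, train_frac=0.77, seed=42):
    """Shuffle the rows, then cut them into a train part and a test part.

    `shuffled_split`, `train_frac` and `seed` are hypothetical names
    chosen for this sketch.
    """
    # frac=1 returns every row in random order; random_state makes it repeatable
    shuffled = df.sample(frac=1, random_state=seed)
    train_size = math.floor(train_frac * len(shuffled))
    return shuffled.iloc[:train_size], shuffled.iloc[train_size:]


# Small stand-in frame so the sketch runs without CTU.csv
demo = pd.DataFrame({"x": range(100), "y": [i % 2 for i in range(100)]})
train_demo, test_demo = shuffled_split(demo)
```

With the real data, the same call would be shuffled_split(df) on the frame read from CTU.csv.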

Saving files (index=False keeps the row index from being written as an extra column)

train.to_csv('train.csv', index=False)
test.to_csv('test.csv', index=False)
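
As for finding the files afterwards: a relative name such as 'train.csv' is written to the current working directory of the Python process. Below is a rough sketch of writing to an explicit folder, printing the full paths, and reading the files back. The temporary folder out_dir and the stand-in frame demo are assumptions made so the example is self-contained; in practice out_dir would be a folder of your choice.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in data; replace with the real train/test frames
demo = pd.DataFrame({"x": range(10)})
train_part, test_part = demo.iloc[:7], demo.iloc[7:]

# Hypothetical output folder (a temp directory here, for illustration only)
out_dir = Path(tempfile.mkdtemp())
train_path = out_dir / "train.csv"
test_path = out_dir / "test.csv"

# index=False avoids writing the row index as an extra column
train_part.to_csv(train_path, index=False)
test_part.to_csv(test_path, index=False)

# Absolute locations of the saved files on the local machine
print(train_path.resolve())
print(test_path.resolve())

# Read the files back to confirm they contain what was written
train_back = pd.read_csv(train_path)
test_back = pd.read_csv(test_path)
```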