I am trying to split my data into train and test sets. The data is in a Koalas DataFrame. However, when I run the code below, I get this error:
AttributeError: 'DataFrame' object has no attribute 'randomSplit'
This is the code I am using:
splits = Closed_new.randomSplit([0.7,0.3])
I also tried the usual approach of converting the Koalas DataFrame to pandas and splitting it with scikit-learn, but that takes a long time to run in Synapse. Here is the code:
from sklearn.model_selection import train_test_split

state = 12
test_size = 0.30
X_train, X_val, y_train, y_val = train_test_split(
    Closed_new, labels, test_size=test_size, random_state=state
)
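
For reference, one direction that might avoid both problems is to go through Spark rather than pandas: randomSplit is a method of Spark DataFrames, not Koalas ones, and a Koalas DataFrame can be converted with to_spark(). The sketch below is untested in Synapse and rests on some assumptions: the helper name split_koalas is made up for illustration, the seed of 12 just reuses the state value above, and to_koalas() is only available on Spark DataFrames once databricks.koalas has been imported.

```python
def split_koalas(kdf, weights=(0.7, 0.3), seed=12):
    """Hypothetical helper: split a Koalas DataFrame via Spark's randomSplit.

    Assumes `kdf` is a Koalas DataFrame, so databricks.koalas is already
    imported and Spark DataFrames therefore expose `to_koalas()`.
    """
    # Koalas -> Spark; the data stays distributed instead of being
    # collected to the driver as a pandas conversion would do.
    sdf = kdf.to_spark()

    # randomSplit returns one Spark DataFrame per weight. The split is
    # random, so the resulting sizes are only approximately 70/30.
    train_sdf, test_sdf = sdf.randomSplit(list(weights), seed=seed)

    # Spark -> Koalas, so the rest of the code can keep using the Koalas API.
    return train_sdf.to_koalas(), test_sdf.to_koalas()
```

With the DataFrame from above, this would be called as train, test = split_koalas(Closed_new). Note that to_spark() drops the Koalas index by default, so if the index matters it would have to be kept explicitly (to_spark accepts an index_col argument for this).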