I am building a Python application in which I want to forecast PM2.5 values over a month. I am using polynomial regression, and I split the data into 70% training data and 30% test data. I am using this line of code to split the data:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42, shuffle=True)
But I have noticed that if I give random_state different integers, the mean squared error changes, and so does the accuracy of the forecast. How can I find the optimal parameters for train_test_split so that the forecast is as accurate as possible?
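Here is a minimal sketch that reproduces the behavior. It is not my real pipeline: the synthetic data, the polynomial degree of 2, the list of seeds, and the helper name mse_for_seed are all assumptions made for illustration. Only the train_test_split call matches the line above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the PM2.5 data (assumption, not the real dataset):
# a noisy quadratic in a single feature.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 0.5 * X[:, 0] ** 2 - 2.0 * X[:, 0] + rng.normal(0, 3.0, size=200)


def mse_for_seed(seed):
    """Split with the given random_state, fit, and return the test MSE."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, shuffle=True
    )
    # Polynomial regression: polynomial features followed by least squares.
    # degree=2 is an assumed value.
    model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
    model.fit(X_train, y_train)
    return mean_squared_error(y_test, model.predict(X_test))


# Only random_state changes between runs; data and model stay the same.
seeds = [0, 1, 42, 123, 2024]
mse_by_seed = {seed: mse_for_seed(seed) for seed in seeds}
for seed, mse in mse_by_seed.items():
    print(f"random_state={seed}: MSE={mse:.3f}")
```

Each seed puts a different 30% of the rows into the test set, so the printed MSE changes from run to run even though nothing else does.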