Given below is pseudo data representing my training data.
I am fitting a random forest for binary classification in R.
library(randomForest)
rf <- randomForest(Default ~ ., data = traindata, ntree = 300, mtry = 18, importance = TRUE)
I want the model to work at the level of an individual personid.
For example, personid 112 should get a single prediction of either 1 or 0.
Right now my model takes in the entire data set and gives a separate prediction for each month. I want predictions per personid instead.
That is, a single prediction for each id, not one per month.
There are 265 distinct personid values in total.
Will using group_by() from the dplyr package help me?
Also, since the number of personid values is large, how would I predict on new data?
Constraint: I cannot average the data to flatten it out, as this is financial data.
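
To make the group_by() idea concrete, here is a minimal sketch of the kind of output I am after. It is only an illustration under assumptions: the data frame `monthly_pred`, its `month` and `prob` columns, and all the values in it are made up, and the majority-vote rule is just one possible way to collapse months into one label. It aggregates the model's month-level predictions, not the input data, so nothing in the financial data itself is averaged. In practice `prob` might come from something like `predict(rf, newdata, type = "prob")`, assuming Default is a factor.

```r
library(dplyr)

# Hypothetical month-level predictions (made-up values, not my real data):
# one row per personid per month, holding the predicted probability of Default = 1.
monthly_pred <- data.frame(
  personid = c(112, 112, 112, 113, 113, 113),
  month    = c(1, 2, 3, 1, 2, 3),
  prob     = c(0.20, 0.70, 0.80, 0.10, 0.30, 0.20)
)

# Collapse to one prediction per personid by majority vote:
# a person gets 1 if more than half of their months are predicted as 1
# (prob > 0.5); otherwise, including ties, they get 0.
person_pred <- monthly_pred %>%
  group_by(personid) %>%
  summarise(pred = as.integer(mean(prob > 0.5) > 0.5), .groups = "drop")

print(person_pred)
```

The same two steps (predict per month, then summarise per personid) would apply unchanged to new data, as long as it has the same columns as the training data.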