I need to split my data set randomly into training, validation and test sets, as shown in this post (R: How to split a data frame into training, validation, and test sets?), but the split needs to be made on randomly chosen subject IDs rather than on individual rows of the data frame.
When I apply the code from the answer to that question, it splits my data frame completely at random by row. However, my data has several stacked rows per ID, and these need to stay together; otherwise one subject's data will be distributed across the different sets.
Sorry if this sounds a bit confusing. Here is my data to illustrate the issue:
df <- data.frame(Contact.ID, Date.Time, Age, Gender, Attendance)
Contact.ID Date.Time Age Gender Attendance
1 A 2012-07-06 18:54:48 37 Male 30
2 A 2012-07-06 20:50:18 37 Male 30
3 A 2012-08-14 20:18:44 37 Male 30
4 B 2012-03-15 16:58:15 27 Female 40
5 B 2012-04-18 10:57:02 27 Female 40
6 B 2012-04-18 17:31:22 27 Female 40
7 B 2012-04-18 18:37:00 27 Female 40
8 C 2013-10-22 17:46:07 40 Male 5
9 C 2013-10-27 11:21:00 40 Male 5
10 D 2012-07-28 14:48:33 20 Female 12
If I split this data randomly by row, subject A's entries could, for instance, end up with two in my test set and one in my validation set. What I need is a random split of the IDs, not a random split of the rows of the whole data frame, and I cannot figure out how to connect the two.
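
To make the goal concrete, here is a minimal sketch of the kind of ID-level split described above. It is only one possible approach, not the method from the linked post. The 60/20/20 proportions, the seed, and the names `ids`, `train_ids`, `valid_ids` and `test_ids` are assumptions for illustration, and the toy data frame contains only the Contact.ID and Attendance columns from the example.

```r
# Toy data copied from the example above (only two of the columns).
df <- data.frame(
  Contact.ID = c("A", "A", "A", "B", "B", "B", "B", "C", "C", "D"),
  Attendance = c(30, 30, 30, 40, 40, 40, 40, 5, 5, 12),
  stringsAsFactors = FALSE
)

set.seed(1)  # assumed seed, only so the shuffle is reproducible

# Shuffle the unique IDs, not the rows.
ids <- sample(unique(df$Contact.ID))
n   <- length(ids)

# Assumed proportions: 60% of IDs for training, 20% for validation, rest for test.
n_train <- round(0.6 * n)
n_valid <- round(0.2 * n)

train_ids <- ids[seq_len(n_train)]
valid_ids <- ids[n_train + seq_len(n_valid)]
test_ids  <- setdiff(ids, c(train_ids, valid_ids))

# Every row follows its ID, so one subject never spans two sets.
train      <- df[df$Contact.ID %in% train_ids, ]
validation <- df[df$Contact.ID %in% valid_ids, ]
test       <- df[df$Contact.ID %in% test_ids, ]
```

Because the proportions apply to the number of IDs, the row counts of the three sets will only roughly follow 60/20/20 when subjects have different numbers of entries.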