I'm working with a dataset of students' ratings of teachers. Some students rated the same teacher more than once. I would like to subset the data according to the following criteria:
1) Keep every rating from a student who rated a given teacher only once.
2) Where a student rated the same teacher more than once, keep only one of those ratings, chosen at random.
3) If possible, I'd like to run the code in a munging script at the top of every analysis file and be sure the resulting dataset is exactly the same for each analysis (set a seed?).
# data
student.id <- c(1,1,2,3,3,4,5,6,7,7,7,8,9)
teacher.id <- c(1,1,1,1,1,2,2,2,2,2,2,2,2)
rating <- c(100,99,89,100,99,87,24,52,100,99,89,79,12)
df <- data.frame(student.id,teacher.id,rating)
Thanks for any guidance on how to move forward.
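One way the three criteria above might be met, sketched in base R: shuffle the rows with a fixed seed, then keep the first row seen for each student/teacher pair. The seed value `42` and the names `shuffled`, `keep`, and `df.unique` are arbitrary choices for illustration, and the sketch assumes a "repeat" means the same `student.id` paired with the same `teacher.id`.

```r
# Data from the question, repeated so the sketch runs on its own
student.id <- c(1,1,2,3,3,4,5,6,7,7,7,8,9)
teacher.id <- c(1,1,1,1,1,2,2,2,2,2,2,2,2)
rating <- c(100,99,89,100,99,87,24,52,100,99,89,79,12)
df <- data.frame(student.id, teacher.id, rating)

# Arbitrary seed (assumption): fixing it makes the random draw repeatable
set.seed(42)

# Put the rows in random order
shuffled <- df[sample(nrow(df)), ]

# duplicated() flags every occurrence of a pair after the first, so the
# row kept for each student/teacher pair is whichever came first in the
# shuffled order, i.e. a uniformly random one of that pair's ratings
keep <- !duplicated(shuffled[, c("student.id", "teacher.id")])
df.unique <- shuffled[keep, ]

# Sort back into a stable order for readability
df.unique <- df.unique[order(df.unique$student.id, df.unique$teacher.id), ]
```

Because `set.seed()` is called immediately before `sample()`, sourcing this at the top of each analysis file should give the same `df.unique` every time, provided the input rows arrive in the same order. The same seed can produce a different draw under a different R version (the default sampling method changed in R 3.6.0), so it may be safer to save the subset once and load it if the analyses run on several machines.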