I have data of how team members rated one another, from multiple teams. Each person has their own id number but also a team and rater number within the team like so:
StudyID TeamID CATMERater Rated Rating
(int) (int) (int) (dbl) (dbl)
1 2930 551 1 1 5.000000 #How rater 1 rated 1 (themselves)
2 2938 551 2 1 3.800000 #How rater 2 rated 1
3 2939 551 3 1 5.000000 #How rater 3 rated 1
4 2930 551 1 2 3.666667 #How rater 1 rated 2
5 2938 551 2 2 4.000000 #...
6 2939 551 3 2 3.866667
...
and so on. I got this format using tidyr
and am trying to get a new column of the StudyID where the TeamID and person being rated are the same. This is what I've tried, but doesn't work because I'm not sure how to reference the same table:
edges %>% mutate(RatedStudyID = filter(edges, TeamID == TeamID & Rated == CATMERater))
Hopefully this makes sense, but I'd appreciate a recommendation to get headed in the right direction. If it is something with left_join
how do I say where TeamID == TeamID
?
Here's what I'd like to see in the end (mostly the last column though):
StudyID TeamID CATMERater Rated Rating RatedStudyID
(int) (int) (int) (dbl) (dbl)
1 2930 551 1 1 5.000000 2930
2 2938 551 2 1 3.800000 2930
3 2939 551 3 1 5.000000 2930
4 2930 551 1 2 3.666667 2938
5 2938 551 2 2 4.000000 2938
6 2939 551 3 2 3.866667 2938
...
dput results per @akron that gives an error:
structure(list(StudyID = c(2930L, 2938L, 2939L, 2930L, 2938L,
2939L, 2930L, 2938L, 2939L, 2930L, 2938L, 2939L, 2930L, 2938L,
2939L, 2930L, 2938L, 2939L, 2920L, 2941L, 2989L, 2920L, 2941L,
2989L, 2920L, 2941L, 2989L, 2920L, 2941L, 2989L, 2920L, 2941L,
2989L, 2920L, 2941L, 2989L, 2922L, 2924L, 2943L, 2922L, 2924L,
2943L, 2922L, 2924L, 2943L, 2922L, 2924L, 2943L, 2922L, 2924L
), TeamID = c(551L, 551L, 551L, 551L, 551L, 551L, 551L, 551L,
551L, 551L, 551L, 551L, 551L, 551L, 551L, 551L, 551L, 551L, 552L,
552L, 552L, 552L, 552L, 552L, 552L, 552L, 552L, 552L, 552L, 552L,
552L, 552L, 552L, 552L, 552L, 552L, 553L, 553L, 553L, 553L, 553L,
553L, 553L, 553L, 553L, 553L, 553L, 553L, 553L, 553L), CATMERater = c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L,
2L, 1L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L), Rated = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6,
6, 6, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 1,
1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5), Rating = c(5, 3.8, 5,
3.66666666666667, 4, 3.86666666666667, 4.53333333333333, 4, 4.8,
NaN, NaN, NaN, NaN, NaN, NaN, NA, NA, NA, 3.93333333333333, 5,
5, 5, 5, 5, 5, 5, 5, NaN, NaN, NaN, NaN, NaN, NaN, NA, NA, NA,
4, 4, 4, 4, 4, 4, 4, 3.86666666666667, 4, NaN, NaN, NaN, NaN,
NaN)), .Names = c("StudyID", "TeamID", "CATMERater", "Rated",
"Rating"), class = c("tbl_df", "data.frame"), row.names = c(NA,
-50L))