I have a dataframe with participants' judgments for two texts. Suppose each text has a correct answer and an identifier, and each text is judged multple times.
set.seed(123)
wide_df = data.frame('participant_id' = LETTERS[1:12]
, 'judgment_1' = round(rnorm(12)*100)
, 'correct_1' = round(rnorm(12)*100)
, 'text_id_1' = sample(1:12, 12, replace = F)
, 'judgment_2' = round(rnorm(12)*100)
, 'correct_2' = round(rnorm(12)*100)
, 'text_id_2' = sample(13:24, 12, replace = F)
)
So that:
participant_id judgment_1 correct_1 text_id_1 judgment_2 correct_2 text_id_2
1 A -56 40 4 43 -127 17
2 B -23 11 10 -30 217 14
3 C 156 -56 1 90 121 22
4 D 7 179 12 88 -112 15
5 E 13 50 7 82 -40 13
...
I would want to convert this to the long format with the columns:
participant_id text_id judgment correct
A 4 -56 40
A 17 43 127
...
I found and followed the SO advice here:
wide_df %>%
gather(v, value, judgment_1:text_id_2) %>%
separate(v, c("var", "col")) %>%
arrange(participant_id) %>%
spread(col, value)
But that way of reshaping returns the error Error: Duplicate identifiers for rows (3, 6), (9, 12)
I think I do something conceptually wrong but can't quite find it. Where is my mistake? Thanks!