My data currently looks like this:
UserID  Full Name   DOB     EncounterID  QuestionID  Name   Type   label                     responses
1       John Smith  1-1-90  13           505         Intro  Check  Were you given any info?  yes
1       John Smith  1-1-90  13           506         Care   Check  By using this service..   yes
1       John Smith  1-1-90  13           507         Out    Check  How satisfied are you?    vsat
2       Jane Doe    2-2-80  14           505         Intro  Check  Were you given any info?  no
2       Jane Doe    2-2-80  14           506         Care   Check  By using this service..   no
2       Jane Doe    2-2-80  14           507         Out    Check  How satisfied are you?    unsat
My code to transform it from long to wide looks like this:
install.packages("tidyr")  # only needed once per R installation
library(tidyr)

# Read the long-format data and print it for inspection
gwlsubset <- read.csv("subset.csv", header = TRUE)
gwlsubset

# One row per ID combination, one column per question label
subset <- pivot_wider(
  gwlsubset,
  id_cols = c(ID, full_name, date_of_birth, encounterID, practice_name, practice_id),
  names_from = label,
  values_from = response
)
The code works fine when I run it on a subset of my data (300 records). I get something like the output below:
UserID  F_Name      Were you given any info?  By using this service..?  How satisfied are you?
1       John Smith  yes                       yes                       very satisfied
2       Jane Doe    no                        no                        unsatisfied
However, when I run it on 1,000+ records, I get the warning below:
Warning message:
Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates
I suspect that I have some duplicate rows that are triggering the warning, but it could be something else.
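One way to test the duplicate-rows hypothesis is to count how many rows share the same identifier and question label. The following is only a minimal base R sketch on made-up data: the data frame `toy` and its columns `UserID`, `label`, and `response` are assumptions for illustration, not the real file, and the grouping columns would need to match whatever is passed to `id_cols` and `names_from`.

```r
# Toy data standing in for the real file (hypothetical column names).
# Rows 1 and 2 are exact repeats of each other.
toy <- data.frame(
  UserID   = c(1, 1, 1, 2, 2),
  label    = c("Were you given any info?", "Were you given any info?",
               "How satisfied are you?", "Were you given any info?",
               "How satisfied are you?"),
  response = c("yes", "yes", "vsat", "no", "unsat"),
  stringsAsFactors = FALSE
)

# Count rows per (UserID, label) pair. Any count above 1 marks a pair
# that pivot_wider() cannot place into a single cell.
counts <- aggregate(n ~ UserID + label,
                    data = transform(toy, n = 1L),
                    FUN = sum)
dupes <- counts[counts$n > 1, ]
print(dupes)

# Exact repeated rows can be dropped with unique(). Rows that share the
# same key but differ in response would remain and need a manual decision.
toy_dedup <- unique(toy)
print(nrow(toy_dedup))
```

If `dupes` comes back empty on the real data, exact duplicates are probably not the cause, and the columns used as identifiers would be the next thing to inspect.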
The values in my output also change to seemingly random numbers:
UserID  F_Name      Were you given any info?  By using this service..?  How satisfied are you?
1       John Smith  1                         824                       38
2       Jane Doe    7                         176                       445
How can I edit my code to get rid of the duplicates? What else might be causing the warning and the numeric output?
I've tried the suggestions from the warning message but wasn't able to get anywhere, for example:
values_fn = {summary_fun}
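For context, `{summary_fun}` in the warning is a placeholder for an actual function rather than text to type literally. The sketch below shows what passing a real function might look like, again on made-up data: `toy` and its column names are hypothetical, and keeping the first response per cell is just one possible choice, not necessarily the right one for this survey.

```r
library(tidyr)

# Toy data with one duplicated (UserID, label) pair; names are hypothetical.
toy <- data.frame(
  UserID   = c(1, 1, 2),
  label    = c("Were you given any info?", "Were you given any info?",
               "Were you given any info?"),
  response = c("yes", "yes", "no"),
  stringsAsFactors = FALSE
)

# values_fn = length: each cell holds the number of rows that landed in it,
# so the output contains counts instead of responses.
cell_counts <- pivot_wider(
  toy,
  id_cols = UserID,
  names_from = label,
  values_from = response,
  values_fn = length
)
print(cell_counts)

# A summary function in place of {summary_fun}: keep the first response
# in each cell (an illustrative choice only).
first_only <- pivot_wider(
  toy,
  id_cols = UserID,
  names_from = label,
  values_from = response,
  values_fn = function(x) x[1]
)
print(first_only)
```

Note that with `values_fn = length` the cells are integers by design, so that option is useful for locating duplicates but not for producing the final wide table.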