I have survey data that I need to collapse into a single row per respondent, but I'm having trouble using spread and group_by.
My dataset (data) has the following format:
country  date_  user_id  int_id  user_name     ext_name      q_order  questions    answers
AR       2019   AR-100   XP200   jhon foo      damian, khon  1        Question1 …  yes
AR       2019   AR-100   XP200   jhon foo      damian, khon  2        Question2 …  0
AR       2019   AR-100   XP200   jhon foo      damian, khon  3        Question3 …  no apply
AR       2019   AR-100   XP200   jhon foo      damian, khon  4        Question4 …  0
AR       2019   AR-100   XP200   jhon foo      damian, khon  5        Question5 …  0
AR       2019   AR-100   XP200   jhon foo      damian, khon  6        Question6 …  yes
US       2018   US-100   PP300   Peter fields  jhon voigh    1        Question1 …  no
US       2018   US-100   PP300   Peter fields  jhon voigh    2        Question2 …  0
US       2018   US-100   PP300   Peter fields  jhon voigh    3        Question3 …  yes apply
US       2018   US-100   PP300   Peter fields  jhon voigh    4        Question4 …  0
US       2018   US-100   PP300   Peter fields  jhon voigh    5        Question5 …  0
US       2018   US-100   PP300   Peter fields  jhon voigh    6        Question6 …  no
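The sample above can be rebuilt as a data frame with a sketch like the following. The column types are an assumption (character for everything except date_ and q_order); the real data may store some of them as factors. The question labels are kept truncated exactly as shown.

```r
# Sketch of the sample data; column types are assumed, not confirmed.
data <- data.frame(
  country   = rep(c("AR", "US"), each = 6),
  date_     = rep(c(2019, 2018), each = 6),
  user_id   = rep(c("AR-100", "US-100"), each = 6),
  int_id    = rep(c("XP200", "PP300"), each = 6),
  user_name = rep(c("jhon foo", "Peter fields"), each = 6),
  ext_name  = rep(c("damian, khon", "jhon voigh"), each = 6),
  q_order   = rep(1:6, times = 2),
  # Question text is truncated in the sample, so the "…" is kept as-is.
  questions = rep(paste0("Question", 1:6, " …"), times = 2),
  answers   = c("yes", "0", "no apply", "0", "0", "yes",
                "no", "0", "yes apply", "0", "0", "no"),
  stringsAsFactors = FALSE
)

# Inspect the structure to confirm the column types.
str(data)
```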
I tried to group the dataset, but I always get 14 rows instead of 2.
Code:
data %>%
  group_by(country   = .$country,
           date_     = .$date_,
           medic_id  = .$user_id,
           user_id   = .$int_id,
           user_name = .$user_name,
           ext_name  = .$ext_name,
           q_order   = .$q_order) %>%
  spread(questions, answers)
The code above gives me an out-of-memory error.
I also tried dcast:
data %>%
  select(-q_order) %>%
  dcast(... ~ questions, value.var = "answers")
And I get the following:
Country.Code  Created.Date  user_id  int_id  user_name     ext_name      Question1 …  Question2 …  Question3 …  Question4 …  Question5 …  Question6 …
AR            3/28/2019     AR-100   XP200   jhon foo      damian, khon  1            2            0            1            1            1
US            4/28/2019     US-100   PP300   Peter fields  jhon voigh    0            1            1            2            1            2
But I need:
Country.Code  Created.Date  user_id  int_id  user_name     ext_name      Question1 …  Question2 …  Question3 …  Question4 …  Question5 …  Question6 …
AR            3/28/2019     AR-100   XP200   jhon foo      damian, khon  yes          0            no apply     0            0            yes
US            4/28/2019     US-100   PP300   Peter fields  jhon voigh    no           0            yes apply    0            0            no
Why does dcast convert all the values of the answers variable to numbers? (I even tried var.values='answers'.)
My question is very similar to this link.
But I can't get it to work: it either runs out of memory or produces numeric values instead of the values from the answers variable.