I have a data frame that I am trying to condense from multiple rows into one row. The data set is fairly large, but I am starting with a small subset. Here I want to turn 2 rows into 1, with the information from the second row following the information in the first row.
The original problem was that I had a column of data that I needed to "flatten" so that I could use the bits and pieces. Each entry in the column is a JSON string like this one:
"[{\"task\":\"T0\",\"task_label\":\"Did any birds visit the feeding platform or bird feeders?\",\"value\":\"**Yes**—but there were no displacements. Next, enter all of the birds you see at the feeders. \"},{\"task\":\"T1\",\"value\":[{\"choice\":\"EUROPEANSTARLING\",\"answers\":{\"WHATISTHELARGESTNUMBEROFINDIVIDUALSTHATYOUSAWSIMULTANEOUSLY\":\"4\"},\"filters\":{}},{\"choice\":\"MOURNINGDOVE\",\"answers\":{\"WHATISTHELARGESTNUMBEROFINDIVIDUALSTHATYOUSAWSIMULTANEOUSLY\":\"2\"},\"filters\":{}}]},{\"task\":\"T6\",\"task_label\":\"Is it actively precipitating (rain or snow)?\",\"value\":[\"Yes.\"]}]"
So I used code developed by another coder to "flatten" this out by task. Then, I want to join it back up so that I have one line of information for each classification.
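For readers without that code, here is a minimal sketch of what a per-task flattening step could look like. It is not the code referred to above. It assumes the jsonlite package is installed, and the shortened `annotation` string, `task_ids`, and `birds` are illustrative names only.

```r
library(jsonlite)

# Shortened version of the annotation string above (task labels trimmed)
annotation <- '[{"task":"T0","value":"Yes"},
  {"task":"T1","value":[
    {"choice":"EUROPEANSTARLING","answers":{"WHATISTHELARGESTNUMBEROFINDIVIDUALSTHATYOUSAWSIMULTANEOUSLY":"4"},"filters":{}},
    {"choice":"MOURNINGDOVE","answers":{"WHATISTHELARGESTNUMBEROFINDIVIDUALSTHATYOUSAWSIMULTANEOUSLY":"2"},"filters":{}}]},
  {"task":"T6","value":["Yes."]}]'

# simplifyVector = FALSE keeps the nested structure as plain R lists
tasks <- fromJSON(annotation, simplifyVector = FALSE)

# One task id per top-level element
task_ids <- vapply(tasks, function(task) task$task, character(1))

# Pull the species choices and their counts out of task T1
t1 <- tasks[[which(task_ids == "T1")]]$value
birds <- data.frame(
  choice = vapply(t1, function(v) v$choice, character(1)),
  count  = vapply(t1, function(v) as.integer(v$answers[[1]]), integer(1)),
  stringsAsFactors = FALSE
)
```

Here `birds` ends up with one row per species recorded in T1; other tasks could be handled the same way and then joined on the classification id.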
Currently, I have merged tasks T0 and T4, but I need to merge the result with another task, T5. To do that, I first need to reduce the merged T0 and T4 data to one row per classification. Right now I'm working with a small subset of the data, and my table essentially looks like this:
x <- data.frame("subject_ids" = c(19232716, 19232716), "classification_id" = c(120545061, 120545061),
                "task_index.x" = c(1, 1), "task.x" = c("TO", "TO"), "value" = c("Displacement", "Displacement"),
                "task_index.y" = c(2, 5), "task.y" = c("T4, T4", "T4"), "total.species" = c("2,2", "1"),
                "choice" = c("MOURNINGDOVE, COMMONGRACKLE", "MOURNINGDOVE"), "S_T" = c("Target,Target", "Target,Source"))
but I want it to look like this:
# Desired result: one row. data.frame() makes the repeated names unique by appending .1 (e.g. task_index.y.1)
y <- data.frame("subject_ids" = c(19232716), "classification_id" = c(120545061), "task_index.x" = c(1),
                "task.x" = c("TO"), "value" = c("Displacement"),
                "task_index.y" = c(2), "task.y" = "T4, T4",
                "total.species" = c("2,2"), "choice" = c("MOURNINGDOVE, COMMONGRACKLE"), "S_T" = c("Target,Target"),
                "task_index.y" = c(5), "task.y" = "T4", "total.species" = c("1"), "choice" = c("MOURNINGDOVE"), "S_T" = c("Target,Source"))