
I have a data frame that I am trying to condense from multiple rows into one row. The data set is fairly large, but I am starting with a small subset. Here I want to turn 2 rows into 1, with the information from the second row placed after the information in the first row.

The original problem was that I had a column of data that I needed to "flatten" so that I could use the individual pieces. The column is in JSON format. One entry looks like this:

"[{\"task\":\"T0\",\"task_label\":\"Did any birds visit the feeding platform or bird feeders?\",\"value\":\"**Yes**—but there were no displacements. Next, enter all of the birds you see at the feeders. \"},{\"task\":\"T1\",\"value\":[{\"choice\":\"EUROPEANSTARLING\",\"answers\":{\"WHATISTHELARGESTNUMBEROFINDIVIDUALSTHATYOUSAWSIMULTANEOUSLY\":\"4\"},\"filters\":{}},{\"choice\":\"MOURNINGDOVE\",\"answers\":{\"WHATISTHELARGESTNUMBEROFINDIVIDUALSTHATYOUSAWSIMULTANEOUSLY\":\"2\"},\"filters\":{}}]},{\"task\":\"T6\",\"task_label\":\"Is it actively precipitating (rain or snow)?\",\"value\":[\"Yes.\"]}]"

So I used code developed by another coder to "flatten" this out by task. Then, I want to join it back up so that I have one line of information for each classification.
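For anyone who wants to reproduce the flattening step, here is a minimal sketch of what it could look like. It is not the code I was given; it assumes the `jsonlite` package is installed, uses a shortened copy of the annotation string above, and the names `annotation`, `tasks`, `task_ids` and `t1_flat` are made up for this sketch.

```r
library(jsonlite)  # assumption: jsonlite is installed

# Shortened copy of the annotation string shown above (same structure)
annotation <- paste0(
  '[{"task":"T0","task_label":"Did any birds visit the feeding platform or bird feeders?","value":"Yes"},',
  '{"task":"T1","value":[',
  '{"choice":"EUROPEANSTARLING","answers":{"WHATISTHELARGESTNUMBEROFINDIVIDUALSTHATYOUSAWSIMULTANEOUSLY":"4"},"filters":{}},',
  '{"choice":"MOURNINGDOVE","answers":{"WHATISTHELARGESTNUMBEROFINDIVIDUALSTHATYOUSAWSIMULTANEOUSLY":"2"},"filters":{}}]},',
  '{"task":"T6","task_label":"Is it actively precipitating (rain or snow)?","value":["Yes."]}]'
)

# Parse into nested lists, one list element per task
tasks <- fromJSON(annotation, simplifyVector = FALSE)

# Pull out the task identifier of each element
task_ids <- vapply(tasks, function(task) task$task, character(1))

# Flatten the species survey task (T1) into one row per species
t1 <- tasks[[which(task_ids == "T1")]]
t1_flat <- data.frame(
  task   = "T1",
  choice = vapply(t1$value, function(v) v$choice, character(1)),
  count  = vapply(t1$value, function(v) v$answers[[1]], character(1)),
  stringsAsFactors = FALSE
)
```

Each task ends up as its own small table, which is why the pieces then have to be merged back together by classification.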

Currently, I have merged tasks T0 and T4, but I need to merge the result with another task, T5. In order to do that, I need to reduce the merged T0 and T4 data to one row per classification. Right now I'm working with a small subset of the data and have a table that essentially looks like this:

x <- data.frame(subject_ids = c(19232716, 19232716), classification_id = c(120545061, 120545061),
                task_index.x = c(1, 1), task.x = c("TO", "TO"), value = c("Displacement", "Displacement"),
                task_index.y = c(2, 5), task.y = c("T4, T4", "T4"), total.species = c("2,2", "1"),
                choice = c("MOURNINGDOVE, COMMONGRACKLE", "MOURNINGDOVE"), S_T = c("Target,Target", "Target,Source"))

but I want it to look like this:

y <- data.frame(subject_ids = 19232716, classification_id = 120545061,
                task_index.x = 1, task.x = "TO", value = "Displacement",
                task_index.y = 2, task.y = "T4, T4", total.species = "2,2",
                choice = "MOURNINGDOVE, COMMONGRACKLE", S_T = "Target,Target",
                task_index.y = 5, task.y = "T4", total.species = "1",
                choice = "MOURNINGDOVE", S_T = "Target,Source")
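
One possible way to get from `x` to that shape, sketched with base R's `reshape()`. This is only a sketch under assumptions: it treats `classification_id` as the grouping key, the helper column `row_in_group` is a name made up here, and the spread columns get `_1`/`_2` suffixes instead of repeated names.

```r
# Same sample data as x above
x <- data.frame(
  subject_ids = c(19232716, 19232716), classification_id = c(120545061, 120545061),
  task_index.x = c(1, 1), task.x = c("TO", "TO"), value = c("Displacement", "Displacement"),
  task_index.y = c(2, 5), task.y = c("T4, T4", "T4"), total.species = c("2,2", "1"),
  choice = c("MOURNINGDOVE, COMMONGRACKLE", "MOURNINGDOVE"),
  S_T = c("Target,Target", "Target,Source"),
  stringsAsFactors = FALSE
)

# Number the rows within each classification (hypothetical helper column)
x$row_in_group <- ave(seq_len(nrow(x)), x$classification_id, FUN = seq_along)

# Columns that differ between rows and should sit side by side in the result
varying_cols <- c("task_index.y", "task.y", "total.species", "choice", "S_T")

# Spread the varying columns so each classification becomes a single row
wide <- reshape(
  x,
  idvar     = c("subject_ids", "classification_id", "task_index.x", "task.x", "value"),
  timevar   = "row_in_group",
  v.names   = varying_cols,
  direction = "wide",
  sep       = "_"
)
```

With the two sample rows, `wide` has one row, with the first row's T4 columns followed by the second row's. Classifications with more rows would simply get more suffixed column groups, and missing positions would be filled with `NA`.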
Rachael
  • From your description, it's not clear what you actually want to do. Do you want to summarise your data? It's also easier to give you a solution once you provide some sample data in your question that we can paste into the R console. Unfortunately, it's hard to find a solution based on your screenshot alone. – alex_555 Oct 11 '18 at 14:08
  • Hi Rachael and welcome to StackOverflow! Could you show us an example of what the target data.frame should look like? – Rekyt Oct 11 '18 at 14:09
  • @Rekyt and alex_555, what is the best way to show you an example of the data? Sorry if that is an obvious question, but I can't seem to figure it out! – Rachael Oct 11 '18 at 14:20
  • @Rachael The best way would be to copy-paste the text of your data.frame into your question and format it as code. The other thing would be to show an example, written the same way, of what the target table should look like (you type out the shape of the expected table). – Rekyt Oct 11 '18 at 14:22
  • Everything you need to know about "best way to show example data" can be found in https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example - I suggest you read the answers there in detail – dww Oct 11 '18 at 14:54
  • @dww so I have the data, I'm not sure how to post it here for you all to see. – Rachael Oct 11 '18 at 15:13
  • @Rekyt thank you very much, it finally clicked what I needed to do (hopefully). I made serious edits. – Rachael Oct 11 '18 at 15:34
