I appreciate your time in reading this. I am trying to create a new, wide dataset in R from a long dataset. My dataset is set up something like this:

dd <- read.table(text="Year Basket Fruit
2014 small pear
2014 medium pear
2014 medium orange
2014 large pear
2014 large orange
2014 large apple
2015 small orange
2015 medium pear
2015 medium orange
2015 large pear
2015 large orange
2015 large pomegranate", header=TRUE)

I need the new dataset to have one row per basket type (small, medium, and large), and then a column for each fruit type and year combination, with a yes/no indication of whether that fruit type was present in that basket type that year. Something like this:

out <- read.table(text="
       apple.2014 orange.2014 pear.2014 pomegranate.2014 apple.2015 orange.2015 pear.2015 pomegranate.2015
large           1           1         1                0          0           1         1                1
medium          0           1         1                0          0           1         1                0
small           0           0         1                0          0           1         0                0
", header=TRUE)

Any suggestions on how to accomplish this would be extremely appreciated! I have found solutions for how to count the number of unique fruits by basket type, but no solution to create the type of data frame I need. Thanks so much!

MrFlick
  • Hello! I have edited my post to explain how it is not a duplicate of the post linked. Thank you! – user10756294 Dec 06 '18 at 19:14
  • Once you accept a duplicate, no more answers can be posted. Try this: `library(data.table);dt <- data.table(Year=c(rep(2014,2),rep(2015,2)),Basket=c("small","medium","large","small"),Fruit=c("pear","pear","orange","pomegranate")); dcast(dt,formula = Basket~Fruit+Year,fun.aggregate = length,value.var="Fruit")` – JPCampos Dec 06 '18 at 19:15
  • Oh, I see. I'm new here and did not realize I was accepting the duplicate. Thank you for your help! I will try this solution. – user10756294 Dec 06 '18 at 19:18
  • If you do not want to work with data.tables, you can convert the data.table to a data.frame and load the reshape2 library instead. – JPCampos Dec 06 '18 at 19:26
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Pictures of data are not helpful. – MrFlick Dec 06 '18 at 20:08
  • Thank you MrFlick, I will keep it in mind for next time. I was trying to keep things simple because my data is much more complicated than the example, and I don't know that I would be able to create a reproducible sample of it if I tried! The project I'm working on has been a bit of a nightmare for me, unfortunately! I appreciate your input. – user10756294 Dec 06 '18 at 20:18

1 Answer

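This is basically just a table() operation that counts each Basket and Fruit/Year combination. Because each combination appears at most once in the sample data, the counts are already the 0/1 indicators you want: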

with(dd, table(Basket, interaction(Fruit, Year)))
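If you need an actual data frame rather than a table object, or your real data might contain repeated rows, something along these lines may work. This is only a sketch: it assumes `dd` is the sample data from the question, and the names `tab` and `out` are just placeholders chosen for illustration.

```r
# Sample data from the question
dd <- read.table(text = "Year Basket Fruit
2014 small pear
2014 medium pear
2014 medium orange
2014 large pear
2014 large orange
2014 large apple
2015 small orange
2015 medium pear
2015 medium orange
2015 large pear
2015 large orange
2015 large pomegranate", header = TRUE)

# Cross-tabulate baskets against every fruit/year combination
tab <- with(dd, table(Basket, interaction(Fruit, Year)))

# Turn the 2-D table into a data frame with one row per basket
out <- as.data.frame.matrix(tab)

# Cap counts at 1 so duplicated rows still give a yes/no flag
# (assumption: the real data may contain repeated rows)
out[] <- lapply(out, function(x) as.integer(x > 0))

out
```

Baskets end up as row names and the columns are named like `apple.2014`, following the default labels produced by `interaction()`.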
MrFlick