The goal is to convert a frequency table into a form that ggplot2's geom_density can handle, i.e. one row per observation.

Starting with the raw data:

> dat <- data.frame(x = c("a", "a", "b", "b", "b"), y = c("c", "c", "d", "d", "d"))
> dat
  x y
1 a c
2 a c
3 b d
4 b d
5 b d

Use dcast to make a frequency table

> library(reshape2)
> dat2 <- dcast(dat, x + y ~ ., fun.aggregate = length)
> dat2
  x y count
1 a c     2
2 b d     3

How can this be reversed? melt does not seem to be the answer:

> colnames(dat2) <- c("x", "y", "count")
> melt(dat2, measure.vars = "count")
  x y variable value
1 a c    count     2
2 b d    count     3
nacnudus

1 Answer

Because dcast accepts any aggregation function, you can only reverse it if you know how to invert that particular aggregation.

For length, the obvious inverse is rep. For aggregations like sum or mean there is no obvious inverse, because many different inputs produce the same result (assuming you haven't saved the original data as an attribute).
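
As a rough illustration of the point above (the vectors here are made up for the example, not taken from the question), two different inputs can share a sum, while a value together with its count is enough to rebuild the original observations up to ordering:

```r
# Two different inputs with the same sum: the sum alone cannot tell them apart
a <- c(1, 4)
b <- c(2, 3)
same_sum <- sum(a) == sum(b)

# Counts are different: repeating each value by its count restores the
# observations (row order aside)
vals <- c("a", "b")
n <- c(2L, 3L)
expanded <- rep(vals, times = n)
```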

Some options to invert length:

You could use ddply

library(plyr)
# assumes dat2 has a column that is actually named count
ddply(dat2, .(x), summarize, y = rep(y, count))

or more simply

as.data.frame(lapply(dat2[c('x','y')], rep, dat2$count))
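
A minimal base-R sketch of the same idea, which replicates row indices rather than columns. It assumes the count column really is named count; the freq table below is typed in by hand to mirror dat2 after renaming, so the sketch does not depend on reshape2, and stringsAsFactors = FALSE is set explicitly so the comparison does not depend on the R version.

```r
# Raw data from the question
dat <- data.frame(x = c("a", "a", "b", "b", "b"),
                  y = c("c", "c", "d", "d", "d"),
                  stringsAsFactors = FALSE)

# Hand-built stand-in for dat2 (an assumption for this sketch)
freq <- data.frame(x = c("a", "b"),
                   y = c("c", "d"),
                   count = c(2L, 3L),
                   stringsAsFactors = FALSE)

# Repeat each row index count times, then keep only the key columns
idx <- rep(seq_len(nrow(freq)), times = freq$count)
expanded <- freq[idx, c("x", "y")]
rownames(expanded) <- NULL  # drop the "1", "1.1", ... names left by indexing

# Compare with the raw data
all.equal(expanded, dat)
```

If the count column has a different name (for example the "." that dcast may produce when the formula's right-hand side is "."), freq$count is NULL and rep fails, so it is worth checking names(dat2) first.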
mnel
  • What about `dat2[rep(row.names(dat2), dat2$count), 1:2]`? – A5C1D2H2I1M1N2O1R2T1 Aug 02 '13 at 06:40
  • If you make that comment an answer I'll accept it. There's always a harder way to do something in R, isn't there? – nacnudus Aug 02 '13 at 06:53
  • @mnel, neither of those work for me. Apologies if I'm missing something basic, but the first one's error is `Error in NextMethod() : cannot coerce type 'closure' to vector of type 'integer'`, the other's is `Error in rep.default(X[[1L]], ...) : invalid 'times' argument`. Rep is, indeed, the obvious solution but it doesn't seem to be very good at replicating whole rows of data frames. – nacnudus Aug 02 '13 at 07:10
  • @nacnudus they work on your example. – mnel Aug 02 '13 at 07:13