0

I'm trying to change a dataframe in R to group multiple rows by a measurement. The table has a location (km), a size (mm) a count of things in that size bin, a site and year. I want to take the sizes, make a column from each one (2, 4 and 6 in this example), and place the corresponding count into each the row for that location, site and year.

It seems like a combination of transposing and grouping, but I can't figure out a way to accomplish this in R. I've looked at t(), dcast() and aggregate(), but those aren't really close at all.

So I would go from something like this:

df <- data.frame(km=c(rep(32,3),rep(50,3)), mm=rep(c(2,4,6),2), count=sample(1:25,6), site=rep("A", 6), year=rep(2013, 6))

  km mm count site year
1 32  2    18    A 2013
2 32  4     2    A 2013
3 32  6    12    A 2013
4 50  2     3    A 2013
5 50  4    17    A 2013
6 50  6    21    A 2013

To this:

  km  site  year  mm_2  mm_4  mm_6
1 32    A   2013   18     2    12
2 50    A   2013    3    17    21

Edit: I tried the solution in a suggested duplicate, but I did not work for me, not really sure why. The answer below worked better.

Jaap
  • 81,064
  • 34
  • 182
  • 193
Evan
  • 1,960
  • 4
  • 26
  • 54

2 Answers2

4

As suggested in the comment above, we can use the sep argument in spread:

library(tidyr)
spread(df, mm, count, sep = "_")
  km site year mm_2 mm_4 mm_6
1 32    A 2013    4   20    1
2 50    A 2013   15   14   22
markus
  • 25,843
  • 5
  • 39
  • 58
2

As you mentioned dcast(), here is a method using it.

set.seed(1)
df <- data.frame(km=c(rep(32,3),rep(50,3)), 
                 mm=rep(c(2,4,6),2),
                 count=sample(1:25,6),
                 site=rep("A", 6),
                 year=rep(2013, 6))

library(reshape2)
dcast(df, ... ~ mm, value.var="count")

#   km site year  2  4  6
# 1 32    A 2013 13 10 20
# 2 50    A 2013  3 17  1

And if you want a bit of a challenge you can try the base function reshape().

df2 <- reshape(df, v.names="count", idvar="km", timevar="mm", ids="mm",  direction="wide")
colnames(df2) <- sub("count.", "mm_", colnames(df2))
df2

#   km site year mm_2 mm_4 mm_6
# 1 32    A 2013   13   10   20
# 4 50    A 2013    3   17    1
AkselA
  • 8,153
  • 2
  • 21
  • 34