I have this dataframe:

  sp    rd  pH abund area point
 dog  uniq 4.5     5    1     a
 dog  uniq 4.2     5    1     a
 dog   for 6.1     3    1     a
 cat  uniq 7.0     8    1     a
 cat  uniq 4.9     5    1     a
 cat mains 3.1     9    1     b
 cat mains 6.5     1    1     b
 cat mains 6.5     3    1     b
 dog   for  NA     2    2     a
bird   mac 5.0     3    2     a
bird   mac 4.1     5    2     a
bird   mac 5.1    NA    2     a
rabb   lol 5.0     8    2     b
rabb   lol 4.2     5    2     b
rabb   lol 6.0     2    2     b
rabb   lol 2.8     3    2     b

In this dataframe, each unique combination of area and point is a place. In these places there are animals defined by sp and rd. My goal is to get a list of vectors, where each vector holds the sum of abund for each sp within one place. In this case the first vector of my list should be (13,13), because in place 1 a there are three dog rows with abund 5+5+3 and two cat rows with abund 8+5.

My idea was to divide my dataframe into groups (places) and run aggregate within each of those sub-dataframes. The problem is that when I split the dataframe into smaller ones, those dataframes lose their names (each one is only named after its combination, for example 1a), so I can't apply the aggregate function.

Bobesh

1 Answer

Since you say you want a list of vectors as the result and not a data.frame, I think the following is what you want:

Firstly, use split as you did in your initial approach to split the data.frame into groups:

splits <- split(df, list(df$area, df$point))

> splits
$`1.a`
   sp   rd  pH abund area point
1 dog uniq 4.5     5    1     a
2 dog uniq 4.2     5    1     a
3 dog  for 6.1     3    1     a
4 cat uniq 7.0     8    1     a
5 cat uniq 4.9     5    1     a

$`2.a`
     sp  rd  pH abund area point
9   dog for  NA     2    2     a
10 bird mac 5.0     3    2     a
11 bird mac 4.1     5    2     a
12 bird mac 5.1    NA    2     a
#and so on...

Then apply aggregate to each of the splits:

#using lapply the aggregate function is applied
#on each of the previous splits
agg_splits <-
lapply(splits, function(x) {
  aggregate(abund ~ sp + area + point, data = x, FUN=sum)
})

Output:

> agg_splits
$`1.a`
   sp area point abund
1 cat    1     a    13
2 dog    1     a    13

$`2.a`
    sp area point abund
1 bird    2     a     8
2  dog    2     a     2

$`1.b`
   sp area point abund
1 cat    1     b    13

$`2.b`
    sp area point abund
1 rabb    2     b    18

Seems to be what you need.

LyzandeR
  • This is awesome! Thank you very much. Last question: how can I get a list that holds just the abund numbers, not whole data frames? – Bobesh Oct 16 '15 at 10:31
  • You are welcome, glad I could help :). You could do: `lapply(agg_splits, function(x) x[, 'abund']) ` and you will get just a list with the abunds. – LyzandeR Oct 16 '15 at 10:33
  • Or you could do: `agg_splits <- lapply(splits, function(x) { aggregate(abund ~ sp + area + point, data = x, FUN=sum)$abund })` to get them in the first `lapply` call (using `$abund` rather than `['abund']`, since the latter would still return a one-column data frame). – LyzandeR Oct 16 '15 at 10:37
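
Putting the answer and the comments together, here is a self-contained sketch that can be pasted into a fresh session. The data frame is retyped from the question; the name `abund_list` and the `drop = TRUE` argument are my own additions (the latter is only a guard against empty area/point combinations, which do not occur in this data).

```r
# Rebuild the example data from the question.
df <- data.frame(
  sp    = c("dog", "dog", "dog", "cat", "cat", "cat", "cat", "cat",
            "dog", "bird", "bird", "bird", "rabb", "rabb", "rabb", "rabb"),
  rd    = c("uniq", "uniq", "for", "uniq", "uniq", "mains", "mains", "mains",
            "for", "mac", "mac", "mac", "lol", "lol", "lol", "lol"),
  pH    = c(4.5, 4.2, 6.1, 7.0, 4.9, 3.1, 6.5, 6.5,
            NA, 5.0, 4.1, 5.1, 5.0, 4.2, 6.0, 2.8),
  abund = c(5, 5, 3, 8, 5, 9, 1, 3, 2, 3, 5, NA, 8, 5, 2, 3),
  area  = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2),
  point = c("a", "a", "a", "a", "a", "b", "b", "b",
            "a", "a", "a", "a", "b", "b", "b", "b"),
  stringsAsFactors = FALSE
)

# One sub-data-frame per place (area/point combination).
splits <- split(df, list(df$area, df$point), drop = TRUE)

# Sum abund per sp within each place and keep only the numeric vector.
abund_list <- lapply(splits, function(x) {
  aggregate(abund ~ sp, data = x, FUN = sum)$abund
})
```

Within each place the sums come back ordered by sp, so `abund_list[["1.a"]]` holds cat first and dog second.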