Using lapply on subsets

Question

I want to generate a vector of means derived from subsets of an existing vector in R.

My data look like this:

date    plant_ID    treatment   stalk_count flower_count
195     1           control     0           0
196     1           control     0           0
197     1           control     0           0
198     1           control     0           0
.........................................................
237     98          treatment   0           0
239     98          treatment   0           0
226     98          treatment   2           9

I think I need to use split() to break the data into subsets by plant_ID, but I do not know how to tell lapply() to take these subsets and apply the mean() function to the flower_count data contained within each subset.

My questions are: (1) Is this an approach that will work? (2) How would I write the code to do this?

score -1 · Accepted Answer · answered Jan 23 '17 at 18:46


We don't need to split. The mean of 'flower_count' for each 'plant_ID' can be computed as a group-by operation with aggregate from base R (here df1 is the data frame holding the data):

aggregate(flower_count~plant_ID, df1, FUN = mean)

Or using dplyr:

library(dplyr)
df1 %>%
   group_by(plant_ID) %>%
   summarise(flowercountMean = mean(flower_count))

If we specifically want to use lapply with split (this returns a list with one mean per plant_ID):

lapply(split(df1$flower_count, df1$plant_ID), mean)
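Since the question asks for a vector rather than a list, here is a minimal sketch of the same split-and-apply idea that returns a named numeric vector. The toy data frame is made up for illustration (it is not the asker's dataset), and swapping lapply for vapply is my suggestion, not something the original answer used.

```r
# Toy data (hypothetical values, only for illustration)
df1 <- data.frame(
  plant_ID     = c(1, 1, 2, 2, 2),
  flower_count = c(0, 2, 3, 6, 9)
)

# split() returns a list with one numeric vector per plant_ID
by_plant <- split(df1$flower_count, df1$plant_ID)

# vapply() applies mean() to each piece and returns a named numeric
# vector; numeric(1) declares that each result is a single number
flower_means <- vapply(by_plant, mean, numeric(1))
flower_means
```

The names of flower_means are the plant_ID values, so flower_means["2"] looks up the mean for plant 2. If flower_count may contain missing values, passing na.rm = TRUE through to mean (for example vapply(by_plant, mean, numeric(1), na.rm = TRUE)) is probably what you want.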

answered Jan 23 '17 at 18:46

akrun


Thank you akrun, that's quite helpful. I'm not able to get it to run yet however. What does the "df1" term mean? – JKO Jan 23 '17 at 20:04
@JKO Suppose you read the dataset `df1 <- read.csv("yourdata.csv", header=TRUE, stringsAsFactors=FALSE)` then that `df1` is the df1 mentioned in my answer. – akrun Jan 23 '17 at 20:05

Ah, great, thank you very much! – JKO Jan 23 '17 at 20:07
