
I want to create multiple data frames from data like the following:

ID Time Ethnicity LDL  HDL ....
1   1   black
2   2   white
3   1   black
4   2   White

Each data frame should hold the mean values of one of the columns LDL, HDL, ..., in 4 rows like those displayed in the data. I used the following code, but the problem is that all the data frames are identical: DF[[1]] is the same as DF[[2]], ..., DF[[12]]. I would appreciate it if you could help me find the solution.

dv=c(names(data[,4:15]))

library(ggplot2)
require(plyr)

for (i in 1:12) {
    DF[[i]] = ddply(data, c("Time", "Ethnicity"), summarize, 
    Mean = mean(data[[paste(dv[i])]], na.rm = T))
}
Ronak Shah
Sana
  • Can you share a sample of the data - is it a text file? There might be an easier way to read it all in using string manipulation (and `regex` if needed). – Gautam Apr 06 '20 at 02:31
  • Welcome to Stack Overflow! Could you make your problem reproducible by sharing a sample of your data so others can help (please do not use `str()`, `head()` or screenshot)? You can use the [`reprex`](https://reprex.tidyverse.org/articles/articles/magic-reprex.html) and [`datapasta`](https://cran.r-project.org/web/packages/datapasta/vignettes/how-to-datapasta.html) packages to assist you with that. See also [Help me Help you](https://speakerdeck.com/jennybc/reprex-help-me-help-you?slide=5) & [How to make a great R reproducible example?](https://stackoverflow.com/q/5963269) – Tung Apr 06 '20 at 03:09

1 Answer


plyr is retired, so you could use dplyr instead. When you call `mean(data[[paste(dv[i])]])`, you are subsetting the entire column and not respecting the groups. Hence, you get the same mean for every row within DF[[1]], DF[[2]], etc.
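To see the effect on a small scale, here is a minimal sketch with made-up data (the `toy` data frame and its values are hypothetical, not the asker's data). Inside `summarise()`, `toy[["LDL"]]` refers to the whole column, whereas the bare column name `LDL` is evaluated separately within each group:

```r
library(dplyr)

# Hypothetical toy data, only for illustration
toy <- data.frame(
  Time      = c(1, 1, 2, 2),
  Ethnicity = c("black", "black", "white", "white"),
  LDL       = c(100, 120, 140, 160)
)

# Whole column pulled from the data frame: every group gets the overall mean
wrong <- toy %>%
  group_by(Time, Ethnicity) %>%
  summarise(Mean = mean(toy[["LDL"]], na.rm = TRUE)) %>%
  ungroup()

# Column referenced by name: one mean per group
right <- toy %>%
  group_by(Time, Ethnicity) %>%
  summarise(Mean = mean(LDL, na.rm = TRUE)) %>%
  ungroup()

wrong$Mean  # 130 130
right$Mean  # 110 150
```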

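library(dplyr)

# dv (from the question) holds the names of columns 4:15; selecting by name
# avoids position shifts, since summarise_at() counts positions without the
# grouping columns
output_df <- data %>% 
               group_by(Time, Ethnicity) %>% 
               summarise_at(dv, mean, na.rm = TRUE) %>% 
               ungroup()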

If you want a list of data frames, you could use `group_split()`:

DF <- output_df %>% group_split(Time, Ethnicity)
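
If the goal is instead one summary data frame per measured variable, as in the question's loop, a possible sketch is to iterate over the column names and refer to each column through the `.data` pronoun, so that it is evaluated within each group. The `toy` data and its values are made up for illustration, and `dv` here only mirrors the vector of column names from the question:

```r
library(dplyr)

# Hypothetical toy data standing in for the real data set
toy <- data.frame(
  ID        = 1:4,
  Time      = c(1, 2, 1, 2),
  Ethnicity = c("black", "white", "black", "white"),
  LDL       = c(100, 140, 120, 160),
  HDL       = c(40, 50, 60, NA)
)

dv <- names(toy)[4:5]  # names of the columns to summarise

# Build one data frame of group means per variable
DF <- lapply(dv, function(v) {
  toy %>%
    group_by(Time, Ethnicity) %>%
    summarise(Mean = mean(.data[[v]], na.rm = TRUE)) %>%
    ungroup()
})
names(DF) <- dv  # allows access by name, e.g. DF$LDL
```

Each element of `DF` then has one row per Time and Ethnicity combination, and `DF[[i]]` corresponds to `dv[i]`.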
Ronak Shah