
I'm new to programming in R and I'm working with a large dataset containing hundreds of variables and thousands of observations. One of these variables is Age, which is my main concern. I want to get the mean of every other variable as a function of Age. I can split the data into smaller tables, one per year of age, with this:

for(i in 18:84) 
{
  n<- sprintf("SortAgeM%d",i)
  assign(x=n,subset(SortAgeM,subset=(SortAgeM$AGE>=i & SortAgeM$AGE<i+1)))
}
"SortAgeM85plus"<-subset(SortAgeM,subset=(SortAgeM$AGE>=85 & SortAgeM$AGE<100))

This gives me one sub-dataset for each age I'm concerned with. I would then like to get the mean of each column. Each column holds the volume of a specific brain region. I'm interested in how the volume decreases with age, and I would like to be able to tell whether individuals of a given age are close to the mean for their age or not.
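For the last point, comparing each individual with the mean for their age, a minimal sketch is shown here. It does not use the separate sub-datasets; instead it assumes a whole-year age group can be derived with `floor()`, with everyone aged 85 or more pooled into one group as in the loop above. The data frame `toy`, the column `Hippocampus`, and the helper columns `AgeGroup`, `GroupMean`, and `Deviation` are hypothetical names used only for illustration.

```r
# Hypothetical toy data: AGE plus one brain-region volume column
toy <- data.frame(
  AGE = c(18.2, 18.7, 19.1, 19.5, 19.9, 20.3),
  Hippocampus = c(4.1, 4.3, 4.0, 4.2, 3.9, 4.4)
)

# Whole-year age group (i <= AGE < i + 1); ages of 85 and above share group 85
toy$AgeGroup <- pmin(floor(toy$AGE), 85)

# ave() returns, for every row, the mean of that row's age group
toy$GroupMean <- ave(toy$Hippocampus, toy$AgeGroup,
                     FUN = function(v) mean(v, na.rm = TRUE))

# Positive values lie above the age-group mean, negative values below it
toy$Deviation <- toy$Hippocampus - toy$GroupMean
```

Because `ave()` keeps one value per original row, the deviation sits next to each individual's measurement and no extra row of means has to be appended.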

Now, I would like to get one more row with the mean for each column. So I tried this:

for(i in 18:85) {
  addmargins((SortAgeM%d,i), margin=1, FUN= "mean")
}  

But it didn't work... I'm stuck, and I'm not familiar enough with R functions to find a solution online. Thank you for your help.

Victor

Post answer edit: This is what I finally did:

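for(i in 18:84) 
    {
      n <- sprintf("SortAgeM%d", i)
      # Rows for age i (i <= AGE < i + 1)
      item <- subset(SortAgeM, subset = (SortAgeM$AGE >= i & SortAgeM$AGE < i + 1))
      # The first 7 variables aren't numeric, so pad them with NA
      Adjustment <- c(NA, NA, NA, NA, NA, NA, NA)
      # Means of the numeric columns (8 to 217), ignoring missing values
      Line1 <- colMeans(item[, 8:217], na.rm = TRUE)
      Line <- c(Adjustment, Line1)
      # Store the subset under its name, with the row of means appended
      assign(x = n, rbind(item, Line))
    }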
  • You need `get` and `assign` – akrun Aug 06 '18 at 18:42
  • This should be an easy question to answer, but it's unclear what exactly you want to do. If you could include an example of your starting data and desired output, you'll get an answer right away. Please read [Ask] and [How to make a great R reproducible example](https://stackoverflow.com/q/5963269/8366499) – divibisan Aug 06 '18 at 18:43
  • I would argue that using `get` and `assign` is almost always a bad idea, even for an R expert. Using them as a newbie is a recipe for total disaster. This sounds like a simple `dplyr` `filter %>% group_by %>% summarize` problem, but I'm not sure exactly what the desired outcome is – divibisan Aug 06 '18 at 18:45
  • I would like to get the exact same dataset with one more row containing the means of each column – Victor Pattee-Gravel Aug 06 '18 at 18:51
  • If you just want to get the means of each column, then (as @SmitM suggested) `colMeans(df)` does that in one line. In your question, however, you talk about getting `means for each other variables in function of Age` which sounds like more than that. If @SmitM answered your question, you should accept his answer, otherwise, **please** include examples of your starting data and desired output. – divibisan Aug 06 '18 at 19:22
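If the goal is one row of column means per age, the grouping idea raised in the comments can be sketched in base R with `aggregate()` rather than with `dplyr`, which avoids creating a separate object per age with `assign`. This is only an illustration under assumptions: `toy`, `RegionA`, `RegionB`, and `AgeGroup` are hypothetical names, and ages of 85 and above are pooled into one group as in the question.

```r
# Hypothetical toy data: AGE plus two brain-region volume columns
toy <- data.frame(
  AGE = c(18.2, 18.7, 19.1, 19.5, 20.3),
  RegionA = c(10, 12, 9, 11, 8),
  RegionB = c(5, NA, 4, 6, 3)
)

# Whole-year age group; ages of 85 and above share group 85
toy$AgeGroup <- pmin(floor(toy$AGE), 85)

# One row per age group, one mean per measurement column.
# na.rm = TRUE is passed on to mean(), so a missing value is skipped
# only in the column where it occurs.
age_means <- aggregate(
  toy[, c("RegionA", "RegionB")],
  by = list(AgeGroup = toy$AgeGroup),
  FUN = mean,
  na.rm = TRUE
)
```

The result `age_means` is a single data frame with one row per age, which is usually easier to plot or join back to the original data than dozens of separately named tables.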

1 Answer


If you simply want an additional row with the means of each column, you can `rbind` the `colMeans` of your `df` like this:

df_new <- rbind(df, colMeans(df))
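Note that `colMeans(df)` stops with an error when `df` contains non-numeric columns, which is the case in the question. A possible variant, sketched here on made-up data, computes the mean only for numeric columns and fills the others with `NA`. The column names `ID`, `AGE`, and `RegionA` and the object `means_row` are assumptions for illustration.

```r
# Hypothetical toy data with one non-numeric column
df <- data.frame(
  ID = c("a", "b", "c"),
  AGE = c(18.2, 18.7, 18.9),
  RegionA = c(10, 12, NA),
  stringsAsFactors = FALSE
)

# One-row data frame: the mean for numeric columns, NA for everything else
means_row <- as.data.frame(lapply(df, function(col) {
  if (is.numeric(col)) mean(col, na.rm = TRUE) else NA
}))

# Append the row of means below the original observations
df_new <- rbind(df, means_row)
```

Here the last row of `df_new` holds the column means, and `ID` is `NA` in that row because a mean is not defined for it.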

SmitM