Here is my question: I have data with 3000 observations and 5000 features. The 3000 observations have numeric names like 100.1, 100.3, 100.5, 100.7. I converted the names to integers with `segs <- as.integer(names)`, and now I want to use `segs` as a factor to sum each of the 5000 features within each group. `segs` has 300 distinct values, so the final data frame should be 300 by 5000. I know `tapply` can compute the sum by factor for one variable, but I have to use a `for` loop to get all 5000 features summed. That is really time-consuming, so I would like to know whether there is a cleaner way in R to do this, or whether some package solves this kind of problem.
This is my dirty code; `df0` is the data and `df` is what I want:
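# Sum every feature column within the groups given by segs.
# Assumes segs is stored as the last column of df0, so that column is skipped.
df <- data.frame(row.names = sort(unique(df0$segs)))  # one row per group, no columns yet
for (i in seq_len(ncol(df0) - 1)) {  # same columns as 2:ncol(df0)-1, i.e. 1 to ncol(df0)-1
  temp <- tapply(df0[, i], df0$segs, sum)
  df <- cbind(df, temp)
}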
Thanks!
=====
Thanks, Roland. Some demo data is shown below:
set.seed(42)
df0 <- data.frame(
  X = rnorm(100, 10, 10),
  Y = rnorm(100),
  Z = rnorm(100))
df0$seq <- as.integer(df0$X)
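
For reference, here is a minimal sketch of the result I am after on this demo data, using two base R functions that sum all columns by group in a single call: `rowsum()` and `aggregate()`. The names `feature_cols`, `res_rowsum`, `res_agg` and `check_X` are only illustrative, and the grouping column is called `seq` here, as in the demo data. This is one possible approach under those assumptions, not necessarily the fastest one for 3000 by 5000 data.

```r
# Demo data from above (the grouping column is called `seq` here).
set.seed(42)
df0 <- data.frame(
  X = rnorm(100, 10, 10),
  Y = rnorm(100),
  Z = rnorm(100))
df0$seq <- as.integer(df0$X)

# Every column except the grouping one (illustrative helper name).
feature_cols <- setdiff(names(df0), "seq")

# Option 1: rowsum() sums all columns within each group in one call;
# the group values end up as the row names of the result.
res_rowsum <- rowsum(df0[, feature_cols], group = df0$seq)

# Option 2: aggregate() with a formula; the group becomes the first column.
res_agg <- aggregate(. ~ seq, data = df0, FUN = sum)

# Cross-check one column against the tapply() approach from the loop.
check_X <- tapply(df0$X, df0$seq, sum)
stopifnot(isTRUE(all.equal(unname(res_rowsum$X), as.vector(check_X))))
```

Both results have one row per distinct value of `seq` and one column per feature. `rowsum()` keeps the groups as row names, while `aggregate()` returns them as a regular column.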