I'm trying to build a matrix of means for a single variable across 14 days, to eventually plot it as a time series (on each day, each user could have entered up to 5 values for that variable).

I've tried creating a loop with separate temps, but I keep running into either 'subscript out of bounds' (despite the number of columns being sufficient) or 'argument is not numeric or logical: returning NA' errors.

Completely new to R, so this is stressing me out a lot.

There are 41 participants with up to 5 recorded values per day on 8 different variables (some have fewer; those values are recorded as missing).

mat_varday <- matrix(nrow=nrow(as.data.frame(unique(data$ID))), ncol=14, NA) 

for(i in 1:41)  {                # loop through participants
  temp <- filter(data, ID == unique(data$ID)[i])
  for(j in 1:nrow(as.data.frame(unique(data$dayvar)))) {   # loop through days
    temp1 <- filter(temp, dayvar == unique(data$dayvar)[j])
    mat_varday[i,j] <- mean(temp1[,2], na.rm = TRUE)    
  }
}  

# plot time series
plot(colMeans(mat_varday, na.rm = TRUE), type="b", ylim=c(0,5),
     xlab="days", ylab="Total mean of boredom for all people")

I expect to get a matrix with the mean score of variable 2, per user per day.

  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Make sure `data` is defined in the question. – MrFlick May 08 '19 at 19:39

1 Answer

Consider aggregate for grouping by multiple variables. Be sure to replace variable2 with the actual name of your column:

agg_df <- aggregate(variable2 ~ ID + dayvar, data, FUN=mean)
agg_df
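
To see what aggregate returns, here is a minimal sketch on made-up data. The data frame `toy` and its values are hypothetical, not the asker's data; only the column names follow the code above.

```r
# Hypothetical toy data: 2 participants, 2 days, up to 2 entries per day
toy <- data.frame(
  ID        = c(1, 1, 1, 2, 2, 2),
  dayvar    = c(1, 1, 2, 1, 2, 2),
  variable2 = c(2, 4, 3, NA, 1, 5)
)

# One row per observed ID/day combination. The formula interface drops
# rows with NA by default (na.action = na.omit), so participant 2 on
# day 1, whose only entry is missing, does not appear in the result.
agg_toy <- aggregate(variable2 ~ ID + dayvar, data = toy, FUN = mean)
agg_toy
```

Note that the result is in long format (one row per group), which is handy for inspection but not directly usable with colMeans.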

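And for plotting, consider tapply to build the needed matrix, with participants in rows and days in columns so that colMeans returns one mean per day. Pass na.rm=TRUE so that missing entries do not turn a whole cell into NA:

mat_varday <- with(data, tapply(variable2, list(ID, dayvar), FUN=mean, na.rm=TRUE))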
mat_varday

plot(colMeans(mat_varday, na.rm = TRUE), type="b", ylim=c(0,5),
     xlab="days", ylab="Total mean of boredom for all people")
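
The orientation of the matrix matters here, so a small check on made-up data may help. This is only a sketch: `toy`, `mat_toy`, and `day_means` are hypothetical names, and the values are invented. With `list(ID, dayvar)` the rows are participants and the columns are days, so colMeans averages across participants for each day.

```r
# Hypothetical toy data: 2 participants, 2 days, up to 2 entries per day
toy <- data.frame(
  ID        = c(1, 1, 1, 2, 2, 2),
  dayvar    = c(1, 1, 2, 1, 2, 2),
  variable2 = c(2, 4, 5, NA, 1, 5)
)

# Rows = participants, columns = days; na.rm = TRUE is passed on to mean()
mat_toy <- with(toy, tapply(variable2, list(ID, dayvar), FUN = mean, na.rm = TRUE))
dim(mat_toy)   # participants x days

# A cell whose entries are all missing becomes NaN; colMeans skips it
# when na.rm = TRUE, leaving one mean per day
day_means <- colMeans(mat_toy, na.rm = TRUE)
day_means
```

If the index list is given as `list(dayvar, ID)` instead, the matrix is transposed, and rowMeans would be the call that yields per-day means.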
Parfait