
This is my Dataframe: https://gofile.io/?c=7WLqCD

It looks like this:

head(testframe)

       Time         Station1  Station2  Station3  Station4
 01.01.2017 07:00      27         38         26         25
 01.01.2017 14:00      22         49         25         16
 01.01.2017 21:00      41         53         46         36
 02.01.2017 07:00      22         38         26         19
 02.01.2017 14:00      20         54         35         13
 02.01.2017 21:00      36         45         30         26

I want to calculate the mean of Station1 to Station4 for every day, i.e. over rows 1-3, rows 4-6, rows 7-9 and so on.

class(testframe$Station1) is factor, and I know it has to be numeric to calculate the mean. So I tried to convert it like this:

testframe[,4] = as.numeric(as.character(testframe$Station4))

This does not work. I have missing values that are marked as #. I replaced them with NA, but there are still problems with Station 3 and Station 4.

This code to calculate the mean doesn't work either. It gives me the wrong results.

colMeans(matrix(testframe$Station1, nrow=3))
Essi

2 Answers


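EDIT (after OP's changes), with dplyr:

library(dplyr)

# "row.names" is the column read.table creates for the date part (see Data below)
df %>%
  rename(Date = row.names) %>%
  group_by(Date) %>%
  summarise_at(vars(contains("S")), list(Mean = mean))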
# A tibble: 2 x 5
  Date       Station1_Mean Station2_Mean Station3_Mean Station4_Mean
  <chr>              <dbl>         <dbl>         <dbl>         <dbl>
1 01.01.2017            30          46.7          32.3          25.7
2 02.01.2017            26          45.7          30.3          19.3

Data:

df<-read.table(text="       Time         Station1  Station2  Station3  Station4
 01.01.2017 07:00      27         38         26         25
               01.01.2017 14:00      22         49         25         16
               01.01.2017 21:00      41         53         46         36
               02.01.2017 07:00      22         38         26         19
               02.01.2017 14:00      20         54         35         13
               02.01.2017 21:00      36         45         30         26",header=T,
               as.is=T,fill=T,row.names = NULL)
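
The rename() step above relies on read.table splitting the date into a `row.names` column. As a rough sketch only, here is the same per-day grouping for a frame where date and time share one `Time` column, as in the question's head(testframe). It uses base R so no packages are assumed. The `testframe` built here is a hypothetical two-station stand-in, and `Date` and `daily` are names made up for the example.

```r
# Hypothetical stand-in for testframe: date and time share one "Time" column
testframe <- data.frame(
  Time = c("01.01.2017 07:00", "01.01.2017 14:00", "01.01.2017 21:00",
           "02.01.2017 07:00", "02.01.2017 14:00", "02.01.2017 21:00"),
  Station1 = c(27, 22, 41, 22, 20, 36),
  Station2 = c(38, 49, 53, 38, 54, 45),
  stringsAsFactors = FALSE
)

# Keep only the date part (everything before the first space)
testframe$Date <- sub(" .*$", "", testframe$Time)

# Mean per date for each station column; na.rm = TRUE is passed on to mean()
daily <- aggregate(testframe[c("Station1", "Station2")],
                   by = list(Date = testframe$Date),
                   FUN = mean, na.rm = TRUE)
daily
```

Grouping on the date string rather than on fixed blocks of three rows should keep working if a day has more or fewer than three observations.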

Original answer (mean of every 3rd row):

We can do the following (I've dropped the non-numeric columns):

colMeans(df[seq(3, nrow(df), 3), -c(1, 2)])
Station1 Station2 Station3 Station4 
    38.5     49.0     38.0     31.0 

Data:

df<-structure(list(row.names = c("01.01.2017", "01.01.2017", "01.01.2017", 
"02.01.2017", "02.01.2017", "02.01.2017"), Time = c("07:00", 
"14:00", "21:00", "07:00", "14:00", "21:00"), Station1 = c(27L, 
22L, 41L, 22L, 20L, 36L), Station2 = c(38L, 49L, 53L, 38L, 54L, 
45L), Station3 = c(26L, 25L, 46L, 26L, 35L, 30L), Station4 = c(25L, 
16L, 36L, 19L, 13L, 26L)), class = "data.frame", row.names = c(NA, 
-6L))
NelsonGon
  • Thanks! I think I explained my question badly. I want the mean of rows 1-3, 4-6, 7-9 and so on, so in my example it would be the mean of all observations of one day. – Essi May 07 '19 at 15:25
  • That's a slightly different approach. Please edit your question by adding your data as `dput(head(mydata,10))`. – NelsonGon May 07 '19 at 15:27
  • @NelsonGon's conversion to date here should be noted. While it doesn't _exactly_ answer what OP asked (row groups of 3), given the background OP gave, it's almost certainly a better way to go about it. – zack May 07 '19 at 15:41
  • Right! Thanks, I'll add what led to the change. – NelsonGon May 07 '19 at 15:43

You probably need something like this:

library(dplyr)
df %>%
  group_by(group = gl(n()/3, 3)) %>%
  summarise_at(-1, mean, na.rm = TRUE)

#  group Station1 Station2 Station3 Station4
#  <fct>    <dbl>    <dbl>    <dbl>    <dbl>
#1  1         30     46.7     32.3     25.7
#2  2         26     45.7     30.3     19.3
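
Note that na.rm = TRUE only helps once the station columns are numeric. A minimal sketch of the conversion for the "#" markers mentioned in the question follows. The vector here is hypothetical, since the real file is not shown, and `station`, `values` and `station_num` are names invented for the example.

```r
# Hypothetical column mimicking the factor from the question; "#" marks a gap
station <- factor(c("27", "22", "#", "19"))

# as.numeric() directly on a factor returns its internal level codes,
# not the values printed on screen
codes <- as.numeric(station)

# Go through character instead, and turn the marker into NA explicitly
values <- as.character(station)
values[values == "#"] <- NA
station_num <- as.numeric(values)

# The mean now skips the gap
mean(station_num, na.rm = TRUE)
```

If the data come from read.table or read.csv, passing na.strings = "#" at import may avoid the factor columns in the first place.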
Ronak Shah
  • I just updated at the same time unfortunately but they're slightly different approaches all the same. – NelsonGon May 07 '19 at 15:37
  • @NelsonGon No problem, at least we could verify our answers give the same value :) – Ronak Shah May 07 '19 at 15:38
  • Thank you so much! What do I have to change to get the mean over every 24 rows? Does gl(n()/3, 3) mean: take 3 rows together and divide by 3? And how can I find the maximum of every 24 rows? – Essi May 08 '19 at 11:34
  • @Essi to get mean of every 24 rows you could do, `df %>% group_by(group = gl(n()/24, 24)) %>% summarise_at(-1, mean, na.rm = TRUE)`. You can check how it works by doing `gl(12/3, 3)`. It creates 4 groups with length of 3 each. – Ronak Shah May 08 '19 at 11:38
  • Thanks, it is working! One more thing: how do I round the means in that formula so that I get whole numbers, like 47 instead of 46.7? – Essi May 08 '19 at 11:56
  • @Essi you can use `round` like `df %>% group_by(group = gl(n()/3, 3)) %>% summarise_at(-1, list(~round(mean(., na.rm = TRUE))))` – Ronak Shah May 08 '19 at 12:14
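
To make the gl() grouping from the comments concrete, here is a small base R sketch. It shows what gl(12/3, 3) produces and uses it for both a per-block maximum and a rounded per-block mean. The vector `x` is made-up data standing in for one station column, and the sketch assumes the number of rows is a multiple of the block size.

```r
# gl(k, m) builds a factor with k levels, each repeated m times
g <- gl(12 / 3, 3)
as.integer(g)  # 1 1 1 2 2 2 3 3 3 4 4 4

# Hypothetical values standing in for one station column (12 rows)
x <- c(27, 22, 41, 22, 20, 36, 30, 31, 29, 18, 25, 24)

# Maximum of every block of 3 rows; use gl(length(x) / 24, 24) for blocks of 24
block_max <- tapply(x, g, max, na.rm = TRUE)

# Mean of every block, rounded to whole numbers
block_mean <- round(tapply(x, g, mean, na.rm = TRUE))
```

So gl() only labels the rows; the summary function applied per label (mean, max, ...) does the actual calculation. In the dplyr pipeline the same swap should apply: replace mean with max inside summarise_at.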