0

I have a set of training data, with data recorded every second.

Rows: 897
Columns: 8
$ session.date <dttm> 2020-04-09 07:12:00, 2020-04-09 07:12:00, 2020-04-09 07:12:00, 2020-04-09 07:12:00, 2020-04-09 07:12:00, 2020-04-09 07:12:00, 2020-04-09 07:12:00, 2020-04-09 07:12:00, 2020-04-09 07:12:00, 2020...
$ secs         <int> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, ...
$ power        <dbl> 187.000, 187.000, 187.000, 187.000, 194.250, 201.500, 208.750, 216.000, 225.000, 234.000, 243.000, 246.000, 249.000, 252.000, 255.000, 252.000, 249.000, 246.000, 245.250, 244.500, 243.750, 243.0...
$ distance     <dbl> 0.0000, 5.0000, 10.0000, 13.9900, 17.9925, 21.9950, 25.9975, 30.0000, 35.6633, 41.3267, 46.9900, 50.9925, 54.9950, 58.9975, 63.0000, 68.3333, 73.6667, 79.0000, 83.2500, 87.5000, 91.7500, 96.0000...
$ cadence      <dbl> 0.0000, 0.0000, 0.0000, 0.0000, 4.2500, 8.5000, 12.7500, 17.0000, 16.6667, 16.3333, 16.0000, 16.2500, 16.5000, 16.7500, 17.0000, 16.6667, 16.3333, 16.0000, 16.0000, 16.0000, 16.0000, 16.0000, 16...
$ heart_rate   <dbl> 0.000, 50.000, 100.000, 102.000, 104.000, 106.000, 108.000, 110.000, 112.333, 114.667, 117.000, 118.500, 120.000, 121.500, 123.000, 125.333, 127.667, 130.000, 130.750, 131.500, 132.250, 133.000,...
$ speed        <dbl> 4.055000, 4.055000, 4.055000, 4.055000, 4.105750, 4.156500, 4.207250, 4.258000, 4.313333, 4.368667, 4.424000, 4.442000, 4.460000, 4.478000, 4.496000, 4.480000, 4.464000, 4.448000, 4.442000, 4.43...
$ min          <dbl> 0.00, 0.02, 0.03, 0.05, 0.07, 0.08, 0.10, 0.12, 0.13, 0.15, 0.17, 0.18, 0.20, 0.22, 0.23, 0.25, 0.27, 0.28, 0.30, 0.32, 0.33, 0.35, 0.37, 0.38, 0.40, 0.42, 0.43, 0.45, 0.47, 0.48, 0.50, 0.52, 0....

This data is from an interval set where there is 4 x 90 secs of "work" and 30 sec of rest. I would like to average the power for each 90 second work period and then get the overall average for the "work" section of the session. I am able to this individually for the time periods I want.

library(dplyr)
df <- Data %>%
filter(min>=0 & min <= 1.5) %>%
group_by(session.date) %>%
summarise(ave_power = mean(power))

However this seems tedious to do for all the different time points I want to get the mean for. i.e. 0-1.5 min, 2-3.5 min, 4-5.5 min, 6-7.5 min and then to average those.

Does anyone have any suggestions on a quicker/ more efficient way to do this? Thank you

Earl Mascetti
  • 1,278
  • 3
  • 16
  • 31
wattss
  • 61
  • 5

2 Answers2

0

One method could be to use modular arithmetic to group the date into 120 second sections. This only works if the data is in intervals of 90 + 30 seconds, as you say.

Start by generating some sample data. We can use 4 * 120 = 480 seconds of data with some randomly generated power readings.

library(tidyverse)

df <- tibble(secs = 0:479, power = runif(480, 180, 250))

We can then group the data into 120 second long chunks by taking the integer division of secs by 120. The seconds by group can also be calculated as mod(secs, 120).

df <- df %>% 
  mutate(
    group = secs %/% 120 + 1,
    secs_group = secs %% 120
  ) %>% 
  mutate(period = if_else(secs_group < 90, "work", "rest")) 

# # A tibble: 480 x 5
#     secs power group secs_group period
#    <int> <dbl> <dbl>      <dbl> <chr> 
#  1     0  230.     1          0 work  
#  2     1  246.     1          1 work  
#  3     2  248.     1          2 work  
#  4     3  232.     1          3 work  
#  5     4  223.     1          4 work  
#  6     5  243.     1          5 work  
#  7     6  243.     1          6 work  
#  8     7  194.     1          7 work  
#  9     8  192.     1          8 work  
# 10     9  247.     1          9 work  
# # ... with 470 more rows

I've also identified each record as either 'work' or 'rest' depending on if it's in the first 90 seconds of the group.

The average power can then be calculated for each group

df %>% 
  filter(period == "work") %>% 
  group_by(group) %>% 
  summarise(mean_power = mean(power))

# # A tibble: 4 x 2
#   group mean_power
# * <dbl>      <dbl>
# 1     1       216.
# 2     2       216.
# 3     3       217.
# 4     4       214.
Callum Savage
  • 341
  • 2
  • 7
0

Please consider making a reproducible example (including data) as per How to make a great R reproducible example

To get the average for each 90 second period you can make a new variable with the different time intervals and then group by these:

library(dplyr)
Data %>%
  mutate(Cuts = cut(min, seq(from = 0, to = 60, by = 1.5))) %>%
  group_by(session.data, Cuts) %>%
  summarise(ave_power = mean(power)) %>%
  ungroup()
Earl Mascetti
  • 1,278
  • 3
  • 16
  • 31
robertdj
  • 909
  • 1
  • 5
  • 10