What you need is a way to group your column of interest (detrend) by month. There are ways to do this in "vanilla R", but the most effective way is to use tidyverse's dplyr.
I will use the example taken directly from that page:
mtcars %>%
  group_by(cyl) %>%
  summarise(disp = mean(disp), sd = sd(disp))
In your case, that would be:
library(dplyr)  # provides %>%, group_by() and summarize()

by_month <- dataSet %>%
  group_by(month) %>%
  summarize(avg = mean(detrend))
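If you want to try this before running it on your own data, here is a minimal sketch with a made-up data frame. The name toy and all of its values are invented for illustration; only the month and detrend column names come from your question.

```r
library(dplyr)

# Hypothetical data, for illustration only
toy <- data.frame(
  month   = c("Jan", "Jan", "Feb", "Feb", "Mar"),
  detrend = c(1, 3, 2, 6, 5)
)

# One row per month, with the mean of detrend in the avg column
by_month <- toy %>%
  group_by(month) %>%
  summarize(avg = mean(detrend))

by_month
```

One thing to watch for: if month is stored as text, the rows will probably come back in alphabetical rather than calendar order. Assuming your months are three-letter English abbreviations, converting the column with factor(month, levels = month.abb) before grouping should keep them in calendar order.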
This "tidyverse" style looks quite different from base R, and you seem quite new, so I'll explain what's happening (sorry if this is overly obvious):
- First, we grab the dataframe, which I'm calling dataSet.
- Then we pipe that dataframe to our next function, which is group_by. Piping (the %>% operator) means taking the result of the previous step (which in this case is just the dataframe dataSet) and using it as the first argument of the next function. The function group_by takes a dataframe as its first argument.
- Then the result of group_by is piped to the next function, which is summarize (or summarise if you're from down under, as the author is). On its own, summarize calculates over all the data in the column; however, group_by has split that column into partitions. So we now get the mean calculated for each partition, which here is each month.
- This is the key: group_by creates "flags" so that summarize calculates the function (mean, in this case) separately on each group. So, for instance, all of the Jan values are grouped together and the mean is calculated only on them. Then the mean is calculated for all of the Feb values, and so on.
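To make the piping and grouping points concrete, here is a rough sketch of the same calculation written three ways: piped, as nested calls without the pipe, and in vanilla R using aggregate(). The data frame toy and its values are made up; only the month and detrend column names are taken from your question.

```r
library(dplyr)

# Hypothetical data, for illustration only
toy <- data.frame(
  month   = c("Jan", "Jan", "Feb", "Feb", "Mar"),
  detrend = c(1, 3, 2, 6, 5)
)

# Piped: each result becomes the first argument of the next function
piped <- toy %>%
  group_by(month) %>%
  summarize(avg = mean(detrend))

# The same steps without the pipe, read from the inside out
nested <- summarize(group_by(toy, month), avg = mean(detrend))

# Vanilla R: mean of detrend for each month (result column is named detrend)
base_r <- aggregate(detrend ~ month, data = toy, FUN = mean)
```

All three should give the same per-month means; the piped version just reads top to bottom in the order the steps happen.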
HTH!!