I am currently dealing with multiple time series within a grouped/nested dataframe and I do not know how to proceed. I hope that someone here can help me on this matter.
My dataset consists of 5 groups (the levels of Category), each of which contains 10 replicates (ID). Each replicate consists of 60 consecutive observations at regular time intervals (Time). Each observation in turn consists of multiple variables: Amount, Walk_1, Walk_2 and Walk_1+2. I also have three other variables (Amount1, Amount2, Amount3), which are the Amount variable split by area.
To show the structure schematically:
A { (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) { 1(1, 2, 3, 4, ... , 60); 2(1, 2, 3, 4, ... , 60); ..and so on.
B { (11, 12, 13, 14, 15, 16, 17, 18, 19, 20) { 11(1, 2, 3, 4, ... , 60); 12(1, 2, 3, 4, ... , 60); ..and so on.
.
.
E { (41, 42, 43, 44, 45, 46, 47, 48, 49, 50) { 41(1, 2, 3, 4, ... , 60); 42(1, 2, 3, 4, ... , 60); ..and so on.
What I want to know is whether the variables walk_1, walk_2 and walk_1+2:
- differ within and across the grouping variables Category or ID
- show an oscillatory/periodic trend over the 60 observations.
I tried to nest the dataset by ID and then apply the acf function to each group using mutate + map. This gives me a set of acf values for every ID. However, I do not know how to proceed with visualising and analysing the results.
Is this method suitable for my purpose, or should I use a different function?
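To make the nest + map step concrete, here is a minimal sketch of what I mean. The toy data, the helper name acf_tbl and the choice of 20 lags are assumptions for illustration only, not my real data or settings. The idea is to return the acf values as a tidy lag/acf table per ID instead of a list of acf objects.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Toy data standing in for the real dataset (assumption): 2 IDs x 60 time points
set.seed(1)
toy <- tibble(
  ID     = rep(1:2, each = 60),
  Time   = rep(1:60, times = 2),
  walk_1 = rnorm(120, mean = 10, sd = 4)
)

# Hypothetical helper: run acf() on one series and return a lag/acf table.
# na.action = na.pass lets acf() work with missing values instead of failing.
acf_tbl <- function(x, lag_max = 20) {
  a <- acf(x, lag.max = lag_max, plot = FALSE, na.action = na.pass)
  tibble(lag = as.vector(a$lag), acf = as.vector(a$acf))
}

# Nest by ID, compute the acf table per ID, then unnest to one long table
acf_by_id <- toy %>%
  group_by(ID) %>%
  nest() %>%
  mutate(acf = map(data, ~ acf_tbl(.x$walk_1))) %>%
  select(ID, acf) %>%
  unnest(cols = acf) %>%
  ungroup()
```

A long table like acf_by_id (one row per ID and lag) could then be plotted with lag on the x axis and acf on the y axis, one panel per ID, which is the kind of visualisation I am unsure how to interpret.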
EDIT: Since I was asked for a reproducible example, here is a simplified fake dataset with a similar structure.
library(dplyr)
library(tidyr)
library(purrr)

# 5 categories x 10 IDs x 60 time points = 3000 rows
dat1 <- data.frame(
  category = rep(1:5, each = 600),
  ID = rep(1:50, each = 60),
  walk_1 = rnorm(n = 3000, mean = 10, sd = 4),
  walk_2 = rnorm(n = 3000, mean = 5, sd = 5),
  amount = sample.int(50, 3000, replace = TRUE)
)

# Add the combined walk and nest the observations by ID
dat1 <- dat1 %>%
  group_by(ID) %>%
  mutate(walk_12 = walk_1 + walk_2) %>%
  nest()

# Add a time index (1 to 60) within each ID, then unnest again
dat1 <- dat1 %>%
  mutate(data = map(data, ~ mutate(.x, Time = seq(1, 60, by = 1)))) %>%
  unnest(cols = data)

# The first time point of every ID has no walk data
dat1 <- dat1 %>%
  mutate(
    walk_1 = ifelse(Time == 1, NA, walk_1),
    walk_2 = ifelse(Time == 1, NA, walk_2),
    walk_12 = ifelse(Time == 1, NA, walk_12)
  )
And here's the output of dput(head(dat1)):
structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L), category = c(1L,
1L, 1L, 1L, 1L, 1L), walk_1 = c(NA, 7.93428744179875, 11.6326574689602,
10.3687325793843, 6.2631358473095, 14.0134490135895), walk_2 = c(NA,
-4.03741066457775, 2.91290193315445, 8.04203547142631, 9.42608080771425,
16.2253066800552), amount = c(34L, 37L, 31L, 26L, 29L, 33L),
walk_12 = c(NA, 3.896876777221, 14.5455594021146, 18.4107680508106,
15.6892166550237, 30.2387556936447), Time = c(1, 2, 3, 4,
5, 6)), row.names = c(NA, -6L), groups = structure(list(ID = 1L,
.rows = structure(list(1:6), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = 1L, class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
@Marco Here, amount is a count of individuals, so measurements within the same ID should not be independent of the previous ones. Amount1, Amount2 and Amount3 are correlated in the sense that Amount1 + Amount2 + Amount3 = amount.
What I need to do is to understand whether walk_1, walk_2 and (consequently) walk_1+2 have periodicity and, if that is the case, whether this periodicity is the same among different IDs within or across the different levels of Category. Also, I do not have data for the first time point in each ID, so the time series to analyse has 59 points.
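To illustrate what I mean by "periodicity" for a single ID, here is a rough sketch using a raw periodogram from base R (spec.pgram) rather than acf. This is only one possible approach, not something I have settled on. The toy series (a sine wave with period 12 plus noise) and the helper name dominant_period are made up for the example. Dropping the missing value is only safe here because it sits at the very start of the series, so the remaining 59 points are still equally spaced.

```r
# Toy series standing in for one ID (assumption): period-12 sine plus noise,
# with the first time point missing, as in my real data
set.seed(42)
time <- 1:60
x <- sin(2 * pi * time / 12) + rnorm(60, sd = 0.3)
x[1] <- NA

# Hypothetical helper: drop the missing values, compute the raw periodogram,
# and return the period (in time steps) at which the spectrum peaks
dominant_period <- function(x) {
  x <- x[!is.na(x)]
  sp <- spec.pgram(x, taper = 0, detrend = TRUE, plot = FALSE)
  1 / sp$freq[which.max(sp$spec)]
}

period_hat <- dominant_period(x)
period_hat
```

Applying such a helper to every ID (for example over split(dat1$walk_1, dat1$ID)) would give one estimated period per ID, which could then be compared within and across Category. Whether a peak is meaningful rather than noise would still need a proper test.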
I hope that this clarifies my question. Thanks for your help!