
I am analyzing how unique users' ratings for different movies change over time. To find significant changepoints, I am using the 'changepoint' R package with the PELT method.

  • I understand that there are different types of penalties, but I am still unsure which one to use.
  • I tried to make an elbow plot to find the optimal number of changepoints, but could not get it to work. What I have so far is shown below, using the movie "Inception" as an example.
  • Are all detected changepoints significant? Is there a way to test their significance?

My data (`timestamp_date` is the date; `cummean` is the mean of all ratings for that day):

timestamp_date  cummean
18-07-2010  4.15384615
19-07-2010  4.23809524
20-07-2010  4.23880597
21-07-2010  4.24390244
22-07-2010  4.19387755
23-07-2010  4.21186441
24-07-2010  4.23758865
25-07-2010  4.28804348
26-07-2010  4.32126697
27-07-2010  4.34063745
28-07-2010  4.36330935
29-07-2010  4.35521886
30-07-2010  4.35448916
31-07-2010  4.34005764
1-08-2010   4.34741144
2-08-2010   4.35604113
3-08-2010   4.34725537
4-08-2010   4.33073497
5-08-2010   4.34051724
6-08-2010   4.34114053
7-08-2010   4.3467433
8-08-2010   4.32909091
9-08-2010   4.32901554
10-08-2010  4.32171799
11-08-2010  4.32316119
12-08-2010  4.32375189
13-08-2010  4.32532751
14-08-2010  4.32932011
15-08-2010  4.32855191
16-08-2010  4.33266932
17-08-2010  4.33246415
18-08-2010  4.33312102
19-08-2010  4.32982673
20-08-2010  4.33212121
21-08-2010  4.33195755
22-08-2010  4.33198614
23-08-2010  4.33370913
24-08-2010  4.3342511
25-08-2010  4.33441208
26-08-2010  4.33439153
27-08-2010  4.33541018
28-08-2010  4.331643
29-08-2010  4.32954545
30-08-2010  4.32992203
31-08-2010  4.330468
1-09-2010   4.33002833
2-09-2010   4.32679739
3-09-2010   4.32763401
4-09-2010   4.33091568
5-09-2010   4.33081033
6-09-2010   4.3289358
7-09-2010   4.33072917
8-09-2010   4.33104631
9-09-2010   4.33347422
10-09-2010  4.33430962
11-09-2010  4.33251029
12-09-2010  4.33292782
13-09-2010  4.33360129
14-09-2010  4.33359936
15-09-2010  4.33307024
16-09-2010  4.33268025
17-09-2010  4.33256528
18-09-2010  4.33358548
19-09-2010  4.33247232
20-09-2010  4.33734088
21-09-2010  4.33758621
22-09-2010  4.34044715
23-09-2010  4.34026846
24-09-2010  4.33878505
25-09-2010  4.33542631
26-09-2010  4.33409836
27-09-2010  4.33268482
28-09-2010  4.3332256
29-09-2010  4.33451157
30-09-2010  4.33545108
1-10-2010   4.33470032
2-10-2010   4.33550995
3-10-2010   4.33374384
4-10-2010   4.33455882
5-10-2010   4.33638026
6-10-2010   4.33704819
7-10-2010   4.33871933
8-10-2010   4.33881579
9-10-2010   4.33718861
10-10-2010  4.33931725
11-10-2010  4.34020918
12-10-2010  4.33927545
13-10-2010  4.33714286
14-10-2010  4.33730835
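
To make the example easier to reproduce, here is a minimal sketch of how a few rows of the table above could be loaded with proper `Date` values. The data frame name `inception` matches the code in the question; the four rows are an excerpt of the 89 rows in the table, and the `"%d-%m-%Y"` format is an assumption based on how the dates are printed.

```r
# Excerpt of the table above (the full table has 89 rows).
inception <- data.frame(
  timestamp_date = c("18-07-2010", "19-07-2010", "20-07-2010", "1-08-2010"),
  cummean        = c(4.15384615, 4.23809524, 4.23880597, 4.34741144),
  stringsAsFactors = FALSE
)

# Day-month-year strings need an explicit format to become Date objects.
inception$timestamp_date <- as.Date(inception$timestamp_date, format = "%d-%m-%Y")

str(inception)
```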

My code:

library(changepoint)  # needed for cpts(), cpts.ts(), param.est(), ncpts()

inds <- seq(as.Date("2010-07-18"), as.Date("2010-10-14"), by = "day")
myts <- ts(inception$cummean, start = c(2010, as.numeric(format(inds[1], "%j"))), frequency = 365)

# Single changepoint: AMOC (the default method)
cpt <- changepoint::cpt.meanvar(myts)
cpts(cpt)
cpts.ts(cpt)
param.est(cpt)
plot(cpt)
summary(cpt)

# Multiple changepoints: PELT
mcpt <- changepoint::cpt.meanvar(myts, method = "PELT")
cpts(mcpt)
cpts.ts(mcpt)
param.est(mcpt) 
ncpts(mcpt) 
plot(mcpt)
summary(mcpt)
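
For the elbow plot, one possible route is the `penalty = "CROPS"` option of the 'changepoint' package, which runs PELT over a range of penalty values and provides a diagnostic plot of the number of changepoints against the penalty. The following is only a sketch under assumptions: the series `x` is simulated rather than the Inception data, and the penalty range `c(5, 500)` is an arbitrary choice for illustration.

```r
library(changepoint)

# Simulated stand-in series (an assumption, not the Inception data):
# three segments with different means and variances.
set.seed(1)
x <- c(rnorm(50, 0, 1), rnorm(50, 5, 2), rnorm(50, 10, 1))

# CROPS explores the penalties in the range given by pen.value
# (c(5, 500) is an arbitrary range chosen for illustration).
crops <- cpt.meanvar(x, method = "PELT", penalty = "CROPS",
                     pen.value = c(5, 500))

pen.value.full(crops)  # penalty values at which the segmentation changes
cpts.full(crops)       # changepoint locations for each segmentation

# Diagnostic ("elbow") plot: number of changepoints against penalty value.
plot(crops, diagnostic = TRUE)
```

The elbow in the diagnostic plot only suggests a number of changepoints; choosing it remains a judgment call rather than a formal significance test.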

Also, because I am using a `ts` object, the dates on the x-axis do not display correctly when I plot the result. What am I doing wrong?
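
One possible workaround for the date axis, sketched here in base R, is to skip the `ts` time index when plotting: plot the values against the `Date` vector and draw the changepoints as vertical lines. In this sketch `y` is a simulated stand-in for `inception$cummean`, and `cp_idx` holds hypothetical changepoint positions; with the real fit they would come from `cpts(mcpt)`.

```r
# Dates as in the question; y is a simulated stand-in for inception$cummean.
inds <- seq(as.Date("2010-07-18"), as.Date("2010-10-14"), by = "day")
set.seed(1)
y <- 4.33 + rnorm(length(inds), sd = 0.01)

# Hypothetical changepoint indices; in practice use cpts(mcpt).
cp_idx <- c(10, 30)

# Map the indices back to calendar dates.
cp_dates <- inds[cp_idx]

# Plot on a Date x-axis and mark the changepoints with dashed lines.
plot(inds, y, type = "l", xlab = "Date", ylab = "cummean")
abline(v = cp_dates, col = "red", lty = 2)

print(cp_dates)
```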

Thank you!!

  • I'm not sure what you're actually asking; this may need more focus. Also, whether your method is correct is off-topic on Stack Overflow; you'd be better off asking that on [Cross Validated](https://stats.stackexchange.com/help/on-topic). – jay.sf Jun 18 '20 at 08:58
  • If I understand correctly, you have two questions about methods ("which method to use?" and "are the changepoints significant?"). These, as mentioned by @jay.sf, are more likely to be answered on Cross Validated. Your question "how do I make my elbow plot?" is on the right forum. Can you please make your example [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), for example by using `dput()` or a base R dataset (such as `iris` or `mtcars`)? – Paul Jun 18 '20 at 09:37
