
I am trying to use time-series clustering, following the excellent examples in https://cran.r-project.org/web/packages/dtwclust/vignettes/dtwclust.pdf . However, when I use partitional clustering, the results depend heavily on the seed. Is there an automated way to run the clustering with multiple seeds and keep the run that yields the lowest total within-cluster distance?

Thanks

Steve

2 Answers


Have you tried writing a for loop for this?

That is a great way to automate things!
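A minimal sketch of such a loop, assuming the `uciCT` sample data shipped with dtwclust, an arbitrary choice of `k = 4L`, and the sum of the `cldist` slot (distance of each series to its cluster centroid) as the selection criterion. The seed values and variable names are illustrative only:

```r
library(dtwclust)

# Sample data shipped with dtwclust; CharTraj is a list of univariate series
data("uciCT")
series <- CharTraj[1:20]

seeds <- 1:5          # illustrative seeds, pick your own
best <- NULL
best_dist <- Inf
best_seed <- NA

for (s in seeds) {
  fit <- tsclust(series, type = "partitional", k = 4L,
                 distance = "dtw_basic", centroid = "pam", seed = s)
  # Total distance of all series to their assigned centroid
  d <- sum(fit@cldist)
  if (d < best_dist) {
    best <- fit
    best_dist <- d
    best_seed <- s
  }
}
```

Keeping `best_seed` means the winning run can be reproduced later with a single call.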

Has QUIT--Anony-Mousse

You can pass a control argument with nrep > 1, for example control = partitional_control(nrep = 10L). That runs the clustering with multiple seeds and returns a list of clustering results, so you can choose which one to keep based on your criteria. You still need to pass one "starting" seed to tsclust though.

Documentation here.
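As a rough sketch of how this could look, assuming the `uciCT` sample data from dtwclust, `k = 4L`, and the total distance of each series to its centroid (the `cldist` slot) as the criterion. These choices are assumptions for illustration, not part of the question:

```r
library(dtwclust)

# Sample data shipped with dtwclust; CharTraj is a list of univariate series
data("uciCT")
series <- CharTraj[1:20]

# nrep = 10L repeats the partitional clustering 10 times with different seeds,
# all derived from the single starting seed passed to tsclust
results <- tsclust(
  series,
  type = "partitional",
  k = 4L,
  distance = "dtw_basic",
  centroid = "pam",
  seed = 42L,
  control = partitional_control(nrep = 10L)
)

# One total within-cluster distance per repetition
total_dist <- sapply(results, function(res) sum(res@cldist))

# Keep the repetition with the lowest total
best <- results[[which.min(total_dist)]]
```

Any other criterion, such as a cluster validity index, can replace `sum(res@cldist)` inside the `sapply` call.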

Alexis
  • To clarify, you should still pass a starting seed in order to have reproducible results. If you don't pass a seed, it will still try different seeds, but the starting one will be random each time. – Alexis Mar 15 '19 at 18:27