I am trying to cluster time series data in Python using different clustering techniques. K-means didn't give good results. The images below show what I get from agglomerative clustering. I also tried dynamic time warping (DTW), and the two give similar results.
What I would ideally like is two separate clusters for the time series in the second image. The first image is a cluster of rapid increases, the second is a cluster with no increase (i.e. stable), and the third is a cluster of decreasing trends. I would like to know which time series are both stable and popular (by popular, I mean a high count). I tried hierarchical clustering, but the result had far too many levels and I am not sure how to pick the level at which to cut the hierarchy. Can someone shed light on how to split the time series in the second image into two clusters, one with low counts and the other with high counts? Is that possible, or should I just pick a threshold visually and cut them in two?
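To make the "pick a threshold" option concrete, here is a minimal sketch of a data-driven version of it. It assumes the stable cluster is available as an array of shape (n_series, n_timesteps) and that the low and high groups are separated by a visible gap in their average level. The function name `split_by_largest_gap` and the synthetic data are hypothetical, not my real data.

```python
import numpy as np

def split_by_largest_gap(series):
    """Split series into low/high groups at the largest gap in log mean count.

    series: array-like of shape (n_series, n_timesteps), n_series >= 2.
    Returns a boolean mask (True = high count) and the threshold in count units.
    """
    series = np.asarray(series, dtype=float)
    # Summarise each series by its average level; the log keeps very
    # popular series from dominating the scale.
    level = np.log1p(series.mean(axis=1))
    ordered = np.sort(level)
    # The widest gap between consecutive sorted levels is the cut point.
    i = int(np.argmax(np.diff(ordered)))
    threshold = (ordered[i] + ordered[i + 1]) / 2.0
    return level > threshold, float(np.expm1(threshold))

# Synthetic "stable" cluster: five low-count and five high-count series.
rng = np.random.default_rng(0)
low = rng.poisson(20, size=(5, 30))
high = rng.poisson(2000, size=(5, 30))
stable = np.vstack([low, high])

is_high, threshold = split_by_largest_gap(stable)
print(is_high, threshold)
```

If the levels form a continuum rather than two groups, the largest gap is arbitrary, so this only helps when the separation is real.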
Cluster with rapid increases:
Cluster with stable counts:
Cluster with decreasing trends:
This is very vague, but here is the result of my hierarchical clustering:
I know this particular image is not useful at all, but this is where I hit a dead end.
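For reference, this is roughly how a hierarchy can be cut into a fixed number of flat clusters with SciPy instead of reading a level off the dendrogram. It is a minimal sketch on synthetic data. The choice of Ward linkage and of three clusters are assumptions for illustration, not what produced the images above.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic data: four rising, four flat and four falling series.
rng = np.random.default_rng(0)
t = np.arange(30)
rising = 10 + 5 * t + rng.normal(0, 1, size=(4, 30))
flat = 100 + rng.normal(0, 1, size=(4, 30))
falling = 200 - 5 * t + rng.normal(0, 1, size=(4, 30))
X = np.vstack([rising, flat, falling])

# Build the full hierarchy once (Ward linkage on Euclidean distance).
Z = linkage(X, method="ward")

# Ask for at most 3 flat clusters rather than choosing a cut height by eye.
labels = fcluster(Z, t=3, criterion="maxclust")
print(labels)
```

With `criterion="maxclust"` the number of clusters is fixed up front, so the depth of the dendrogram no longer matters. The open question is still which features the distance should be computed on.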
In general, if you want to differentiate trends, say for YouTube videos, how do only some videos get picked for the "trending" section and others for the "trending this week" section? As I understand it, the "trending" videos are the ones that show characteristics similar to the first image. The "trending this week" section is a collection of videos that have very high view counts but are quite stable (i.e. not showing rapid increases). I know that YouTube considers many other factors in addition to view counts. What I am trying to do with the second image is similar to the "trending this week" section: I would like to pick the series that have very high counts. How do I split the time series in this case?
I know DTW captures trends. DTW gave the same results as the images above. It identified the trend in the second image, which is "stable", but it doesn't capture the "count" element. I want both the trend and the count to be captured, in this case stable and high count.
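To illustrate what "capturing both" could look like, here is a minimal sketch that describes each series by two features, a relative slope for the trend and a log mean for the count, and clusters on those instead of on the raw values. The feature choice, the function name `trend_and_level_features`, the number of clusters and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def trend_and_level_features(X):
    """Return one (relative slope, log mean level) pair per series."""
    X = np.asarray(X, dtype=float)
    t = np.arange(X.shape[1])
    # Least-squares slope of every series at once (polyfit accepts 2-D y).
    slope = np.polyfit(t, X.T, 1)[0]
    mean = X.mean(axis=1)
    features = np.column_stack([slope / mean, np.log1p(mean)])
    # Standardise so trend and level contribute on comparable scales.
    return (features - features.mean(axis=0)) / features.std(axis=0)

# Synthetic data: stable/low, stable/high, rising and falling series.
rng = np.random.default_rng(0)
t = np.arange(30)
stable_low = 20 + rng.normal(0, 1, size=(3, 30))
stable_high = 2000 + rng.normal(0, 20, size=(3, 30))
rising = 20 + 60 * t + rng.normal(0, 5, size=(3, 30))
falling = 1800 - 60 * t + rng.normal(0, 5, size=(3, 30))
X = np.vstack([stable_low, stable_high, rising, falling])

F = trend_and_level_features(X)
labels = fcluster(linkage(F, method="ward"), t=4, criterion="maxclust")
print(labels)
```

In this toy case the stable series separate into a low-count and a high-count cluster because the level is an explicit feature, which is the behaviour I am after.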
The above images are time series clustered based on counts. Am I missing out on any other clustering techniques that could achieve this? Even with just counts, how do I cluster differently according to my needs?
Any ideas would be much appreciated. Thanks in advance!