How to cluster a time series using KMeans in python

Question

I have data in the form [UID obj1 obj2..] x timestamp, and I want to cluster it in Python using KMeans from sklearn. Where should I start?

EDIT:

So basically I'm trying to cluster users based on clickstream data, and classify them based on usage patterns.

Could you [create a Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve)? — Anton Protopopov, Feb 09 '16 at 04:38
Could you give an example of what you are trying to achieve? — Neil, Feb 09 '16 at 06:46
Duplicate question: http://stackoverflow.com/questions/3503668/how-to-cluster-time-series-data-using-k-means-algorithm — pavel, Feb 09 '16 at 08:57
scikit-learn has great implementations of [k-means](http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html#sklearn.cluster.KMeans) and other clustering algorithms — America, Feb 17 '16 at 17:02

score 0 · Answer 1 · answered Nov 02 '17 at 02:28

You can derive additional features from the raw data using methods such as RFM analysis (RFM = recency, frequency, monetary value).

For example:

How often did the user log in?

When did the user last log in?
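As a rough illustration of this idea, the sketch below builds per-user recency and frequency features from a clickstream log and clusters them with KMeans from scikit-learn. The `events` table, its column names (`user_id`, `event`, `timestamp`), and the choice of two clusters are all assumptions made for the example. Clickstream data usually has no monetary value, so the number of distinct active days is used here as a stand-in third feature.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical clickstream log: one row per event (column names are assumptions).
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 3, 3, 3, 4],
    "event": ["login", "view", "cart", "login", "view",
              "login", "view", "cart", "payment_done", "login"],
    "timestamp": pd.to_datetime([
        "2016-02-01 10:00", "2016-02-01 10:05", "2016-02-08 09:00",
        "2016-01-03 12:00", "2016-01-03 12:10",
        "2016-02-07 08:00", "2016-02-07 08:02", "2016-02-08 08:05",
        "2016-02-08 08:10", "2015-12-20 18:00",
    ]),
})

# Reference point for recency: the latest event in the log.
now = events["timestamp"].max()

# One row per user: last event time, event count, and distinct active days.
features = events.groupby("user_id").agg(
    last_seen=("timestamp", "max"),
    frequency=("event", "size"),
    active_days=("timestamp", lambda s: s.dt.normalize().nunique()),
)
# Recency in days since the user's last event.
features["recency_days"] = (now - features["last_seen"]).dt.total_seconds() / 86400
features = features.drop(columns="last_seen")

# Scale the features so that no single one dominates the Euclidean distance.
X = StandardScaler().fit_transform(features)

# n_clusters=2 is an arbitrary choice for this toy data set.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
features["cluster"] = kmeans.fit_predict(X)
print(features)
```

On real data you would pick `n_clusters` by inspecting something like the inertia curve or silhouette scores rather than fixing it up front.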

answered by kingbase

score 0 · Answer 2 · answered Oct 10 '20 at 20:39

You can use the Python library Retentioneering (available on GitHub), which lets you cluster your users based on clickstream data with a single command. You can also specify the target events you are interested in for your clusters and explore the results with interactive graphs.

```python
data.rete.get_clusters(method='kmeans',
                       feature_type='tfidf',
                       n_clusters=8,
                       ngram_range=(1, 2),
                       plot_type='cluster_bar',
                       targets=['payment_done', 'cart'])
```

[Figure: results of user clustering]
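To give a feel for what a `feature_type='tfidf'` clustering with `ngram_range=(1, 2)` could involve, here is a minimal sketch using only pandas and scikit-learn. It is not Retentioneering's implementation, just one plausible reading of those parameters: each user's ordered events are joined into a "document", TF-IDF weights are computed over single events and consecutive event pairs, and KMeans is run on the result. The `events` table, its column names, and the event names other than `payment_done`, `cart` and `lost` are assumptions.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical clickstream log (column names and most event names are assumptions).
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4],
    "event": ["main", "catalog", "lost",
              "main", "catalog", "catalog", "lost",
              "main", "cart", "payment_done",
              "main", "cart", "cart", "payment_done"],
    "timestamp": pd.date_range("2020-10-01", periods=14, freq="min"),
})

# Turn each user's time-ordered events into one space-separated string.
sequences = (
    events.sort_values(["user_id", "timestamp"])
          .groupby("user_id")["event"]
          .apply(" ".join)
)

# Unigrams are single events, bigrams are consecutive event pairs.
# token_pattern keeps whole event names such as "payment_done" as one token.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), token_pattern=r"\S+")
X = vectorizer.fit_transform(sequences)

# n_clusters=2 suits this toy data; the answer's call uses 8 on real data.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = pd.Series(kmeans.fit_predict(X), index=sequences.index, name="cluster")
print(labels)
```

Including bigrams means the clusters reflect the order of events to some extent, not just how often each event occurs.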

Next, you can explore the resulting behavioral clusters with an interactive graph:

```python
clus_0 = data.rete.filter_cluster(0)
clus_0.rete.plot_graph(thresh=0.1,
                       weight_col='user_id',
                       targets={'lost': 'red',
                                'payment_done': 'green'})
```

[Figure: graph visualization example]
