Assign cluster membership to new data using kmodes

Question

Looking at this code from here:

import numpy as np
from kmodes.kmodes import KModes

# random categorical data
data = np.random.choice(20, (100, 10))

km = KModes(n_clusters=4, init='Huang', n_init=5, verbose=1)
clusters = km.fit_predict(data)

# Print the cluster centroids
print(km.cluster_centroids_)

Does anyone know how to save the fitted clustering model and apply it to new data, in other words, how to assign clusters to previously unseen data? Thanks.

score 2 · Accepted Answer · answered Feb 10 '22 at 16:04

You can use pickle for this task: save the fitted model to disk, then load it later and call predict on the new data.

import pickle

# Save the fitted model to disk
with open('cluster_model.pickle', 'wb') as f:
    pickle.dump(km, f)

When you want to use it on new data, load the model back:

with open('cluster_model.pickle', 'rb') as f:
    km = pickle.load(f)

# If your new data is called "new_data" (same columns and encoding as the training data), you can:
new_clusters = km.predict(new_data)
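
For intuition about what predict does with new rows: with the default matching dissimilarity, each row is assigned to the centroid (mode) it disagrees with in the fewest columns. Below is a minimal pure-Python sketch of that idea, not the library's actual implementation. The helper names `matching_dissim` and `assign_to_nearest_mode`, and the sample centroids and rows, are made up for illustration.

```python
def matching_dissim(row, centroid):
    """Count the positions where two categorical rows differ."""
    return sum(a != b for a, b in zip(row, centroid))


def assign_to_nearest_mode(rows, centroids):
    """Return, for each row, the index of the centroid with the fewest mismatches."""
    labels = []
    for row in rows:
        distances = [matching_dissim(row, c) for c in centroids]
        # Ties go to the lowest centroid index in this sketch
        labels.append(distances.index(min(distances)))
    return labels


# Hypothetical centroids, standing in for km.cluster_centroids_
centroids = [[1, 1, 1], [5, 5, 5]]

# Hypothetical unseen rows with the same three columns
new_data = [[1, 1, 5], [5, 0, 5], [5, 5, 5]]

labels = assign_to_nearest_mode(new_data, centroids)
print(labels)
```

Because the assignment only compares category values column by column, the new data needs the same columns, in the same order and with the same encoding, as the data the model was fitted on.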

answered Feb 10 '22 at 16:04

artemis


Thanks. Yes, I knew about pickle, so the predict method would work. Do you reckon this also works for KPrototypes? I will try soon ... – cs0815 Feb 10 '22 at 16:15
If this solves your problem, don't forget to mark as correct to help others in the future @cs0815 – artemis Feb 14 '22 at 20:59
