Clustering products into Product Families with Python

Question

I have a dataframe that contains product IDs and sensors from different stations and production lines, with binary values (1: the product passes through the sensor; 0: there is no relation between the product and the sensor). Here is a part of the dataframe:

I want to use a clustering method that groups the products into product families according to the process (the sensors).

Thank you for your help

Welcome to StackOverflow. Please include a small sample of your dataframe along with your desired results. Take a look at [how-to-make-good-reproducible-pandas-examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). — Shubham Sharma, Jun 30 '20 at 11:17

score 0 · Answer 1 · answered Jun 30 '20 at 11:25

Since you do not have labels, you need an unsupervised method, i.e. clustering.

One option is k-means (KMeans in scikit-learn). Here is an example:

import numpy as np
from sklearn.cluster import KMeans

np.random.seed(0)

# build fake data with only 0/1 values in the features:
# start from all ones, then set randomly chosen rows to all zeros
X = np.ones((100, 10))
random_indices_rows = np.random.randint(1, 100, 50)
X[random_indices_rows] = 0

print(X.shape)
# (100, 10) -> 100 samples and 10 variables/sensors

# the clustering model (the number of clusters is chosen in advance)
kmeans = KMeans(n_clusters=2, random_state=0).fit(X)

# cluster label assigned to each sample
print(repr(kmeans.labels_))

#array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1,
#       1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0,
#       0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1,
#       1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0,
#       1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)
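
To connect the fake-data example above to the dataframe described in the question, here is a minimal sketch of the same KMeans call on a pandas DataFrame. The column names (`product_id`, `sensor_a`, ...) and the values are hypothetical, since the original dataframe is not shown. The idea is to cluster on the sensor columns only and then attach the labels back to the product IDs.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical sample: one row per product, one 0/1 column per sensor.
df = pd.DataFrame(
    {
        "product_id": ["P1", "P2", "P3", "P4", "P5", "P6"],
        "sensor_a": [1, 1, 1, 0, 0, 0],
        "sensor_b": [1, 1, 1, 0, 0, 0],
        "sensor_c": [0, 0, 0, 1, 1, 1],
        "sensor_d": [0, 0, 1, 1, 1, 1],
    }
)

# Use only the sensor columns as features; the ID is not a feature.
features = df.drop(columns="product_id")

# n_clusters=2 is an assumption made for this toy sample.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
df["family"] = kmeans.labels_

# List the products that ended up in each family.
families = df.groupby("family")["product_id"].apply(list)
print(families)
```

The cluster numbers themselves are arbitrary; only the grouping of products matters.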

Thank you for your answer. As I see here, I need to know exactly how many clusters I want. Is there a way to cluster without a predefined number of clusters? — Yosr Cheikh, Jun 30 '20 at 13:27
With KMeans you need to pre-define the number of clusters. The elbow method is a common way to estimate a good number of clusters. — seralouk, Jun 30 '20 at 14:04
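
The elbow method mentioned in the comment above can be sketched roughly as follows: fit KMeans for several values of k, record the inertia (within-cluster sum of squared distances), and look for the k after which the inertia stops dropping sharply. The data below is hypothetical (three made-up sensor patterns with a few random bit flips), and the range of k values is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical data: three sensor patterns ("families"), 30 products each,
# with a few random bit flips so products in a family are not identical.
patterns = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1, 1],
])
X = np.repeat(patterns, 30, axis=0)
flips = rng.random(X.shape) < 0.05
X = np.where(flips, 1 - X, X)

# Fit KMeans for a range of k and record the inertia for each fit.
inertias = {}
for k in range(1, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

for k, inertia in inertias.items():
    print(f"k={k}: inertia={inertia:.1f}")
```

Since this data was built from three patterns, the inertia should fall steeply up to k=3 and only slowly afterwards. On real data the elbow is often less clear, so it is usually read from a plot and treated as a heuristic rather than a definitive answer.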
