I have a dataset with three columns and thousands of rows as shown below.

There are 4 classes (clusters), given in the third column (R, I, C, F).

row id     VALUE    CLASS
   1        284         R
   2        254         I
   3        184         C
   4        177         F
  ...
  • I am trying to draw a cluster plot of the above data based on the 4 classes. The expected output is shown in the picture below.

[image: expected cluster plot]

What I tried: Scatter plot in seaborn

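from pandas import read_csv
import seaborn as sns

# Load the dataset (columns: row id, VALUE, CLASS)
df2 = read_csv(r'C:\Users\jo\Downloads\Clusters.csv')

# Scatter plot of VALUE against the row id, coloured by CLASS
sns.scatterplot(data=df2, x="VALUE", y="rowid", hue="CLASS")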
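
For anyone who wants to reproduce this without the CSV, here is a rough, self-contained sketch of the same kind of plot. It uses only the four sample rows from the table above, and it swaps in plain matplotlib for seaborn so that each class is drawn by an explicit loop. The column names `rowid`, `VALUE` and `CLASS` are assumed to match the CSV header, which may not be the case in the real file.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# The four sample rows from the table above; the real file has thousands.
# Column names are an assumption about the CSV header.
df = pd.DataFrame({
    "rowid": [1, 2, 3, 4],
    "VALUE": [284, 254, 184, 177],
    "CLASS": ["R", "I", "C", "F"],
})

fig, ax = plt.subplots()
# One scatter call per class gives each class its own colour and legend entry.
for name, group in df.groupby("CLASS"):
    ax.scatter(group["VALUE"], group["rowid"], label=name)

ax.set_xlabel("VALUE")
ax.set_ylabel("rowid")
ax.legend(title="CLASS")
```

Calling `fig.savefig(...)` with a file name of your choice writes the figure to disk. With only one measured column, the y axis here is just the row order, so the picture shows how VALUE is spread within each class rather than a true two-dimensional cluster plot.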

[image: scatter plot produced by the seaborn code above]

Case Msee

1 Answer

Well, I have to say that the clustering algorithm is almost certainly doing exactly what it is supposed to do. Clustering is unsupervised, of course, so there is no training/testing split and you don't know in advance what the outcome will be. You can feed in different features and see what the outcome is. Also, you don't really share any code, so it's impossible to say for sure what is going on here. I would suggest taking a look at the links below and doing some more Googling on this subject.

https://github.com/ASH-WICUS/Notebooks/blob/master/Clustering%20-%20Historical%20Stock%20Prices.ipynb

https://www.askpython.com/python/examples/plot-k-means-clusters-python

https://towardsdatascience.com/visualizing-clusters-with-pythons-matplolib-35ae03d87489
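
To make the "feed in different features and see what the outcome is" point concrete, here is a minimal sketch of k-means (Lloyd's algorithm) written in plain NumPy. The helper `kmeans`, the choice of `k=2`, and the evenly spaced starting centers are all assumptions made for illustration. The only data used are the four sample rows from the question, and the CLASS column is deliberately left out because clustering does not use labels.

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Tiny Lloyd's-algorithm k-means; X has shape (n_samples, n_features)."""
    X = np.asarray(X, dtype=float)
    # Deterministic start (an assumption, for reproducibility):
    # take k rows spread evenly through the data as initial centers.
    start = np.linspace(0, len(X) - 1, k).astype(int)
    centers = X[start].copy()
    for _ in range(n_iter):
        # Distance from every sample to every center, shape (n_samples, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its samples (keep it if empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Sample rows from the question as (row id, VALUE); no class labels involved.
rows = np.array([[1, 284], [2, 254], [3, 184], [4, 177]])
features = [0, 1]  # column indices fed to the algorithm; edit to experiment
labels, centers = kmeans(rows[:, features], k=2)
print(labels)
print(centers)
```

Editing the `features` list changes which columns the algorithm sees, which is a quick way to check how much the grouping depends on the input.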

ASH
  • The shared links still feed in input data, such as `prices_list` in the first link. But as per the given suggestion, clustering doesn't have any training/testing? My question was: can we plot cluster graphs based on `single dimension data`, i.e. a single column? We are sure that the plots can be achieved with `two dimension data`, i.e. 2 columns of data. – Case Msee May 15 '21 at 12:34
  • I don't think you can do a clustering experiment based on one column. – ASH May 15 '21 at 13:32