I am a bit confused about clustering, e.g. K-means clustering. I have already created clusters from the training data, and in the testing part I want to know whether each new point belongs to one of the existing clusters or not. My idea is to find the center of each cluster and the training point farthest from that center; then, in the testing part, if the distance from a new point to the center is greater than a threshold (e.g. 1.5x the distance of the farthest training point), the new point cannot be in that cluster.
Is this idea efficient and correct, and is there a Python function that does this?
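To make the idea concrete, here is a minimal sketch of what I have in mind. It assumes scikit-learn's KMeans and uses made-up two-dimensional data; the helper name `assign_or_reject`, the factor 1.5, and the use of -1 for "not in any cluster" are only illustrative choices of mine, not something the library provides.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up training data: two well-separated blobs (illustrative only)
rng = np.random.default_rng(0)
X_train = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_train)

# Distance from each training point to the center of its own cluster
train_dist = np.linalg.norm(
    X_train - kmeans.cluster_centers_[kmeans.labels_], axis=1
)

# Per-cluster "radius": distance of the farthest training point from the center
radius = np.array([
    train_dist[kmeans.labels_ == k].max() for k in range(kmeans.n_clusters)
])

def assign_or_reject(X_new, factor=1.5):
    """Hypothetical helper: nearest cluster label, or -1 if the point is too far."""
    X_new = np.asarray(X_new, dtype=float)
    nearest = kmeans.predict(X_new)  # index of the closest learned center
    dist = np.linalg.norm(X_new - kmeans.cluster_centers_[nearest], axis=1)
    return np.where(dist <= factor * radius[nearest], nearest, -1)

# Two points near the blobs and one far away from both
X_test = np.array([[0.1, -0.2], [5.2, 4.9], [20.0, 20.0]])
result = assign_or_reject(X_test)
print(result)
```

Here the far-away point should come back as -1, while the other two get the label of their nearest cluster. I am not sure whether a single farthest point is a good basis for the radius, since one outlier in the training data would inflate it.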
One more question: could someone help me understand the difference between kmeans.fit() and kmeans.predict()? I get the same result from both functions.
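For illustration, this is a minimal sketch of the kind of comparison I mean, assuming scikit-learn's KMeans and made-up data. As far as I understand, fit() learns the cluster centers and returns the estimator itself (the training labels end up in `labels_`), while predict() assigns points to the nearest center that was already learned, so calling predict() on the training data would naturally give the same labels.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up training data: two well-separated blobs (illustrative only)
rng = np.random.default_rng(0)
X_train = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)

# fit() learns the centers from X_train and returns the estimator object
fitted = kmeans.fit(X_train)
labels_from_fit = kmeans.labels_  # labels of the training points

# predict() only assigns points to the nearest already-learned center
labels_from_predict = kmeans.predict(X_train)

# Compare the two label arrays on the same training data
same = np.array_equal(labels_from_fit, labels_from_predict)
print(fitted is kmeans, same)
```

If that understanding is right, the two would only differ when predict() is given points that were not used in fit().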
I appreciate any help.