I have sentences and a corresponding averaged vector (a single float) for each sentence. The dataset looks like this:
['there are two injuries one is previous'] -0.003632369
['I have motion with mucus from morning'] -0.000631669
['she will be fine with the meds?'] 0.010474829
['can you please suggest some good diet'] 0.008024994
I have around 100K rows. I want to cluster similar sentences together. Any idea how this can be done? I tried different clustering algorithms from sklearn, but I keep getting similar errors, like "ValueError: Expected n_neighbors <= n_samples, but n_samples = 1, n_neighbors = 2".
Adding code sample:
import pandas as pd
from sklearn.cluster import OPTICS

dataset = pd.read_csv(r'Documents/clusters.csv')
X = dataset.iloc[:, 1]           # the averaged-vector column
X = X.values.reshape(1, -1)      # reshapes to a single row (1, n_rows)
db = OPTICS(eps=3, min_samples=0).fit(X)  # raises the ValueError above
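For context, here is a minimal sketch on a small synthetic array (stand-in values, not my real CSV) where reshaping column-wise with `reshape(-1, 1)` instead of `reshape(1, -1)` does run without that error, since each value then counts as its own sample:

```python
import numpy as np
from sklearn.cluster import OPTICS

# Synthetic stand-in for the averaged-vector column of the CSV
values = np.array([-0.0036, -0.0006, 0.0105, 0.0080, 0.0102, -0.0034])

# (-1, 1) -> one sample per row, shape (n_samples, 1);
# (1, -1) would collapse everything into a single sample, shape (1, n)
X = values.reshape(-1, 1)

# min_samples must be at least 2 for OPTICS
labels = OPTICS(min_samples=2).fit_predict(X)
print(X.shape, labels.shape)
```

I'm not sure whether this column-wise reshape is the right fix for the full 100K-row case, or whether a different algorithm would suit one-dimensional data better.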