
I have a question about k-means clustering in Python.

I ran the analysis like this:

from sklearn.cluster import KMeans

km = KMeans(n_clusters=12, random_state=1)
new = data._get_numeric_data().dropna(axis=1)
km.fit(new)
predict = km.predict(new)

How can I add the cluster results to my original dataframe "data" as an additional column? Thanks!

dkhara
Keithx
    So you are essentially asking how to add a column to a dataframe? Such as in: http://stackoverflow.com/questions/12555323/adding-new-column-to-existing-dataframe-in-python-pandas or here: http://stackoverflow.com/questions/18942506/add-new-column-in-pandas-dataframe-python – Nikolas Rieble Jul 14 '16 at 10:52

1 Answer


Assuming predict has the same length as the number of rows in your dataframe df, all you need to do is this:

df['NEW_COLUMN'] = pd.Series(predict, index=df.index)
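For context, here is a minimal end-to-end sketch of the same idea, applied to the workflow from the question. The toy dataframe, the column names ("x", "y", "label", "cluster") and n_clusters=3 are assumptions made only for illustration; select_dtypes is used here as a public stand-in for the private _get_numeric_data call.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical toy data standing in for the asker's "data" dataframe.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "x": rng.normal(size=30),
    "y": rng.normal(size=30),
    "label": ["a", "b", "c"] * 10,  # non-numeric column, excluded from clustering
})

# Keep numeric columns only and drop any column containing NaN.
# axis=1 drops columns, so the row index is left unchanged.
new = data.select_dtypes(include="number").dropna(axis=1)

# Fit the model and get one cluster label per row of `new`.
km = KMeans(n_clusters=3, random_state=1, n_init=10)
predict = km.fit_predict(new)

# Attach the labels, aligned on the index of the rows that were clustered.
data["cluster"] = pd.Series(predict, index=new.index)
```

Using new.index rather than data.index should give the same result here, because no rows were dropped. If rows had been dropped before fitting (for example with dropna() along axis=0), aligning on new.index would leave NaN in "cluster" for the rows that were not clustered instead of raising a length mismatch.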
desertnaut
Gal Dreiman