
I have a dataframe with a series 'shape' whose unique values are ['cylinder', 'circle', 'light', 'cigar', 'diamond', 'oval', ...]. I want to turn these shapes into numbers so I can use them to make a scatterplot, for example.

Is there a way to make another series where each unique shape has its own 'id' as an int?

Edit: Managed to get it working with pandas factorize
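
For anyone landing here, a minimal sketch of the factorize approach. The sample data and the 'shape_id' column name are made up for illustration and are not from the original dataframe:

```python
import pandas as pd

# Hypothetical sample data; the real dataframe has more rows and shapes.
df = pd.DataFrame(
    {"shape": ["cylinder", "circle", "light", "circle", "cigar", "cylinder"]}
)

# pd.factorize returns (codes, uniques): codes are integer ids assigned
# in order of first appearance, uniques holds the distinct values.
codes, uniques = pd.factorize(df["shape"])
df["shape_id"] = codes  # 'shape_id' is an illustrative column name

print(df)
print(list(uniques))
```

Because ids follow order of first appearance, uniques[i] gives back the shape name for id i.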


mok_1

1 Answer


Try sklearn's LabelEncoder to convert your categorical column to numerical values; then you can plot it:

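import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({'shape': ['cylinder', 'circle', 'light', 'cigar', 'diamond', 'oval']})
le = LabelEncoder()
# Pass the 1-D column (not the whole DataFrame) and keep the result;
# ids follow the sorted order of the unique shapes
df['shape_id'] = le.fit_transform(df['shape'])
# Plot the integer ids rather than the original strings
plt.scatter(df.index, df['shape_id'])
plt.show()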
function