How can I convert text/string data in a DataFrame column to numbers? If the same string appears again, it should map to the same number. I'm looking for a general approach, since there are thousands of fruits in the world. Example:
    Fruit         Number (expected outcome)
1   Apple         1
2   Orange        2
3   Apple         1
4   Banana        3
5   Blackberries  4
6   Avocado       5
7   Grapes        6
8   Orange        2
9   Apple         1
10  Mango         7
... ...           ...

sirimiri
  • Hi Sirimiri - See this [link](https://stackoverflow.com/questions/57884039/oracle-sql-convert-string-to-number-with-exceptions-to-treat-text-as-0). It should answer your question. – wolfblitza Dec 08 '19 at 03:38
  • Use `df['Number'] = pd.factorize(df.Fruit)[0] + 1` – jezrael Dec 08 '19 at 08:26

1 Answer

import pandas as pd

fruit_data = {"name": ["Apple", "Orange", "Apple", "Banana", "Blackberries", "Avocado", "Grapes", "Orange", "Apple", "Mango"]}
df = pd.DataFrame(fruit_data)

# Get the distinct fruit names, in order of first appearance
unique_names = df["name"].unique()
# Build a name -> Id dictionary, numbering from 1
# (named name_to_id so the built-in dict is not shadowed)
name_to_id = {name: index for index, name in enumerate(unique_names, start=1)}
# Assign the new 'Id' column by mapping each name through the dictionary
df["Id"] = df["name"].map(name_to_id)
print(df)

The output is:

           name  Id
0         Apple   1
1        Orange   2
2         Apple   1
3        Banana   3
4  Blackberries   4
5       Avocado   5
6        Grapes   6
7        Orange   2
8         Apple   1
9         Mango   7
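
As jezrael's comment under the question suggests, `pd.factorize` should give the same numbering in a single call. Here is a minimal sketch, assuming the column is called `name` as in the code above (the question itself calls it `Fruit`):

```python
import pandas as pd

df = pd.DataFrame({"name": ["Apple", "Orange", "Apple", "Banana", "Blackberries",
                            "Avocado", "Grapes", "Orange", "Apple", "Mango"]})

# factorize returns (codes, uniques); codes start at 0 and follow
# the order in which each distinct value first appears
codes, uniques = pd.factorize(df["name"])

# Shift by 1 so the numbering starts at 1, as in the question
df["Id"] = codes + 1

print(df)
```

One thing to watch for: with the default settings, missing values (NaN) get the code -1, so they would end up as 0 after the shift.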
Mehrdad Dowlatabadi