How can I convert text/string data to numbers for a column in a DataFrame?
If the same text/string appears again, it should map to the same number.
I'm looking for a general way to do the conversion, since there are thousands of fruits in the world.
Example:
      Fruit          Number (expected outcome)
1     Apple          1
2     Orange         2
3     Apple          1
4     Banana         3
5     Blackberries   4
6     Avocado        5
7     Grapes         6
8     Orange         2
9     Apple          1
10    Mango          7
...   ...            ...

sirimiri
- Hi Sirimiri - See this [link](https://stackoverflow.com/questions/57884039/oracle-sql-convert-string-to-number-with-exceptions-to-treat-text-as-0). It should answer your question. – wolfblitza Dec 08 '19 at 03:38
- Use `df['Number'] = pd.factorize(df.Fruit)[0] + 1` – jezrael Dec 08 '19 at 08:26
1 Answer
import pandas as pd
fruitList = {'name': ["Apple", "Orange", "Apple", "Banana", "Blackberries", "Avocado", "Grapes", "Orange", "Apple", "Mango"]}
df = pd.DataFrame(fruitList)
# get the distinct fruit names, in order of first appearance
unique_names = df["name"].unique()
# build a dictionary mapping each unique fruit name to an Id (starting at 1) with a dict comprehension
# (named fruit_ids rather than dict, to avoid shadowing the built-in dict type)
fruit_ids = {x: index + 1 for index, x in enumerate(unique_names)}
# assign the new column 'Id' by looking up each name in the dictionary with map
df['Id'] = df["name"].map(fruit_ids)
print(df)
The output is:
           name  Id
0         Apple   1
1        Orange   2
2         Apple   1
3        Banana   3
4  Blackberries   4
5       Avocado   5
6        Grapes   6
7        Orange   2
8         Apple   1
9         Mango   7
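
As jezrael's comment on the question suggests, the same result can probably be had in one step with `pd.factorize`, which numbers values in order of first appearance starting from 0. Here is a minimal sketch, assuming the column is called `name` as in this answer (the question calls it `Fruit`) and that numbering should start at 1:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Apple", "Orange", "Apple", "Banana", "Blackberries",
             "Avocado", "Grapes", "Orange", "Apple", "Mango"]
})

# factorize returns a pair (codes, uniques):
# codes are integers starting at 0, assigned in order of first appearance
codes, uniques = pd.factorize(df["name"])

# shift by 1 so the numbering starts at 1, as in the question
df["Id"] = codes + 1

print(df)
```

Note that with the default settings, missing values get the code -1, so they would end up with Id 0 after the shift.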

Mehrdad Dowlatabadi