I've been iterating through the strings in a pandas dataframe to look for a certain set of words, and that part works.
However, I realised that I don't just want to find exact words: I also want to group together words that bear the same meaning as my main keyword.
I came across the question "How to return key if a given string matches the keys value in a dictionary", which is exactly what I want to do, but I can't get it to work in a pandas dataframe.
Below is one of the solutions from that question:
my_dict = {"color": ("red", "blue", "green"), "someothercolor": ("orange", "blue", "white")}
solutions = []
my_color = 'blue'
for key, value in my_dict.items():
    if my_color in value:
        solutions.append(key)
Contents of solutions afterwards ('blue' appears in both tuples):
['color', 'someothercolor']
My data frame:
I have a data frame where I would like to search each value in df['Name'] for a matching word and then add the corresponding key to a new column. In this example that column is df['Colour'].
+---+----------+--------------------------+-----------------------------+----------+--------+
| | SKU | Name | Description | Category | Colour |
+---+----------+--------------------------+-----------------------------+----------+--------+
| 0 | 7E+10 | Red Lace Midi Dress | Red Lace Midi D... | Dresses | |
| 1 | 7E+10 | Long Armed Sweater Azure | Long Armed Sweater Azure... | Sweaters | |
| 2 | 2,01E+08 | High Top Ruby Sneakers | High Top Ruby Sneakers... | Shoes | |
| 3 | 4,87E+10 | Tight Indigo Jeans | Tight Indigo Jeans... | Denim | |
| 4 | 2,2E+09 | T-Shirt Navy | T-Shirt Navy... | T-Shirts | |
+---+----------+--------------------------+-----------------------------+----------+--------+
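To make the example reproducible, here is a minimal sketch that rebuilds part of the table above. Only the Name and Category columns are included; SKU and Description are left out because their values are truncated in the listing.

```python
import pandas as pd

# Name and Category are copied from the table above.
# SKU and Description are omitted because they are truncated in the listing.
df = pd.DataFrame({
    "Name": [
        "Red Lace Midi Dress",
        "Long Armed Sweater Azure",
        "High Top Ruby Sneakers",
        "Tight Indigo Jeans",
        "T-Shirt Navy",
    ],
    "Category": ["Dresses", "Sweaters", "Shoes", "Denim", "T-Shirts"],
})
```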
Expected result:
+---+----------+--------------------------+-----------------------------+----------+--------+
| | SKU | Name | Description | Category | Colour |
+---+----------+--------------------------+-----------------------------+----------+--------+
| 0 | 7E+10 | Red Lace Midi Dress | Red Lace Midi D... | Dresses | red |
| 1 | 7E+10 | Long Armed Sweater Azure | Long Armed Sweater Azure... | Sweaters | blue |
| 2 | 2,01E+08 | High Top Ruby Sneakers | High Top Ruby Sneakers... | Shoes | red |
| 3 | 4,87E+10 | Tight Indigo Jeans | Tight Indigo Jeans... | Denim | blue |
| 4 | 2,2E+09 | T-Shirt Navy | T-Shirt Navy... | T-Shirts | blue |
+---+----------+--------------------------+-----------------------------+----------+--------+
My code:
colour = {'red': ('red', 'rose', 'ruby'), 'blue': ('azure', 'indigo', 'navy')}

def fetchColours(x):
    for key, value in colour.items():
        if value in x:
            return key
        else:
            return np.nan

df['Colour'] = df['Name'].apply(fetchColours)
I get the following error:
TypeError: 'in <string>' requires string as left operand, not tuple
So I can't use a tuple as the left operand of in when the right operand is a string. How should I approach this?
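For reference, here is a minimal sketch of one direction that might work: compare the individual words of each name against the tuples, rather than testing the whole tuple against the string. The helper name fetch_colour, the whole-word matching and the lower-casing are my own assumptions, not something taken from the linked question. The sketch also moves the NaN return out of the loop, since returning inside the else branch would stop after the first key.

```python
import numpy as np
import pandas as pd

colour = {'red': ('red', 'rose', 'ruby'), 'blue': ('azure', 'indigo', 'navy')}

def fetch_colour(name):
    # Hypothetical helper: split the name into lower-case words and return
    # the first key whose tuple shares at least one word with the name.
    words = set(name.lower().split())
    for key, values in colour.items():
        if words.intersection(values):
            return key
    # Give up only after every key has been checked.
    return np.nan

df = pd.DataFrame({'Name': [
    'Red Lace Midi Dress',
    'Long Armed Sweater Azure',
    'High Top Ruby Sneakers',
    'Tight Indigo Jeans',
    'T-Shirt Navy',
]})
df['Colour'] = df['Name'].apply(fetch_colour)
print(df['Colour'].tolist())  # ['red', 'blue', 'red', 'blue', 'blue']
```

Because this matches whole words only, a name like "Rosewood Table" would not be tagged as red. Substring matching would need a different test, for example any(v in name.lower() for v in values).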