I have the following main DataFrame, food_df:

Fruit
Apple
Tomato
Cranberry
Orange
Papaya
Peach
Pear
Avocado
Kiwi

I have also previously defined some auxiliary DataFrames: Red_df, Orange_df and Green_df.

I need to create a function that returns the correct auxiliary DataFrame according to the "Food" name passed in.

For instance, if the Food name is "Apple" OR "Tomato" OR "Cranberry" then I need to get back the Red_df. If the Food name is "Orange" OR "Papaya" OR "Peach", then I need to get back the Orange_df. If the Food name is "Pear" OR "Avocado" OR "Kiwi", then I need to get back the Green_df, and so on.

The following is the code that I've written:

import pandas as pd
data={'Fruit':['Apple', 'Tomato' ,'Cranberry', 'Orange', 'Papaya', 'Peach', 'Pear', 'Avocado', 'Kiwi']}
food_df=pd.DataFrame(data)

data2={'Color':['Red']}
Red_df=pd.DataFrame(data2)

data3={'Color':['Orange']}
Orange_df=pd.DataFrame(data3)

data4={'Color':['Green']}
Green_df=pd.DataFrame(data4)

del data, data2, data3, data4

def function(food):
    print(Color),
    
    if[(food_df["Fruit"]=='Apple') | (food_df["Fruit"]=='Tomato') | (food_df["Fruit"]=='Cranberry')]:
        Color=Red_df
        
    if[(food_df["Fruit"]=='Orange') | (food_df["Fruit"]=='Papaya') | (food_df["Fruit"]=='Peach')]:
        Color=Orange_df
        
    if[(food_df["Fruit"]=='Pear') | (food_df["Fruit"]=='Avocado') | (food_df["Fruit"]=='Kiwi')]:
        Color=Green_df
     
for y in range(0,len(food_df["Fruit"])): 
    food=food_df.loc[y, 'Fruit']
    function(food)

How can I fix the function to get it working?

I can't figure out what the problem is.

petezurich
Aragorn64

1 Answer

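To add a new column, instead of comparing strings one by one, create a dictionary of lists, flatten it so that each fruit maps to its color, and map it onto a new column (here df is the food_df from the question):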

d = {'Red': ['Apple','Tomato','Cranberry'], 
     'Orange': ['Orange','Papaya','Peach'],
     'Green': ['Pear','Avocado','Kiwi']}

d1 = {x: k for k, v in d.items() for x in v}

print (d1)

{'Apple': 'Red', 'Tomato': 'Red', 'Cranberry': 'Red',
 'Orange': 'Orange', 'Papaya': 'Orange', 'Peach': 'Orange',
 'Pear': 'Green', 'Avocado': 'Green', 'Kiwi': 'Green'}

df['Color'] = df['Fruit'].map(d1)
print (df)
       Fruit   Color
0      Apple     Red
1     Tomato     Red
2  Cranberry     Red
3     Orange  Orange
4     Papaya  Orange
5      Peach  Orange
6       Pear   Green
7    Avocado   Green
8       Kiwi   Green
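
The same flattened dictionary can also drive a function that returns one of the question's auxiliary DataFrames for a single food name. The following is only a minimal sketch: it assumes Red_df, Orange_df and Green_df are defined as in the question, and the names color_frames and get_color_df are hypothetical, introduced here for illustration. Note that in the original function each `if [...]:` tests a non-empty list, which is always true, Color is printed before it is assigned, and nothing is returned.

```python
import pandas as pd

# Auxiliary DataFrames, defined as in the question.
Red_df = pd.DataFrame({'Color': ['Red']})
Orange_df = pd.DataFrame({'Color': ['Orange']})
Green_df = pd.DataFrame({'Color': ['Green']})

d = {'Red': ['Apple', 'Tomato', 'Cranberry'],
     'Orange': ['Orange', 'Papaya', 'Peach'],
     'Green': ['Pear', 'Avocado', 'Kiwi']}

# Flatten to {food name: color name}.
d1 = {x: k for k, v in d.items() for x in v}

# Hypothetical lookup table: color name -> auxiliary DataFrame.
color_frames = {'Red': Red_df, 'Orange': Orange_df, 'Green': Green_df}

def get_color_df(food):
    """Return the auxiliary DataFrame for a food name, or None if unknown."""
    color = d1.get(food)            # None when the food is not in the mapping
    return color_frames.get(color)  # None when the color is None

print(get_color_df('Tomato'))
```

Returning None for an unknown name is a design choice for this sketch; using `d1[food]` instead would raise a KeyError, which may be preferable if unknown names should be treated as errors.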

Creating variables from strings is not recommended; it is better to use a dictionary of DataFrames:

fin_dict = dict(tuple(df.groupby('Color')))
print (fin_dict)
{'Green':      Fruit  Color
6     Pear  Green
7  Avocado  Green
8     Kiwi  Green, 'Orange':     Fruit   Color
3  Orange  Orange
4  Papaya  Orange
5   Peach  Orange, 'Red':        Fruit Color
0      Apple   Red
1     Tomato   Red
2  Cranberry   Red}

print (fin_dict['Red'])
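
The flattened dictionary and the dictionary of groups can be combined to go from a fruit name straight to its group. This is a sketch under the assumption that df is built as above; group_for is a hypothetical name, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'Fruit': ['Apple', 'Tomato', 'Cranberry',
                             'Orange', 'Papaya', 'Peach',
                             'Pear', 'Avocado', 'Kiwi']})

d = {'Red': ['Apple', 'Tomato', 'Cranberry'],
     'Orange': ['Orange', 'Papaya', 'Peach'],
     'Green': ['Pear', 'Avocado', 'Kiwi']}
d1 = {x: k for k, v in d.items() for x in v}

df['Color'] = df['Fruit'].map(d1)

# One DataFrame per color, keyed by the color name.
fin_dict = dict(tuple(df.groupby('Color')))

def group_for(fruit):
    """Return the group DataFrame that contains the given fruit."""
    # Raises KeyError if the fruit is not in the mapping.
    return fin_dict[d1[fruit]]

print(group_for('Papaya'))
```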

It is nevertheless possible to create a separate DataFrame variable for each group:

for i, g in df.groupby('Color'):
    globals()[str(i) + '_df'] =  g

print (Red_df)
jezrael