How can we use a for loop to match each student's color in df_students to the color in df_colors, and then fill in the corresponding fruit and fruit_id for each student in df_students?

import pandas as pd
df_colors = pd.DataFrame({'fruit_id':[101, 102, 103, 104, 105, 106, 107, 108, 109],
                       'fruit':['apple','banana','dragonfruit','kiwi','plum','lime', 'blackberry', 'blueberry', 'guava'],
                       'color':['red', 'yellow', 'magenta', 'brown', 'purple', 'green', 'black', 'blue', 'pink']})

df_students = pd.DataFrame({'student':['Jamie', 'Tao', 'Ingrid', 'Will', 'Boris','Xavier','Nancy', 'Judith', 'Lamar', 'Francis', 'Shawna', 'Carlos', 'Morgan'],
                        'color': ['black', 'red', 'magenta', 'yellow','black', 'magenta', 'brown', 'purple', 'magenta', 'green', 'blue', 'pink', 'pink']})


df_students['fruit'] = ''
df_students['fruit_id'] = ''
for eachstudent in df_students['color']:
    for acolor in df_colors['color']:
        if eachstudent == acolor:
            df_students['fruit'] = df_colors['fruit']
            df_students['fruit_id'] = df_colors['fruit_id']
df_students

This output is incorrect!

[Image: wrong result]

BrianBeing
  • Your data doesn't make sense. How is black matched with apple if apple has the color red in `df_colors`, and likewise for red and banana? – Erfan Mar 02 '19 at 18:47
  • @Erfan the code above has a wrong output. You're probably right, I could have indicated this. Thanks for the suggestion! – BrianBeing Mar 02 '19 at 19:13
  • @Guy_Fuqua I have updated with both solutions, one for the for loop (for merge) and the other one which is displayed in the question. – anky Mar 02 '19 at 19:55

3 Answers

import pandas as pd

df_colors = pd.DataFrame({'fruit_id':[101, 102, 103, 104, 105, 106, 107, 108, 109],
                   'fruit':['apple','banana','dragonfruit','kiwi','plum','lime', 'blackberry', 'blueberry', 'guava'],
                   'color':['red', 'yellow', 'magenta', 'brown', 'purple', 'green', 'black', 'blue', 'pink']})

df_students = pd.DataFrame({'student':['Jamie', 'Tao', 'Ingrid', 'Will', 'Boris','Xavier','Nancy', 'Judith', 'Lamar', 'Francis', 'Shawna', 'Carlos', 'Morgan'],
                    'color': ['black', 'red', 'magenta', 'yellow','black', 'magenta', 'brown', 'purple', 'magenta', 'green', 'blue', 'pink', 'pink']})
df_students['fruit'] = ''
df_students['fruit_id'] = ''

# For each color in df_colors, fill in fruit and fruit_id on every student row with that color.
for acolor1 in df_colors['color']:
    color_rows = df_colors['color'] == acolor1
    student_rows = df_students['color'] == acolor1
    df_students.loc[student_rows, 'fruit'] = df_colors.loc[color_rows, 'fruit'].iloc[0]
    df_students.loc[student_rows, 'fruit_id'] = df_colors.loc[color_rows, 'fruit_id'].iloc[0]
print(df_students)

How about this:

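# Iterate over index labels and write each match back with .at;
# assigning through .values is not guaranteed to update the DataFrame (e.g. under copy-on-write).
for num1, eachstudent in df_students['color'].items():
    for num2, acolor in df_colors['color'].items():
        if eachstudent == acolor:
            df_students.at[num1, 'fruit'] = df_colors.at[num2, 'fruit']
            df_students.at[num1, 'fruit_id'] = df_colors.at[num2, 'fruit_id']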
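
The nested loop above compares every student against every color. If a for loop is required, a single pass with a lookup dictionary may be simpler. This is only a sketch, assuming each color appears exactly once in df_colors; the names color_to_fruit, fruits and fruit_ids are illustrative and do not come from the question:

```python
import pandas as pd

df_colors = pd.DataFrame({'fruit_id': [101, 102, 103, 104, 105, 106, 107, 108, 109],
                          'fruit': ['apple', 'banana', 'dragonfruit', 'kiwi', 'plum', 'lime', 'blackberry', 'blueberry', 'guava'],
                          'color': ['red', 'yellow', 'magenta', 'brown', 'purple', 'green', 'black', 'blue', 'pink']})

df_students = pd.DataFrame({'student': ['Jamie', 'Tao', 'Ingrid', 'Will', 'Boris', 'Xavier', 'Nancy', 'Judith', 'Lamar', 'Francis', 'Shawna', 'Carlos', 'Morgan'],
                            'color': ['black', 'red', 'magenta', 'yellow', 'black', 'magenta', 'brown', 'purple', 'magenta', 'green', 'blue', 'pink', 'pink']})

# Build a color -> (fruit, fruit_id) lookup once (assumes colors are unique in df_colors).
color_to_fruit = {}
for color, fruit, fruit_id in zip(df_colors['color'], df_colors['fruit'], df_colors['fruit_id']):
    color_to_fruit[color] = (fruit, fruit_id)

# Loop over the students and collect the matching values; unmatched colors get None.
fruits, fruit_ids = [], []
for color in df_students['color']:
    fruit, fruit_id = color_to_fruit.get(color, (None, None))
    fruits.append(fruit)
    fruit_ids.append(fruit_id)

# Assign whole columns at the end instead of writing cell by cell.
df_students['fruit'] = fruits
df_students['fruit_id'] = fruit_ids
print(df_students)
```

Collecting the values in lists and assigning the columns once avoids modifying the DataFrame inside the loop.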
Lobsterguy

You simply want to perform a merge; you don't need a for loop for that.

Please have a look at Pandas Merging 101

The solution you are looking for:

df_students.merge(df_colors, on='color', how='left')


    student color   fruit_id    fruit
0   Jamie   black   107         blackberry
1   Tao     red     101         apple
2   Ingrid  magenta 103         dragonfruit
3   Will    yellow  102         banana
4   Boris   black   107         blackberry
5   Xavier  magenta 103         dragonfruit
6   Nancy   brown   104         kiwi
7   Judith  purple  105         plum
8   Lamar   magenta 103         dragonfruit
9   Francis green   106         lime
10  Shawna  blue    108         blueberry
11  Carlos  pink    109         guava
12  Morgan  pink    109         guava

Like I said, the expected output you gave is incorrect if you want to match on the color column in both dataframes.
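
To check that every student really matched on color, the same merge can be run with two optional arguments. A minimal sketch, assuming each color should appear only once in df_colors; the names merged and unmatched are illustrative:

```python
import pandas as pd

df_colors = pd.DataFrame({'fruit_id': [101, 102, 103, 104, 105, 106, 107, 108, 109],
                          'fruit': ['apple', 'banana', 'dragonfruit', 'kiwi', 'plum', 'lime', 'blackberry', 'blueberry', 'guava'],
                          'color': ['red', 'yellow', 'magenta', 'brown', 'purple', 'green', 'black', 'blue', 'pink']})

df_students = pd.DataFrame({'student': ['Jamie', 'Tao', 'Ingrid', 'Will', 'Boris', 'Xavier', 'Nancy', 'Judith', 'Lamar', 'Francis', 'Shawna', 'Carlos', 'Morgan'],
                            'color': ['black', 'red', 'magenta', 'yellow', 'black', 'magenta', 'brown', 'purple', 'magenta', 'green', 'blue', 'pink', 'pink']})

# validate='many_to_one' raises a MergeError if a color is duplicated in df_colors;
# indicator=True adds a '_merge' column recording where each row came from.
merged = df_students.merge(df_colors, on='color', how='left',
                           validate='many_to_one', indicator=True)

# Rows marked 'left_only' are students whose color has no entry in df_colors.
unmatched = merged[merged['_merge'] == 'left_only']
print(merged.drop(columns='_merge'))
print(len(unmatched), 'students without a matching color')
```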

Erfan
  • You're absolutely right. I had the merge solution, I was interested in the for loop solution specifically for a project. – BrianBeing Mar 02 '19 at 19:17