I have a hard time understanding how to fill a dataframe with specific values based on criteria without using for loops. Imagine the following example: I have a dataframe, df1, with the columns First Name, Last Name, Education Level and Score. I have another dataframe, df2, that contains some of the people from the first dataframe but has no Score column. For example:

df2

A                       B                       C 
Alex                   Student                  Due
George                 Teacher                  Due
Helen                  Student                  Overdue
Maria                  Teacher                  Overdue

What I want to do now is to change the 'Score' column of df1 based on the following criteria:

We look at df2 and

  • If the occupation in B is Student and the value in C is Due then make the Score in df1 to be 3
  • If the occupation in B is Teacher and the value in C is Due then make the Score in df1 to be 5
  • If the occupation in B is Student and the value in C is Overdue then make the Score in df1 to be 2
  • If the occupation in B is Teacher and the value in C is Overdue then make the Score in df1 to be 1
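
One way to express the four rules above without a loop is to put them in a small lookup table and merge it onto df2. This is only a sketch: the contents of df1 are not shown in the question, so the sample rows below (including the extra person "Nick"), the helper names `rules`, `scored` and `NewScore`, and the assumption that column A of df2 matches First Name in df1 and is unique are all illustrative.

```python
import pandas as pd

# Hypothetical df1: the question does not show its rows, so these are made up.
df1 = pd.DataFrame({
    "First Name": ["Alex", "George", "Helen", "Maria", "Nick"],
    "Last Name": ["A", "B", "C", "D", "E"],
    "Education Level": ["BSc", "MSc", "BSc", "PhD", "MSc"],
    "Score": [0, 0, 0, 0, 0],
})

# df2 as given in the question.
df2 = pd.DataFrame({
    "A": ["Alex", "George", "Helen", "Maria"],
    "B": ["Student", "Teacher", "Student", "Teacher"],
    "C": ["Due", "Due", "Overdue", "Overdue"],
})

# The four criteria written as a lookup table (names are illustrative).
rules = pd.DataFrame({
    "B": ["Student", "Teacher", "Student", "Teacher"],
    "C": ["Due", "Due", "Overdue", "Overdue"],
    "NewScore": [3, 5, 2, 1],
})

# Attach the score that each (B, C) combination implies to every row of df2.
scored = df2.merge(rules, on=["B", "C"], how="left")

# Look up each person in df1 by first name (assumes names in A are unique);
# people who are not in df2 get NaN and keep their existing Score.
new_scores = df1["First Name"].map(scored.set_index("A")["NewScore"])
df1["Score"] = new_scores.fillna(df1["Score"]).astype(int)
```

With the sample data, the four people listed in df2 receive the scores from the rules, and the person who is missing from df2 keeps the original Score.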

I know how to do it with for loops and if statements, since that is very simple to be honest, but I am a bit confused about how to take advantage of pandas and do it without for loops (since we want to avoid using for loops etc. in Python). My issue is that I know how to filter in pandas, but I do not know how to filter AND insert a value in a cell. For example, I can do the filtering part like this:

(df2['B'] == 'Student') & (df2['C'] == 'Due')

but I do not know how to change the values in df1 based on this filtering of df2.
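
For a single rule, a filter on df2 can be carried over to df1 by collecting the matching names and selecting the corresponding rows of df1 with `isin`, then assigning through `.loc`. This is a minimal sketch under assumptions: the rows of df1 are invented, and it assumes column A of df2 holds the same values as First Name in df1.

```python
import pandas as pd

# Hypothetical df1: the question does not show its rows, so these are made up.
df1 = pd.DataFrame({
    "First Name": ["Alex", "George", "Helen", "Maria", "Nick"],
    "Last Name": ["A", "B", "C", "D", "E"],
    "Education Level": ["BSc", "MSc", "BSc", "PhD", "MSc"],
    "Score": [0, 0, 0, 0, 0],
})

# df2 as given in the question.
df2 = pd.DataFrame({
    "A": ["Alex", "George", "Helen", "Maria"],
    "B": ["Student", "Teacher", "Student", "Teacher"],
    "C": ["Due", "Due", "Overdue", "Overdue"],
})

# Step 1: filter df2 and keep the names of the people who match the rule.
is_student_due = (df2["B"] == "Student") & (df2["C"] == "Due")
matching_names = df2.loc[is_student_due, "A"]

# Step 2: select the same people in df1 and write the new Score in place.
df1.loc[df1["First Name"].isin(matching_names), "Score"] = 3
```

Matching on names rather than on row position means the two dataframes do not need the same length or the same row order. Repeating the two steps for each rule works, but a lookup table with a merge scales better when there are many rules.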

Alex
  • I think you need a left join. – jezrael Sep 09 '22 at 09:23
  • You can definitely use a left join or merge or something that appears in SQL. But in your scenario, a simple loc with conditional statements is enough: `df1.loc[(df2["B"] == "Student") & (df2["C"] == "Due"), "Score"] = 3` – hide1nbush Sep 09 '22 at 09:33

0 Answers