1

I have a replacing problem in python. I'm trying to replace all the integer value in the column ORIGIN_AIRPORT from df_2 with the IATA column from df_1. The key-columns are ORIGIN_AIRPORT from df_2 and ID from df_1.

df_2 has about 5.800.000 datasets and df_1 about 6.400.

I think there is an easy way to do it, but I don't know how.

df_1:

    ID      IATA
0   10001   01A
1   10003   03A
2   10004   04A
3   10005   05A
4   10006   06A
.
.
.

df_2:

        YEAR    MONTH   DAY DAY_OF_WEEK AIRLINE FLIGHT_NUMBER   TAIL_NUMBER ORIGIN_AIRPORT  DESTINATION_AIRPORT 
4385712 2015    10      1   4           AA      1230            N3DBAA      SEA             ANC 
4385713 2015    10      1   4           DL      1805            N696DL      10001           13487   
4385714 2015    10      1   4           NK      612             N602NK      10006           13487   
.
.
.

At the end the df shoul look like df_3.

df_3:

        YEAR    MONTH   DAY DAY_OF_WEEK AIRLINE FLIGHT_NUMBER   TAIL_NUMBER ORIGIN_AIRPORT  DESTINATION_AIRPORT 
4385712 2015    10      1   4           AA      1230            N3DBAA      SEA             ANC 
4385713 2015    10      1   4           DL      1805            N696DL      01A             13487   
4385714 2015    10      1   4           NK      612             N602NK      06A             13487   
.
.
.

2 Answers2

2

I am going to write the simplest solution for this :

Step 1: Convert both columns from df_1 into a dictionary by using the following code:

d = dict(zip(df_1.ID,df_1.IATA))

Step 2: Now we just need to map this dictionary and df_2:

df_2.ORIGIN_AIRPORT= df_1.ID.map(d)
Anshul Vyas
  • 633
  • 9
  • 19
0

You can do a left join of df_1 and df_2.

Given the moderate size of df1, you can just create a mapping rule:

mapping = {}
for row in df_1.iterrows():
    mapping[row[0]] = row[1]

And create a new column:

df2['AIRPORT_PROCESSED'] = df2['ORIGIN_AIRPORT'].apply(lambda x: mapping.get(x, x))

The last function will replace ORIGIN_AIRPORT value for the value from mapping if x is in mapping.

Kate Melnykova
  • 1,863
  • 1
  • 5
  • 17