Pandas merge two dataframes based on one column from one table lies in between two columns from another table

Question

I have a dataframe with ip address in one of the column and I want to add a new column called "country" from another dataframe based on the location of ip address that lies between lower and upper ip address.

Two dataframes

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'ip': [0.1,2.5,3.5]})

df2 = pd.DataFrame({'low_ip': [3,2,7,10],
                   'high_ip': [5,3,9,11],
                   'country': ['A','B','A','C']})

print(df1)
    ip
0  0.1
1  2.5
2  3.5

print(df2)
   low_ip  high_ip country
0       3        5       A
1       2        3       B
2       7        9       A
3      10       11       C

Required

ip    country
0.1   NA
2.5   B   because: 2 <= 2.5 <= 3
3.5   A   because: 3 <= 3.5 <= 5

This is just an example, actually dataframes contains lots of other columns as well. — BhishanPoudel, Dec 28 '19 at 15:42
I haven't tested performance or anything, but [this answer](https://stackoverflow.com/a/44601120/2715819) to a pretty much identical question gives another method that might be better for large data sets, although it does re-index `df1` — RishiG, Dec 28 '19 at 16:19

BhishanPoudel · Accepted Answer · 2019-12-28T15:54:03.813

1

Quick and dirty way:

countries = []
for i in range(len(df1)):
    ip = df1.loc[i, 'ip']
    country = df2.query("low_ip <= @ip <= high_ip")['country'].to_numpy()

    if len(country) > 0:
        countries.append(country[0])
    else:
        countries.append('NA')

df1['country'] = countries

print(df1)

    ip country
0  0.1      NA
1  2.5       B
2  3.5       A

edited Dec 28 '19 at 15:54

answered Dec 28 '19 at 15:43

BhishanPoudel

15,974
21
108
169

Pandas merge two dataframes based on one column from one table lies in between two columns from another table

Two dataframes

Required

1 Answers1