Check if value in a dataframe is between two values in another dataframe

Question

I have a pretty similiar question to another question on here.

Let's assume I have two dataframes:

df

volumne
11
24
30

df2

range_low range_high price
10         20         1
21         30         2

How can I filter the second dataframe, based for one row of the first dataframe, if the value range is true? So for example (value 11 from df) leads to:

df3

range_low range_high price
10         20         1

wherelese (value 30 from df) leads to:

df3

I am looking for a way to check, if a specific value is in a range of another dataframe, and filter the dataframe based on this condition. In none python code:

Find 11 in
   (10, 20), if True: df3 = filter on this row
   (21, 30), if True: df3= filter on this row
if not
  return empty frame

just the dataframe, the query for looping over the rows I will write on my own — PV8, Dec 19 '19 at 10:54

score 4 · Accepted Answer · answered Dec 19 '19 at 10:58

For loop solution use:

for v in df['volumne']:
    df3 = df2[(df2['range_low'] < v) &  (df2['range_high'] > v)]
    print (df3)

For non loop solution is possible use cross join, but if large DataFrames there should be memory problem:

df = df.assign(a=1).merge(df2.assign(a=1), on='a', how='outer')
print (df)
   volumne  a  range_low  range_high  price
0       11  1         10          20      1
1       11  1         21          30      2
2       24  1         10          20      1
3       24  1         21          30      2
4       30  1         10          20      1
5       30  1         21          30      2

df3 = df[(df['range_low'] < df['volumne']) &  (df['range_high'] > df['volumne'])]
print (df3)
   volumne  a  range_low  range_high  price
0       11  1         10          20      1
3       24  1         21          30      2

Thx, the line `df3 = df2[(df['range_low] < v & (df2['range_high'] > v )] ` I was looking for. — PV8, Dec 19 '19 at 11:03

score 0 · Answer 2 · answered Dec 29 '20 at 11:03

I have a similar problem (but with date ranges), and if df2 is too large, it will take for ever. If the volumes are always integers, a faster solution is to create an intermediate dataframe where you associate each possible volume to a price (in one iteration) and then merge.

price_list=[]
for index, row in df2.iterrows():
    x=pd.DataFrame(range(row['range_low'],row['range_high']+1),columns=['volume'])
    x['price']=row['price']
    price_list.append(x)
df_prices=pd.concat(price_list)

you will get something like this

    volume  price
0       10      1
1       11      1
2       12      1
3       13      1
4       14      1
5       15      1
6       16      1
7       17      1
8       18      1
9       19      1
10      20      1
0       21      2
1       22      2
2       23      2
3       24      2
4       25      2
5       26      2
6       27      2
7       28      2
8       29      2
9       30      2

then you can quickly associate associate a price to each volume in df

df.merge(df_prices,on='volume')
   volume  price
0      11      1
1      24      2
2      30      2

Check if value in a dataframe is between two values in another dataframe

2 Answers2

Linked