Subset data frame based on range of values in second data frame

Question

I am trying to create a subset of a data frame based on a range surrounding the values of a second data frame. I've been researching but cannot figure out how to go about it. I've used dummy data here, as both are large datasets with many columns.

Data Frame 1 (df1) has 50 columns and thousands of recordings at different latitudes:

Recording	Latitude
BombusL	51.41
ApisM	51.67
BombusR	51.34

Data Frame 2 (df2) has several hundred towns, all at different latitudes. It is significantly smaller than df1:

Town	Lat
Bristol	51.40
Merton	51.42
Horsham	51.33

I need a subset of df1 that only includes rows with latitudes within 0.01 of a latitude in df2. So the code needs to go through every row of df1 and test its latitude against every row of df2. The output would include only rows from df1 where the Latitude value is within 0.01 of a value in df2$Lat.

From the example, the following lines would be included

Recording	Latitude
BombusL	51.41
BombusR	51.34

I have the start of the code to do a filter that I could then run through the data frame to create the subset

LatFil <- df1$Latitude %in% df2$Lat

But I can't figure out how to add the logical test of ± 0.01 around the values in df2$Lat.

akrun · Accepted Answer · 2021-04-28T18:54:53.787

4

When precision is involved (adding or subtracting 0.01 produces floating-point numbers), it may be better to use comparison operators instead of exact matching:

subset(df1, (Latitude >= (df2$Lat - 0.01)) & 
         (Latitude <= (df2$Lat + 0.01)))
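Note that the comparison above is vectorized element by element (with recycling when the lengths differ), so it pairs row i of df1 with row i of df2. To test each df1 latitude against every df2 latitude, as the question describes, one possible sketch uses outer() to build the full matrix of differences. The names tol, diffs, keep and result are made up for illustration, and the small slack added to the tolerance is an assumption so that values exactly 0.01 apart are not dropped by floating-point rounding.

```r
# Dummy data from the question
df1 <- data.frame(Recording = c("BombusL", "ApisM", "BombusR"),
                  Latitude  = c(51.41, 51.67, 51.34))
df2 <- data.frame(Town = c("Bristol", "Merton", "Horsham"),
                  Lat  = c(51.40, 51.42, 51.33))

# Tolerance plus a tiny slack (an assumption) to absorb floating-point error
tol <- 0.01 + 1e-9

# Absolute differences: one row per df1 row, one column per df2 row
diffs <- abs(outer(df1$Latitude, df2$Lat, "-"))

# Keep df1 rows that fall within tol of at least one df2 latitude
keep <- rowSums(diffs <= tol) > 0
result <- df1[keep, ]
```

With thousands of rows in df1 and a few hundred in df2, the intermediate matrix should stay small enough to hold in memory; for much larger inputs a non-equi join or interval-based approach may be a better fit.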

edited Apr 28 '21 at 18:54

answered Apr 28 '21 at 18:26

akrun


score 2 · Answer 2 · answered Apr 28 '21 at 18:32


Another option:

df2$Lat_hi <- df2$Lat + 0.01
df2$Lat_lo <- df2$Lat - 0.01


LatFil <- df1[df1$Latitude %in% c(df2$Lat, df2$Lat_hi, df2$Lat_lo),]
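Because %in% matches exactly, the lookup above can miss values when Lat + 0.01 is not stored as exactly the same floating-point number as the recorded latitude. A possible variant, assuming latitudes are recorded to two decimal places, is to do the matching in whole hundredths of a degree. The names lat1, lat2 and targets are hypothetical.

```r
# Dummy data from the question
df1 <- data.frame(Recording = c("BombusL", "ApisM", "BombusR"),
                  Latitude  = c(51.41, 51.67, 51.34))
df2 <- data.frame(Town = c("Bristol", "Merton", "Horsham"),
                  Lat  = c(51.40, 51.42, 51.33))

# Convert to whole hundredths of a degree
# (assumes latitudes are recorded to two decimal places)
lat1 <- round(df1$Latitude * 100)
lat2 <- round(df2$Lat * 100)

# Every hundredth that lies within +/- 1 of a town latitude
targets <- unique(c(lat2 - 1, lat2, lat2 + 1))

# Exact matching is safe here because all values are whole numbers
LatFil <- df1[lat1 %in% targets, ]
```

If the real data carries more than two decimals, the rounding step changes which rows count as "within 0.01", so a comparison-based filter is likely the safer choice there.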


Alec B

