I have two data frames, dfA and dfB.

I would like to extract whole rows from dfB that meet criteria based on dfA.

Example:

if (dfA$colA == dfB$colB) && (dfB$colC >= dfA$colD) && 
  (dfB$colC <= dfA$colE) { print rows from dfB }

The values from the 1st column in dfA need to be an exact match for the 2nd column in dfB

AND

the values from column 3 in dfB need to fall within a range set by columns 4 and 5 in dfA.

The output should be the rows from dfB that meet these criteria.

Bob
  • This sounds like a rolling join. Try looking at: http://stackoverflow.com/questions/24480031/roll-join-with-start-end-window. But normally questions are much easier to answer with a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output. – MrFlick Dec 09 '14 at 07:13
  • I would suggest you provide example data sets and your desired output. – David Arenburg Dec 09 '14 at 07:29

1 Answer

I am not sure about R, but I guess it must be similar to Pandas: just create three boolean masks, one for each criterion, then combine them into an overall mask.

Example: 1stBoolMask = dfB[dfA$colA == dfB$colB] -> returns something like ( 0 0 1 1 0 1 0 1 ... ), where a "1" marks each matching entry in dfB.

2ndBoolMask = ...

3rdBoolMask = ...

-> OverallMask = 1stBoolMask & 2ndBoolMask & 3rdBoolMask

Then apply this one to dfB and you should be done. The "1s" in the resulting filter represent the matching lines of dfB.
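
For what it's worth, here is a minimal R sketch of the three-mask idea described above. The sample data and the names `id`, `mask_key`, `mask_lower`, `mask_upper`, `overall_mask` and `result` are made up for illustration (the question gives no data, and R names cannot start with a digit). In R the mask is the logical comparison itself, and it is used as the row index of dfB. The sketch also assumes dfA and dfB have the same number of rows and are aligned row by row, since the comparisons are element-wise.

```r
# Hypothetical sample data; the question does not provide any.
dfA <- data.frame(colA = c("x", "y", "z"),
                  colD = c(1, 10, 100),
                  colE = c(5, 20, 200),
                  stringsAsFactors = FALSE)
dfB <- data.frame(id   = 1:3,
                  colB = c("x", "y", "q"),
                  colC = c(3, 25, 150),
                  stringsAsFactors = FALSE)

# One logical mask per criterion; row i of dfA is compared with row i of dfB.
mask_key   <- dfA$colA == dfB$colB   # exact match on the key columns
mask_lower <- dfB$colC >= dfA$colD   # colC at or above the lower bound
mask_upper <- dfB$colC <= dfA$colE   # colC at or below the upper bound

# Combine with the vectorised `&` (not `&&`, which is meant for single values).
overall_mask <- mask_key & mask_lower & mask_upper

# Index the rows of dfB with the mask to keep whole rows.
result <- dfB[overall_mask, ]
print(result)
```

With this made-up data only the first row of dfB passes all three checks; the second fails the upper bound and the third fails the key match.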

user2366975
  • Thanks for the help! Your suggestion of "1stBoolMask = dfB[dfA$colA == dfB$colB]", with a which() call added, worked a treat: newdf <- dfB[which(dfA$colA==dfB$colB & dfB$colC >= dfA$colD & dfB$colC <= dfA$colE),] – Bob Dec 09 '14 at 08:14
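
One caveat on the one-liner above: it compares row i of dfA with row i of dfB, so it relies on the two data frames lining up row by row. If instead each row of dfB should be checked against whichever dfA row shares its key, a merge followed by a range filter may be closer to what is wanted. This is only a sketch under that assumption, using base R's `merge()`; the sample data and the names `id`, `paired`, `in_range` and `result` are hypothetical.

```r
# Hypothetical sample data with different row counts in dfA and dfB.
dfA <- data.frame(colA = c("x", "y"),
                  colD = c(1, 10),
                  colE = c(5, 20),
                  stringsAsFactors = FALSE)
dfB <- data.frame(id   = 1:4,
                  colB = c("y", "x", "x", "q"),
                  colC = c(15, 3, 9, 2),
                  stringsAsFactors = FALSE)

# Pair every dfB row with the dfA row(s) that share its key.
# Rows of dfB with no matching key in dfA are dropped (inner join).
paired <- merge(dfB, dfA, by.x = "colB", by.y = "colA")

# Keep the pairs whose colC lies inside [colD, colE].
in_range <- paired$colC >= paired$colD & paired$colC <= paired$colE

# Return only the original dfB columns.
result <- paired[in_range, names(dfB)]
print(result)
```

If dfA can hold several overlapping ranges for the same key, a dfB row may appear more than once in `result`; wrapping it in `unique()` would collapse those duplicates.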