I have the following data frame:

Claim failure Part csp_rate sml_rate
23     F1     P1    45      65
45     F2     P4    80      56
55     F3     P4    45      82
65     F1     P4    75      77

and another data frame, whose `similarity` column holds values that exactly match column names of the first data frame:

Threshold  weight similarity
70         1      csp_rate
75         2      sml_rate

I want my final data frame to look like the following:

Claim failure Part csp_rate sml_rate  weight
23     F1     P1    45      65         0
45     F2     P4    80      56         1
55     F3     P4    45      82         2
65     F1     P4    75      77         3

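Here `weight` is the sum of the weights from the second data frame for every `similarity` column whose value in the first data frame is >= the corresponding `Threshold`. For example, the last row has csp_rate 75 >= 70 (weight 1) and sml_rate 77 >= 75 (weight 2), so its weight is 3.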
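To make the rule above concrete, here is a minimal base R sketch of it. The names `df1` and `df2` are my own labels for the two data frames shown above, and this is only one possible way to express the calculation, not a tested answer for larger data.

```r
# Example data copied from the question; df1/df2 are assumed names.
df1 <- data.frame(
  Claim    = c(23, 45, 55, 65),
  failure  = c("F1", "F2", "F3", "F1"),
  Part     = c("P1", "P4", "P4", "P4"),
  csp_rate = c(45, 80, 45, 75),
  sml_rate = c(65, 56, 82, 77),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Threshold  = c(70, 75),
  weight     = c(1, 2),
  similarity = c("csp_rate", "sml_rate"),
  stringsAsFactors = FALSE
)

# For each row of df2, compare the df1 column it names against its
# threshold; a passing row contributes that weight, a failing row 0.
contrib <- Map(
  function(col, thr, w) (df1[[col]] >= thr) * w,
  as.character(df2$similarity), df2$Threshold, df2$weight
)

# Add the per-rule contributions element-wise to get one total per claim.
df1$weight <- Reduce(`+`, contrib)
print(df1)
```

Because the column to test is looked up by name with `df1[[col]]`, adding another row to `df2` for a further rate column should not require changing the code, as long as that column exists in `df1`.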

Can anyone help? I have been stuck on this for a long time.

The proposed duplicate question is about joining two data frames with dplyr::left_join() where the join condition is an inequality (<= and >). That is not the same as my case above.

  • ...almost, but there's the variable column to join on too. If I'm understanding correctly the column to join on in `df1` depends on the `similarity` entry in `df2`, which is more complicated than the proposed duplicate. – Gregor Thomas Feb 25 '19 at 18:46
  • Seems like transforming `df1` to long would reduce the problem to the dupe (and aggregating/transforming back to wide after the non-equi join). – Gregor Thomas Feb 25 '19 at 18:47
  • Yes, Gregor! So basically for columns in df1 (csp_rate and sml_rate) I need to retrieve threshold against these rates which are values in similarity columns and then compute weight. I know its kinda complex to explain :( – Rahul Rajaram Feb 25 '19 at 18:51
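
The reshape-to-long idea from the comments could be sketched roughly as follows in base R. This is an illustration under assumptions: `df1` and `df2` are my names for the two data frames, `Claim` is assumed to identify each row uniquely, and a plain `merge()` on the column name followed by a comparison stands in for a true non-equi join.

```r
# Example data copied from the question; df1/df2 are assumed names.
df1 <- data.frame(
  Claim    = c(23, 45, 55, 65),
  failure  = c("F1", "F2", "F3", "F1"),
  Part     = c("P1", "P4", "P4", "P4"),
  csp_rate = c(45, 80, 45, 75),
  sml_rate = c(65, 56, 82, 77),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Threshold  = c(70, 75),
  weight     = c(1, 2),
  similarity = c("csp_rate", "sml_rate"),
  stringsAsFactors = FALSE
)

# Reshape the rate columns to long form: one row per claim and rate name.
rate_cols <- as.character(df2$similarity)
long <- data.frame(
  Claim      = rep(df1$Claim, times = length(rate_cols)),
  similarity = rep(rate_cols, each = nrow(df1)),
  value      = unlist(df1[rate_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Attach the threshold and weight for each rate name.
joined <- merge(long, df2, by = "similarity")

# Keep the weight only where the value reaches the threshold.
joined$weight <- ifelse(joined$value >= joined$Threshold, joined$weight, 0)

# Sum the weights per claim and join the totals back to the wide data.
totals <- aggregate(weight ~ Claim, data = joined, FUN = sum)
result <- merge(df1, totals, by = "Claim")
print(result)
```

Note that `merge()` sorts the result by `Claim`, so the row order may differ from the input if the claims are not already sorted.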
