-1

I have this mapping table.

fli.1 = c(0, 700, 1500, 2000, 3500, 5600, 11000, 20000, 50000)
fli.2 = c(0, 200, 1000, 2900, 3000, 5500, 10000, 20000, 50000)
fli.3 = c(0, 600, 1200, 2500, 3900, 5000, 10000, 22000, 58000)
fli.4 = c(0, 300, 1000, 2000, 3000, 5000, 14000, 20000, 50000)
fli.5 = c(0, 500, 1000, 2000, 3500, 5000, 10000, 20000, 50000)

df <- data.frame(xt    = c(rep("a",9),rep("b",9),rep("c",9)),
                 fli.1 = rep(fli.1,3),
                 fli.2 = rep(fli.2,3),
                 fli.3 = rep(fli.3,3),
                 fli.4 = rep(fli.4,3),
                 fli.5 = rep(fli.5,3));

and another dataframe which is.


d <- data.frame(xt=rep(c("a","b","c"),4),value = c(0, 2000, 5000, 6500, 40000, 60000, 400, 200, 40, 7899, 1000, 1500), fli = c("fli.1", "fli.1", "fli.4", "fli.5", "fli.2", "fli.2", "fli.5", "fli.1", "fli.2", "fli.2", "fli.3", "fli.4"));

Requirement is creating additional column to the right of dataframe d. such that.

d$value is 0,2000,5000,5000,20000, ... d$row is 1,13,24,6,17, ...

for e.g. first row in d is for (a, 0 ,fli.1) so look in fli.1 column and for a and the value should be less than or equal to 0. which would be (0,1)

another example for fifth row in d which is (b,40000,fli.2) so look in fli.2 column and for b and the value less than 40000. answer would be (20000,17).

this is just a representation with less data in actual the d dataframe is around 2million. Also the in the table df values are not repeated for a,b and c. it can be different. I am looking for a code from data.table package.

Mohit
  • 51
  • 4
  • 2
    Please explain the differences between this question and your [last](https://stackoverflow.com/q/76922579/3358272) [two](https://stackoverflow.com/q/76923642/3358272) questions. – r2evans Aug 18 '23 at 15:35
  • 1
    this has one more classification a, b and c. – Mohit Aug 18 '23 at 15:37
  • r2evans, I loved your code in the past. if something similiar can be done here. please help. Hope you understand what I am trying to ask here. I tried making changes to code which you posted earlier to achieve the requirement but couldnt do. – Mohit Aug 18 '23 at 15:41
  • 1
    Two things: (1) your questions look nearly identical, so please when you spiral this dev into another question, be _explicit_ about the differences, and how the answer(s) from previous questions did not work. (2) You aren't [accepting](https://stackoverflow.com/help/someone-answers) many answers of late, is there a reason? – r2evans Aug 18 '23 at 15:43
  • I am new to this platform. let me do it right away. – Mohit Aug 18 '23 at 15:47
  • 1
    You've been on the platform for over 3 years, and you had accepted answers in the beginning and recently. I felt it was safe to interpret that as intentionally holding back, and for clarity, _that is your right_. Pair it with asking nearly-identical questions, though, it starts to look like an anti-pattern. I understand now that it was not intentional, my apologies for the "slap". (I encourage you to look at all of the [other questions](https://stackoverflow.com/users/13103354/mohit?tab=questions) you've asked and accept one from each, not just my answers. Others deserve credit, too.) – r2evans Aug 18 '23 at 16:06
  • 1
    No worries r2evans however I dont know what exactly you mean by slap. But yes 3 years ago I was using R and only now recently around 6 months ago I started using it heavily. as per your comments this is what I will do now. look back and mark answered for all my asked questions or write comments if not. Do you have any other concerns with me? – Mohit Aug 18 '23 at 16:18
  • Not at all! Thanks for the discourse Mohit! (There are a few reasons I'm being careful here: asking to accept "my answer" is self-serving, that's not my intent, and it is actively discouraged; asking why a user chooses to not accept answers in general can be acceptable, and my intent is to inform or remind users that the "norm" on Stack sites is to accept and optionally vote when answers warrant, neither is required but both strongly encouraged by the site. Me reminding you what you had once known could have been a "slap", that was my reasoning and why the apology. Thanks!) – r2evans Aug 18 '23 at 16:25

1 Answers1

1

New approach: melt df, convert both to data.table, and do a non-equi join.

library(data.table)
setDT(df)
df[, i := .I] # needed to preserve the original row reference
dfmelted <- melt(df, id.vars = c("xt", "i"), variable.name = "fli", variable.factor = FALSE, value.name = "dfval")
setorder(dfmelted, xt, i, fli, dfval)
dfmelted
#          xt     i    fli dfval
#      <char> <int> <char> <num>
#   1:      a     1  fli.1     0
#   2:      a     1  fli.2     0
#   3:      a     1  fli.3     0
#   4:      a     1  fli.4     0
#   5:      a     1  fli.5     0
#   6:      a     2  fli.1   700
#   7:      a     2  fli.2   200
#   8:      a     2  fli.3   600
#   9:      a     2  fli.4   300
#  10:      a     2  fli.5   500
#  ---                          
# 126:      c    26  fli.1 20000
# 127:      c    26  fli.2 20000
# 128:      c    26  fli.3 22000
# 129:      c    26  fli.4 20000
# 130:      c    26  fli.5 20000
# 131:      c    27  fli.1 50000
# 132:      c    27  fli.2 50000
# 133:      c    27  fli.3 58000
# 134:      c    27  fli.4 50000
# 135:      c    27  fli.5 50000

Now that we have this, we can do a non-equi merge/join1:

setDT(d)
d[dfmelted, c("i", "val") := .(i.i, i.dfval), on = .(xt, fli, value >= dfval)]
#         xt value    fli     i   val
#     <char> <num> <char> <int> <num>
#  1:      a     0  fli.1     1     0
#  2:      b  2000  fli.1    13  2000
#  3:      c  5000  fli.4    24  5000
#  4:      a  6500  fli.5     6  5000
#  5:      b 40000  fli.2    17 20000
#  6:      c 60000  fli.2    27 50000
#  7:      a   400  fli.5     1     0
#  8:      b   200  fli.1    10     0
#  9:      c    40  fli.2    19     0
# 10:      a  7899  fli.2     6  5500
# 11:      b  1000  fli.3    11   600
# 12:      c  1500  fli.4    21  1000

Notes:

  1. Merge/join can be simple, but the range-based non-equality join gets a little more complicated. Some links that discuss the basic concepts: How to join (merge) data frames (inner, outer, left, right), What's the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN and FULL JOIN?, (data.table) Left join using data.table
r2evans
  • 141,215
  • 6
  • 77
  • 149
  • this doesnt gives the solution as desired. I need the additional columns in d as 0,2000,5000,5000,20000... 1,13,24,6,17... – Mohit Aug 18 '23 at 15:47
  • Once this is solved then I will mark it answered. – Mohit Aug 18 '23 at 15:55
  • @Mohit, see my edit. – r2evans Aug 18 '23 at 16:03
  • > dfmelted <- melt(df, id.vars = c("xt", "i"), variable.name = "fli", variable.factor = FALSE, value.name = "dfval") Error in melt.data.table(df, id.vars = c("xt", "i"), variable.name = "fli", : One or more values in 'id.vars' is invalid. – Mohit Aug 18 '23 at 16:07
  • oops, stby ...... missed a code paste, see my edit. Long story short, run `df[, i := .I] ` before melting. – r2evans Aug 18 '23 at 16:08