0

Currently I'm using this code in order to subset unique rows grouped by Loan_ID where Execution_Date is the closest to predetermined date (in my case "2022-03-31")

The example is as follows:

library(dplyr)
df %>%
   group_by(Loan_ID) %>% 
   slice(which.min(abs(Execution_Date - as.Date("2022-03-31")))) %>%
   ungroup

The problem is that if I implement this code is sparklyr I do get an error: "Slice() is not supported on database backends" (this is because no alternative of slice() function is around in SQL)

How can I deal with this problem?

Thank you in advance!

mabramov
  • 161
  • 7
  • Have you seen [this post](https://stackoverflow.com/questions/59217666/dbplyr-dplyr-and-functions-with-no-sql-equivalents-eg-slice)? – Maurits Evers Jun 28 '22 at 04:49
  • try `df |> group_by(Loan_ID) |> filter(Execution_Date == max(Execution_Date))` –  Jun 28 '22 at 04:56
  • Use base R, `by(df, df$Loan_ID, \(x) x[which.min(abs(x$Execution_Date - as.Date("2022-03-31"))), ]) |> do.call(what=rbind)` or `do.call(rbind, by(df, df$Loan_ID, function(x) x[which.min(abs(x$Execution_Date - as.Date("2022-03-31"))), ]))`. – jay.sf Jun 28 '22 at 04:56

0 Answers0