Using multiple columns in dplyr window functions?

Question

Coming from SQL, I would expect to be able to do something like the following in dplyr. Is this possible?

# R
tbl %>% mutate(n = dense_rank(Name, Email))

-- SQL
SELECT Name, Email, DENSE_RANK() OVER (ORDER BY Name, Email) AS n FROM tbl

Also, is there an equivalent of PARTITION BY?

like this `mtcars %>% mutate(n = dense_rank(interaction(cyl, hp)))`? — talat, Jan 19 '18 at 09:05
@docendodiscimus that's awesome, I had completely forgotten about `interaction()`. I had hashed the values, but that messes up the order. Is there an easy solution for using `PARTITION BY`? — CodeMonkey, Jan 19 '18 at 10:31
@CodeMonkey - If you're thinking of `PARTITION BY` to get a ranking by group, you would use `group_by` in dplyr. https://stackoverflow.com/questions/34967837/rank-variable-by-group-dplyr — Jason, Jan 19 '18 at 22:06
@Jason Awesome! It's working. Using `interaction()` with `lex.order`, one can almost simulate the OVER (ORDER BY), and `group_by` works like a charm. Thanks! — CodeMonkey, Jan 22 '18 at 08:13
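
The approach from the comments above can be sketched as follows. The sample data is hypothetical; only the column names `Name` and `Email` come from the question. With `lex.order = TRUE`, the levels of the combined factor are ordered by `Name` first and `Email` second, which is what `ORDER BY Name, Email` would do.

```r
library(dplyr)

# Hypothetical sample data; the column names follow the question.
tbl <- tibble(
  Name  = c("Bob", "Alice", "Bob", "Alice", "Alice"),
  Email = c("b@x.org", "a@y.org", "b@x.org", "a@x.org", "a@y.org")
)

# interaction() builds one factor from both columns. lex.order = TRUE makes
# the first argument (Name) the most significant sort key; with the default
# (FALSE) the level order would vary Name fastest, i.e. sort by Email first.
ranked <- tbl %>%
  mutate(n = dense_rank(interaction(Name, Email, lex.order = TRUE)))

print(ranked)
```

Identical (Name, Email) pairs share a rank, and the ranks have no gaps, as with `DENSE_RANK()`.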

Bảo Trần · Accepted Answer · 2019-07-23T13:33:00.693

0

I struggled with this problem, and here is my solution.

If you can't find a function that supports ordering by multiple variables, I suggest concatenating the variables with paste(), in priority order from left to right.

Below is the code sample:

tbl %>%
  mutate(n = dense_rank(paste(Name, Email))) %>%
  arrange(Name, Email) %>%
  view()
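
One caveat with the paste() approach, shown here as a small sketch on made-up data: paste() joins values with a space by default, so two different (Name, Email) pairs can collapse into the same string and receive the same rank. The values below are contrived purely to trigger that collision; `interaction()` is used as one possible alternative that keeps the pairs apart.

```r
library(dplyr)

# Hypothetical data built so that two different (Name, Email) pairs
# both paste to the same string, "a b c".
tbl <- tibble(
  Name  = c("a b", "a"),
  Email = c("c",   "b c")
)

collided <- tbl %>%
  mutate(
    # Both rows become "a b c", so they tie at rank 1.
    n_paste = dense_rank(paste(Name, Email)),
    # The combined factor keeps the two pairs distinct.
    n_pair  = dense_rank(interaction(Name, Email, lex.order = TRUE))
  )

print(collided)
```

If I recall the dplyr 1.1.0 documentation correctly, the ranking functions also accept a data frame, so `dense_rank(pick(Name, Email))` may be another option on recent versions.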

Moreover, group_by() plays the role of PARTITION BY in SQL: a ranking function called inside a grouped mutate() is computed separately for each group.
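
As a minimal sketch on hypothetical data, the following is roughly what `DENSE_RANK() OVER (PARTITION BY Name ORDER BY Email)` would look like in dplyr. The choice of `Name` as the partition column is only an illustration.

```r
library(dplyr)

# Hypothetical sample data; the column names follow the question.
tbl <- tibble(
  Name  = c("Bob", "Alice", "Bob", "Alice", "Alice"),
  Email = c("b@z.org", "a@y.org", "b@x.org", "a@x.org", "a@y.org")
)

partitioned <- tbl %>%
  group_by(Name) %>%               # PARTITION BY Name
  mutate(n = dense_rank(Email)) %>% # ORDER BY Email, ranked within each Name
  ungroup()                        # drop the grouping for later steps

print(partitioned)
```

The ranking restarts at 1 for every Name, and the original row order is left unchanged.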

The shortcoming of this solution is that all the variables must be ordered in the same direction. If you need to order by multiple columns in different directions, say one ascending and one descending, I suggest trying this: Calculate rank with ties based on more than one variable
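
For the mixed-direction case, one possible sketch (not necessarily what the linked answer does) is to sort with arrange() and desc(), then count how many times the (Name, Email) key changes from one row to the next. The data is hypothetical, and the sketch assumes neither column contains NA.

```r
library(dplyr)

# Hypothetical sample data; the column names follow the question.
tbl <- tibble(
  Name  = c("Bob", "Alice", "Bob", "Alice", "Alice"),
  Email = c("b@x.org", "a@y.org", "b@x.org", "a@x.org", "a@y.org")
)

# Roughly DENSE_RANK() OVER (ORDER BY Name ASC, Email DESC).
mixed <- tbl %>%
  arrange(Name, desc(Email)) %>%
  mutate(
    # TRUE on the first row and whenever either key differs from the
    # previous row. Assumes no NA values in Name or Email.
    is_new_key = row_number() == 1 | Name != lag(Name) | Email != lag(Email),
    # The running count of key changes is the dense rank.
    n = cumsum(is_new_key)
  ) %>%
  select(-is_new_key)

print(mixed)
```

Unlike the paste() version, this reorders the rows, so join the result back on the key columns if the original order matters.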

edited Jul 23 '19 at 13:33

answered Jul 19 '19 at 01:20


The example you propose differs from the one asked. Try updating it to match the example provided by the question creator. – guzmonne Jul 19 '19 at 01:44
Hi Bao, if you fix your answer I can set it as accepted. – CodeMonkey Jul 22 '19 at 14:11
