Get last row of each group in R

Question

I have some data similar in structure to:

a <- data.frame("ID" = c("A", "A", "B", "B", "C", "C"),
                "NUM" = c(1, 2, 4, 3, 6, 9),
                "VAL" = c(1, 0, 1, 0, 1, 0))

And I am trying to sort it by ID and NUM then get the last row. This code works to get the last row and summarize down to a unique ID, however, it doesn't actually get the full last row like I want.

a <- a %>% arrange(ID, NUM) %>%
  group_by(ID) %>%
  summarise(max(NUM))

I understand why this code doesn't work but am looking for the dplyr way of getting the last row for each unique ID

Expected Results:

  ID        NUM     VAL
  <fct    <dbl>    <dbl>
1 A           2       0
2 B           4       1
3 C           9       0

Note: I will admit that though it is nearly a duplicate of Select first and last row from grouped data, the answers on that thread were not quite what I was looking for.

tmfmnk · Answer 1 · 2021-10-29T07:52:57.827

16

One dplyr option could be:

a %>%
 arrange(ID, NUM) %>% 
 group_by(ID) %>% 
 summarise_all(last)

  ID      NUM   VAL
  <fct> <dbl> <dbl>
1 A        2.    0.
2 B        4.    1.
3 C        9.    0.

Or since dplyr 1.0.0:

a %>%
 arrange(ID, NUM) %>% 
 group_by(ID) %>% 
 summarise(across(everything(), last))

Or using slice_max():

a %>%
    group_by(ID) %>%
    slice_max(order_by = NUM, n = 1)

edited Oct 29 '21 at 07:52

answered Dec 04 '18 at 20:20

tmfmnk

38,881
4
47
67

score 15 · Accepted Answer · edited Dec 04 '18 at 20:25

15

You might try:

a %>% 
  group_by(ID) %>% 
  arrange(NUM) %>%  
  slice(n())

edited Dec 04 '18 at 20:25

Bear

662
1
5
20

answered Dec 04 '18 at 20:17

joran

169,992
32
429
468

12b345b6b78 · Answer 3 · 2018-12-04T20:48:43.360

4

tail() returns the last 6 items of a subsettable object. When using aggregate(), the parameters to the FUN argument are passed immediately after the function using a comma; here 1 refers to n = 1, which tells tail() to only return the last item.

aggregate(a[, c('NUM', 'VAL')], list(a$ID), tail, 1)

# Group.1 NUM VAL
# 1       A   2   0
# 2       B   3   0
# 3       C   9   0

edited Dec 04 '18 at 20:48

answered Dec 04 '18 at 20:18

12b345b6b78

995
5
16

2

Would be helpful to wrap this in some explanation of what this does, and why it solves the OPs problem. – SiKing Dec 04 '18 at 20:41

score 0 · Answer 4 · answered Dec 04 '18 at 20:19

You can use top_n. (grouping already sorts by ID, and sorting by NUM isn't necessary since there's only 1 value)

library(dplyr)

a %>% 
  group_by(ID) %>% 
  top_n(1, NUM)

# # A tibble: 3 x 3
# # Groups:   ID [3]
#   ID      NUM   VAL
#   <fct> <dbl> <dbl>
# 1 A         2     0
# 2 B         4     1
# 3 C         9     0

Get last row of each group in R

4 Answers4

Linked

Related