The code below should group the data by `group` and then add two new columns holding the first and last value within each group.

library(dplyr)

set.seed(123)

d <- data.frame(
    group = rep(1:3, each = 3),
    year = rep(seq(2000,2002,1),3),
    value = sample(1:9, replace = TRUE))

d %>% 
    group_by(group) %>%
    mutate(
        first = dplyr::first(value),
        last = dplyr::last(value)
    )

However, it does not work as expected. The result should be:

  group  year value first  last
  <int> <dbl> <int> <int> <int>
1     1  2000     3     3     4
2     1  2001     8     3     4
3     1  2002     4     3     4
4     2  2000     8     8     1
5     2  2001     9     8     1
6     2  2002     1     8     1
7     3  2000     5     5     5
8     3  2001     9     5     5
9     3  2002     5     5     5

Yet, I get this (it takes the first and the last value over the entire data frame, not just the groups):

  group  year value first  last
  <int> <dbl> <int> <int> <int>
1     1  2000     3     3     5
2     1  2001     8     3     5
3     1  2002     4     3     5
4     2  2000     8     3     5
5     2  2001     9     3     5
6     2  2002     1     3     5
7     3  2000     5     3     5
8     3  2001     9     3     5
9     3  2002     5     3     5
zx8754
phillyooo
  • It works for me: I get a column with the first value by group and one with the last value by group. – Jaap Mar 07 '17 at 17:13
  • Could you show the version of `dplyr` – akrun Mar 07 '17 at 17:14
  • Do you want `summarize` instead of mutate? – Bishops_Guest Mar 07 '17 at 17:16
  • My guess is a [duplicate of this](http://stackoverflow.com/a/26106218/903061), that you are inadvertently using `plyr::mutate` instead of `dplyr::mutate`. However "*does not work as intended*" is so vague of a description that it's impossible to know... – Gregor Thomas Mar 07 '17 at 17:22
  • thanks all! @Gregor that solved the issue! also, i've updated the question to be more precise wrt expected result vs. actual result. – phillyooo Mar 08 '17 at 14:37

3 Answers

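Calling `dplyr::mutate()` explicitly did the trick. As Gregor suggested in the comments, the unqualified `mutate()` was resolving to `plyr::mutate()` (presumably because plyr was attached after dplyr), and `plyr::mutate()` ignores the grouping.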

d %>% 
    group_by(group) %>%
    dplyr::mutate(
        first = dplyr::first(value),
        last = dplyr::last(value)
    )
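
To check whether this kind of masking is happening in your own session, a minimal sketch along these lines may help. It uses only base R and utils functions; the variable names `mutate_source` and `mutate_providers` are made up for illustration.

```r
library(dplyr)

# Name of the package whose `mutate` an unqualified call would use right now.
# If plyr had been attached after dplyr, this would report "plyr" instead.
mutate_source <- environmentName(environment(mutate))
mutate_source

# Every attached package that defines `mutate`, in search-path order;
# the first entry is the one an unqualified call picks up.
mutate_providers <- find("mutate")
mutate_providers
```

If more than one package shows up, either qualify the call as `dplyr::mutate()` as above, or attach plyr before dplyr so that dplyr's version ends up first on the search path.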
phillyooo

You can also use `summarise()` from dplyr to get the first and last value of each group, then join the result back onto the original data:

d %>% 
    group_by(group) %>% 
    summarise(first_value = first(na.omit(value)),
              last_value = last(na.omit(value))) %>% 
    left_join(d, ., by = 'group')
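
For comparison, here is a small sketch of what `summarise()` returns on its own, before the join. The data has the same shape as in the question, but the values are written out explicitly (an assumption made here so the result does not depend on the random seed), and `per_group` is just an illustrative name.

```r
library(dplyr)

# Same layout as the question's data, with fixed values instead of sample().
d <- data.frame(
    group = rep(1:3, each = 3),
    year  = rep(2000:2002, 3),
    value = c(3, 8, 4, 8, 9, 1, 5, 9, 5)
)

# summarise() collapses each group to a single row,
# whereas mutate() keeps all nine rows.
per_group <- d %>%
    group_by(group) %>%
    dplyr::summarise(
        first_value = dplyr::first(value),
        last_value  = dplyr::last(value)
    )

per_group
```

This gives three rows, one per group, which is why the `left_join()` is needed if you want the values repeated on every original row.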
Arun kumar mahesh

If you are from the future and dplyr no longer supports `first()` and `last()`, or you simply want a future-proof solution, you can index the column directly, as you would a list:

d %>% 
    group_by(group) %>% 
    mutate(
        first = value[[1]], 
        last = value[[length(value)]]
    )
# A tibble: 9 × 5
# Groups:   group [3]
  group  year value first  last
  <int> <dbl> <int> <int> <int>
1     1  2000     3     3     4
2     1  2001     8     3     4
3     1  2002     4     3     4
4     2  2000     8     8     1
5     2  2001     9     8     1
6     2  2002     1     8     1
7     3  2000     5     5     5
8     3  2001     9     5     5
9     3  2002     5     5     5
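
The same indexing idea can also be sketched without dplyr at all, using `ave()` from base R. This is not part of the answer above, just one possible variant; the data values are written out explicitly so the result does not depend on the random seed.

```r
# Same layout as the question's data, with fixed values instead of sample().
d <- data.frame(
    group = rep(1:3, each = 3),
    year  = rep(2000:2002, 3),
    value = c(3, 8, 4, 8, 9, 1, 5, 9, 5)
)

# ave() applies FUN within each level of `group` and fills the
# result back in for every row of that group.
d$first <- ave(d$value, d$group, FUN = function(v) v[1])
d$last  <- ave(d$value, d$group, FUN = function(v) v[length(v)])

d
```

Because nothing here depends on which `mutate()` is on the search path, this variant also sidesteps the plyr/dplyr masking problem from the question.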
user438383