Say that I have these data:

data <- data.frame(orig=c(1,5,5,5,14,18,18,25))

  orig
1    1
2    5
3    5
4    5
5   14
6   18
7   18
8   25

I would like to create the want column:

  orig want
1    1    1
2    5    5
3    5    6
4    5    7
5   14   14
6   18   18
7   18   19
8   25   25

This column copies the value of orig but breaks ties where they exist. I am trying to re-create the rankings so that there are no ties, with ties broken by the order of the rows in the dataset. If not for the gaps in the rankings (the jump from 1 to 5, etc.), I could use

library(tidyverse)
data %>% mutate(test = rank(orig, ties.method="min"))

But this of course doesn't get me what I want:

  orig test
1    1    1
2    5    2
3    5    2
4    5    2
5   14    5
6   18    6
7   18    6
8   25    8

What can I do?

    Related: [Increment by one to each duplicate value](https://stackoverflow.com/questions/43196718/increment-by-one-to-each-duplicate-value). Can you have something like `c(1, 1, 1, 2)`? If not, you can find the answers in the link. – Henrik Aug 16 '21 at 21:52

2 Answers

We can group by orig and add row_number(), subtracting 1 so that the first occurrence of each value keeps its original value

library(dplyr)
data %>%
     group_by(orig) %>%                           # one group per distinct value
     mutate(want = orig + row_number() - 1) %>%   # offset 0, 1, 2, ... within each group
     ungroup()

-output

# A tibble: 8 x 2
   orig  want
  <dbl> <dbl>
1     1     1
2     5     5
3     5     6
4     5     7
5    14    14
6    18    18
7    18    19
8    25    25

Or we can simplify with rowid() from data.table, which avoids the explicit grouping

library(data.table)
data %>% 
     mutate(want = orig + rowid(orig)-1)
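
To make the offset visible, here is a minimal sketch of the intermediate values on the question's data. The vector names `counter` and `want` are only illustrative.

```r
library(data.table)

orig <- c(1, 5, 5, 5, 14, 18, 18, 25)

# rowid() numbers each occurrence of a value in order of appearance
counter <- rowid(orig)
counter
# 1 1 2 3 1 1 2 1

# Subtracting 1 lets the first occurrence keep its original value
want <- orig + counter - 1
want
# 1 5 6 7 14 18 19 25
```

The same counter is what row_number() produces inside group_by(orig).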

A base R option using ave() with seq_along

transform(
  data,
  want = orig + ave(orig, orig, FUN = seq_along) - 1
)

gives

  orig want
1    1    1
2    5    5
3    5    6
4    5    7
5   14   14
6   18   18
7   18   19
8   25   25
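
The comment under the question asks about input like `c(1, 1, 1, 2)`. Adding a per-value counter would give 1, 2, 3, 2 there, which creates a new tie. If that case can occur, one possible sketch is a running maximum that pushes each value past the previous result. The helper name `break_ties_increasing` is hypothetical, and the sketch assumes the input is already sorted in ascending order.

```r
# Hypothetical helper, not part of the answers above.
# Assumes `x` is sorted in ascending order.
break_ties_increasing <- function(x) {
  # Each result is the current value, or the previous result plus 1 if that is larger
  Reduce(function(prev, cur) max(cur, prev + 1), x, accumulate = TRUE)
}

orig <- c(1, 5, 5, 5, 14, 18, 18, 25)
break_ties_increasing(orig)
# 1 5 6 7 14 18 19 25

# The case from the comment
break_ties_increasing(c(1, 1, 1, 2))
# 1 2 3 4
```

On the question's data this agrees with the counter-based answers. It differs only when a run of ties is longer than the gap to the next value.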