
I am working with the R programming language.

I have the following dataset:

id = c(1,2,3,4,5,6)
group_1 = c("a", "a", "b", "b", "b", "b")
var_1 = c(12,32,14,17,14,18)
my_data = data.frame(id, group_1, var_1)

  id group_1 var_1
1  1       a    12
2  2       a    32
3  3       b    14
4  4       b    17
5  5       b    14
6  6       b    18

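Within each "group_1", I am trying to label each unique value of "var_1" in ascending order (g1 for the smallest value, g2 for the next, and so on), so that rows with the same "var_1" in the same group share a label.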

The final output should look something like this:

  id group_1 var_1 var_2
1  1       a    12    g1
2  2       a    32    g2
3  3       b    14    g1
4  4       b    17    g2
5  5       b    14    g1
6  6       b    18    g3

I tried to do this with the following code:

library(dplyr)

my_data[order(my_data$group_1, my_data$var_1),]

my_data %>%                                        
    group_by(group_1) %>%
    dplyr::mutate(ID = cur_group_id())

But this is not producing the correct output:

# A tibble: 6 x 4
# Groups:   group_1 [2]
     id group_1 var_1    ID
  <dbl> <chr>   <dbl> <int>
1     1 a          12     1
2     2 a          32     1
3     3 b          14     2
4     4 b          17     2
5     5 b          14     2
6     6 b          18     2

Can someone please show me what I am doing wrong?

zx8754
stats_noob

1 Answer


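Your attempt is very close! cur_group_id() returns one identifier per group, so every row in a group gets the same value. Instead, you could use dense_rank() from dplyr, which ranks var_1 within each group in ascending order and gives tied values the same rank without skipping any: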

my_data |> 
  group_by(group_1) |> 
  mutate(
    id = paste0("g", dense_rank(var_1))
  )

#> # A tibble: 6 × 3
#> # Groups:   group_1 [2]
#>   id    group_1 var_1
#>   <chr> <chr>   <dbl>
#> 1 g1    a          12
#> 2 g2    a          32
#> 3 g1    b          14
#> 4 g2    b          17
#> 5 g1    b          14
#> 6 g3    b          18

Created on 2023-01-05 with reprex v2.0.2
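
Note that the snippet above writes the labels into id, which replaces the original id column. A minimal sketch of a variant that keeps id and stores the labels in a new var_2 column, as in the desired output from the question, might look like this. It assumes R 4.1 or later for the native pipe; the names result and base_var_2 are only illustrative, and the base R line is included purely as a cross-check of the same idea without dplyr:

```r
library(dplyr)

# Same data as in the question
my_data <- data.frame(
  id      = c(1, 2, 3, 4, 5, 6),
  group_1 = c("a", "a", "b", "b", "b", "b"),
  var_1   = c(12, 32, 14, 17, 14, 18)
)

result <- my_data |>
  group_by(group_1) |>
  # dense_rank() ranks in ascending order within each group;
  # tied values share a rank and no ranks are skipped
  mutate(var_2 = paste0("g", dense_rank(var_1))) |>
  # drop the grouping so later operations are not applied per group
  ungroup()

# Base R cross-check: position of each value among the sorted
# unique values of its own group
base_var_2 <- paste0(
  "g",
  ave(my_data$var_1, my_data$group_1,
      FUN = function(x) match(x, sort(unique(x))))
)

result$var_2
base_var_2
```

Both vectors should contain the same labels. The ungroup() call is optional, but it avoids carrying the grouping into later steps by accident.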

Peter H.