
In a data.frame containing information on two parameters (date and station), I would like to label each unique combination of the two in a new column.

What I have:

df
       date station  
1    april     GF3   
2 december     GF1    
3    april     GF2   
4    april     GF3     
5 december     GF1

What I want:

df2
       date station   Label
1    april     GF3      1
2 december     GF1      2 
3    april     GF2      3
4    april     GF3      1  
5 december     GF1      2 

Thanks!

3 Answers


Paste the values together, then use match() with unique() to give each combination a unique group number, assigned in order of first appearance.

vals <- paste(df$date, df$station)
df$label <- match(vals, unique(vals))

#      date station label
#1    april     GF3     1
#2 december     GF1     2
#3    april     GF2     3
#4    april     GF3     1
#5 december     GF1     2
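
For anyone who wants to run this from scratch, here is a minimal base R sketch. It rebuilds the example data from the question and adds an equivalent factor-based formulation; the name label_factor is only introduced here for illustration.

```r
# Rebuild the example data from the question
df <- data.frame(
  date    = c("april", "december", "april", "april", "december"),
  station = c("GF3", "GF1", "GF2", "GF3", "GF1")
)

# Combine the two columns into one key per row
vals <- paste(df$date, df$station)

# unique() keeps first-appearance order, so match() returns 1 for the
# first distinct key, 2 for the second, and so on
df$label <- match(vals, unique(vals))

# Equivalent formulation: a factor whose levels are in first-appearance order
label_factor <- as.integer(factor(vals, levels = unique(vals)))
```

Note that paste() separates the values with a space by default, so this assumes neither column contains spaces that could make two different combinations collapse into the same key.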

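If the exact numbering of the labels is not important, you can also use cur_group_id() in dplyr. It numbers the groups in the sorted order of the grouping columns, so the labels may differ from the ones shown above.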

library(dplyr)
df %>% group_by(date, station) %>% mutate(label = cur_group_id()) %>% ungroup()
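
To make the difference in numbering concrete, here is a small sketch, assuming dplyr 1.0.0 or later (where cur_group_id() is available). The names sorted_ids, first_seen_ids and key are only introduced here for illustration.

```r
library(dplyr)

df <- data.frame(
  date    = c("april", "december", "april", "april", "december"),
  station = c("GF3", "GF1", "GF2", "GF3", "GF1")
)

# cur_group_id() follows the sorted order of the grouping keys:
# (april, GF2) = 1, (april, GF3) = 2, (december, GF1) = 3
sorted_ids <- df %>%
  group_by(date, station) %>%
  mutate(label = cur_group_id()) %>%
  ungroup()

# Labels in order of first appearance, without grouping
first_seen_ids <- df %>%
  mutate(key   = paste(date, station),
         label = match(key, unique(key))) %>%
  select(-key)
```

Here sorted_ids$label is 2, 3, 1, 2, 3, while first_seen_ids$label is 1, 2, 3, 1, 2 as asked for in the question.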
Ronak Shah

dense_rank() from dplyr will also work:

library(dplyr)
df %>% mutate(Label = dense_rank(paste(date, station)))

      date station Label
1    april     GF3     2
2 december     GF1     3
3    april     GF2     1
4    april     GF3     2
5 december     GF1     3

Note, however, that the labels follow the alphabetical order of the pasted values, not the order in which the combinations first appear.

AnilGoyal

A dplyr approach with a left_join:

library(dplyr)
library(tibble)  # tribble(), rowid_to_column()

d <- tribble(~date, ~station,
             "april","GF3",
             "december","GF1",    
             "april","GF2",   
             "april","GF3", 
             "december","GF1")

d %>% left_join(
  d %>% distinct(date, station) %>% 
    rowid_to_column(),
  by = c("station", "date")
)

Which results in:

  date     station rowid
  <chr>    <chr>   <int>
1 april    GF3         1
2 december GF1         2
3 april    GF2         3
4 april    GF3         1
5 december GF1         2
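
If the new column should be called Label, as in the question, rowid_to_column() takes the column name through its var argument. A minimal sketch, assuming dplyr and tibble are installed; the names keys and result are only introduced here for illustration.

```r
library(dplyr)
library(tibble)

d <- tribble(~date,      ~station,
             "april",    "GF3",
             "december", "GF1",
             "april",    "GF2",
             "april",    "GF3",
             "december", "GF1")

# distinct() keeps the first occurrence of each combination, so the row
# ids follow the order of first appearance
keys <- d %>%
  distinct(date, station) %>%
  rowid_to_column(var = "Label")

# Join the ids back onto every row and move Label to the end
result <- d %>%
  left_join(keys, by = c("date", "station")) %>%
  relocate(Label, .after = station)
```

left_join() keeps the rows of d in their original order, so result lines up row by row with the input.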
MKR