I have a dataset that looks like this:

ID  Type
1   "Basketball"
1   "Baseball"
2   "Basketball"
2   "Football"
3   "Boxing"
4   "Boxing"
4   "Wrestling"
4   "Handball"
4   "Hockey"

I would like to create a dataset that numbers the rows within each ID, like this:

ID  Type          observation
1   "Basketball"  1
1   "Baseball"    2
2   "Basketball"  1
2   "Football"    2
3   "Boxing"      1
4   "Boxing"      1
4   "Wrestling"   2
4   "Handball"    3
4   "Hockey"      4

I'm stuck at this point. This is what I tried:

 df %>% 
 group_by(ID) %>%
 1:nrow(df)
user35131

2 Answers

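We can number the rows within each ID. One option is match() against the unique values of Type (a row_number() version was first posted here before the edit; its only issue was that the grouping included Type, and it had not been tested at the time):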

library(dplyr)

df %>%
    group_by(ID) %>%
    # position of each Type among the distinct Types of its ID
    mutate(observation = match(Type, unique(Type))) %>%
    ungroup()

Output:

# A tibble: 9 x 3
#     ID Type       observation
#  <int> <chr>            <int>
#1     1 Basketball           1
#2     1 Baseball             2
#3     2 Basketball           1
#4     2 Football             2
#5     3 Boxing               1
#6     4 Boxing               1
#7     4 Wrestling            2
#8     4 Handball             3
#9     4 Hockey               4

Or use factor():

df %>%
    group_by(ID) %>%
    mutate(observation = as.integer(factor(Type, levels = unique(Type)))) %>%
    ungroup()
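One caveat worth spelling out: the match() and factor() versions number the distinct Type values inside each ID rather than the rows themselves, so they agree with a plain row counter only when no Type repeats within an ID (which is the case in the sample data). A minimal base R sketch of the difference, where the vector `type` is made up for illustration and stands in for the Type values of a single ID:

```r
# Hypothetical Type values for one ID, with "Boxing" appearing twice
type <- c("Boxing", "Wrestling", "Boxing")

# Value-based numbering: a repeated value gets the same number again
by_value  <- match(type, unique(type))
by_factor <- as.integer(factor(type, levels = unique(type)))

# Position-based numbering: every row gets its own number
by_position <- seq_along(type)

print(by_value)     # 1 2 1
print(by_factor)    # 1 2 1
print(by_position)  # 1 2 3
```

If a strict per-row counter is needed even with repeated values, the position-based forms are probably the safer choice.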

Or with 1:n():

df %>%
    group_by(ID) %>%
    mutate(observation = 1:n())

Or using base R

df$observation <- with(df, ave(seq_along(ID), ID, FUN = seq_along))
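For anyone unfamiliar with ave(): it splits its first argument by the grouping variable, applies FUN to each piece, and returns the results in the original row order, so seq_along restarts at 1 for every ID. A minimal sketch on a bare vector, where the name `id` is only for illustration and holds the same values as df$ID:

```r
# Same ID values as in the sample data (illustrative standalone vector)
id <- c(1L, 1L, 2L, 2L, 3L, 4L, 4L, 4L, 4L)

# seq_along(id) is only a placeholder vector of the right length;
# within each ID group it is replaced by 1, 2, ..., group size
observation <- ave(seq_along(id), id, FUN = seq_along)

print(observation)  # 1 2 1 2 1 1 2 3 4
```

Because ave() keeps the original order, this should also work when the rows of one ID are not adjacent.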

Data:

df <- structure(list(
    ID = c(1L, 1L, 2L, 2L, 3L, 4L, 4L, 4L, 4L),
    Type = c("Basketball", "Baseball", "Basketball", "Football", "Boxing",
             "Boxing", "Wrestling", "Handball", "Hockey")),
    class = "data.frame", row.names = c(NA, -9L))
akrun

You can add group row numbers with row_number().

df %>% 
  group_by(ID) %>%
  mutate(observation = row_number())

mutate is the general dplyr function for adding or modifying columns.

Gregor Thomas