I have a data frame with two columns, id and result, and I want to assign factor levels to result depending on id, so that for id "1", result c("a","b","c","d") will have factor levels 1,2,3,4, and for id "2", result c("22","23","24") will have factor levels 1,2,3.

id <- c(1,1,1,1,2,2,2)
result <- c("a","b","c","d","22","23","24")

I tried to group them with split, but that returns a list instead of a data frame, which causes a length problem when modeling. Can you help, please?

  • Assuming you have dataframe `df <- data.frame(id, result)`, using `dplyr`, you can do `df %>% group_by(id) %>% mutate(row = row_number())` – Ronak Shah Jan 20 '20 at 12:24
  • @RonakShah I don't believe this is a dupe of that one; you still have to convert the new column to a factor: `df %>% etc %>% mutate(fac = factor(row)) %>% select(-row)`, where `etc` is your code. – Rui Barradas Jan 20 '20 at 12:26
  • @RuiBarradas-ReinstateMonic Sure, feel free to reopen if you disagree. – Ronak Shah Jan 20 '20 at 12:29

2 Answers

Though the question was closed as a duplicate by user @Ronak Shah, I don't believe it is the same question.

After numbering the rows within each group, the new column must be coerced to class "factor".

library(dplyr)

id <- c(1,1,1,1,2,2,2)
result <- c("a","b","c","d","22","23","24")

df <- data.frame(id, result)

df %>%
  group_by(id) %>%
  mutate(fac = row_number()) %>%
  ungroup() %>%
  mutate(fac = factor(fac))
# A tibble: 7 x 3
#     id result fac  
#  <dbl> <fct>  <fct>
#1     1 a      1    
#2     1 b      2    
#3     1 c      3    
#4     1 d      4    
#5     2 22     1    
#6     2 23     2    
#7     2 24     3    
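
For readers without dplyr, here is a minimal base R sketch of the same idea. It assumes the same id and result vectors as above and uses ave() as a swapped-in alternative to group_by()/row_number(), so treat it as an illustration rather than the method of this answer. It also avoids the split() problem from the question, because ave() returns a vector aligned with the original rows instead of a list.

```r
# Same example data as in the question
id <- c(1, 1, 1, 1, 2, 2, 2)
result <- c("a", "b", "c", "d", "22", "23", "24")
df <- data.frame(id, result)

# ave() applies seq_along() within each id group, so the counter
# restarts at 1 for every id; factor() then coerces it to class "factor"
df$fac <- factor(ave(seq_along(df$id), df$id, FUN = seq_along))

levels(df$fac)
# [1] "1" "2" "3" "4"
as.integer(df$fac)
# [1] 1 2 3 4 1 2 3
```

The result is still a data frame with one row per observation, so it can be passed to a modeling function directly.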

Edit.

If there are repeated values in result, convert result to a factor within each group and take as.integer() of it to get integer codes (equal values share a code), then coerce those codes to factor.

id2 <- c(1,1,1,1,2,2,2,2)
result2 <- c("a","b","c","d","22", "22","23","24")

df2 <- data.frame(id = id2, result = result2)

df2 %>%
  group_by(id) %>%
  mutate(fac = as.integer(factor(result))) %>%
  ungroup() %>%
  mutate(fac = factor(fac))
# A tibble: 8 x 3
#     id result fac  
#  <dbl> <fct>  <fct>
#1     1 a      1    
#2     1 b      2    
#3     1 c      3    
#4     1 d      4    
#5     2 22     1    
#6     2 22     1    
#7     2 23     2    
#8     2 24     3    
  • Thank you so much! It worked :D How can I assign factor levels to unique values of result? For example, if result <- c("22","22","23","24"), it should have factor levels 1,1,2,3 instead of 1,2,3,4. – y.eska Jan 20 '20 at 12:54
  • @y.eska This means that the new column will have for `id == 1, result == "a"` the same factor level (value) as `id == 2, result == "22"`. – Rui Barradas Jan 20 '20 at 12:57

After grouping by id, we can use match() with unique() to assign a number to each distinct value of result, in order of first appearance. Using @Rui Barradas' data frame df2:

library(dplyr)

df2 %>%
  group_by(id) %>%
  mutate(ans = match(result, unique(result))) %>%
  ungroup() %>%
  mutate(ans = factor(ans))

#     id result ans  
#  <dbl> <fct>  <fct>
#1     1 a      1    
#2     1 b      2    
#3     1 c      3    
#4     1 d      4    
#5     2 22     1    
#6     2 22     1    
#7     2 23     2    
#8     2 24     3    
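
On df2, match() with unique() gives the same numbers as as.integer(factor()) because the values already appear in sorted order within each id. The two can differ on unsorted data. A small sketch on a made-up vector x (not part of the question's data) shows the difference:

```r
# Hypothetical unsorted values, for illustration only
x <- c("b", "a", "b", "c")

# factor() sorts the distinct values to build its levels,
# so the integer codes follow sorted order
as.integer(factor(x))
# [1] 2 1 2 3

# match() against unique() numbers the values in order of first appearance
match(x, unique(x))
# [1] 1 2 1 3
```

Which one to use depends on whether the level numbers should follow sorted order or the order in which values occur in the data.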