R dplyr pivot wider with duplicates and generate variable names

Question

How can I go from

df<-data.frame(id=c("A", "B", "B"), res=c("one", "two", "three"))
df

to

df.output<-data.frame(id=c("A", "B"), res1=c("one", "two"), res2=c(NA, "three"))
df.output

with dplyr?

I do not know a priori the number of duplicates in id (in this example B has 2 occurrences), so the number of resX variables in the output data frame has to be generated on the fly.

score 3 · Accepted Answer · answered Mar 07 '22 at 10:17

You just need to create a row identifier, which you can do with dplyr and then use tidyr::pivot_wider() to generate all your resX variables.

library(dplyr)
library(tidyr)

df %>%
  group_by(id) %>%
  mutate(
    no = row_number()
  ) %>%
  ungroup() %>%
  pivot_wider(
    id,
    names_from = no,
    names_prefix = "res",
    values_from = res
  )
#> # A tibble: 2 × 3
#>   id    res1  res2 
#>   <chr> <chr> <chr>
#> 1 A     one   <NA> 
#> 2 B     two   three

score 3 · Answer 2 · answered Mar 07 '22 at 10:19

3

Using data.table::dcast:

library(data.table)
dcast(setDT(df), id ~ rowid(id, prefix = "res"), value.var = "res")

   id res1  res2
1:  A  one  <NA>
2:  B  two three

answered Mar 07 '22 at 10:19

Maël

45,206
3
29
67

score 3 · Answer 3 · answered Mar 07 '22 at 10:20

3

A base R option with reshape + ave

reshape(
  transform(df, q = ave(id, id, FUN = seq_along)),
  direction = "wide",
  idvar = "id",
  timevar = "q"
)

gives

  id res.1 res.2
1  A   one  <NA>
2  B   two three

answered Mar 07 '22 at 10:20

ThomasIsCoding

96,636
9
24
81

score 2 · Answer 4 · answered Mar 07 '22 at 10:52

A way with pivot_wider. Before we have to wrangle the data:

library(dplyr)
library(tidyr)

df %>% 
  group_by(id) %>% 
  mutate(names = paste0("res", row_number())) %>% 
  pivot_wider(
    names_from = names,
    values_from = res,
  )

  id    res1  res2 
  <chr> <chr> <chr>
1 A     one   NA   
2 B     two   three

R dplyr pivot wider with duplicates and generate variable names

4 Answers4