0

In my dataframe there are multiple rows for a single observation (each referenced by ref). I would like to collapse the rows and create new columns for the keyword column. The outcome would include as many keyowrd colums as the number of rows for an observation (e.g. keyword_1, keyword_2, etc). Do you have any idea? Thanks a lot.

This is my MWE

df1 <- structure(list(rif = c("text10", "text10", "text10", "text11", "text11"), 
                      date = c("20180329", "20180329", "20180329", "20180329", "20180329"), 
                      keyword = c("Lucca", "Piacenza", "Milano", "Cascina", "Padova")), 
                 row.names = c(NA, 5L), class = "data.frame")

Alberto
  • 41
  • 6
  • Maybe you want something like `unstack(df1[c(3,1)])` or `aggregate(keyword ~., df1, list)` ? – GKi Jun 14 '21 at 12:59

1 Answers1

1

Does this work:

library(dplyr)
library(tidyr)
df1 %>% group_by(rif,date) %>% mutate(n = row_number()) %>% pivot_wider(id_cols = c(rif,date), values_from = keyword, names_from = n, names_prefix = 'keyword')
# A tibble: 2 x 5
# Groups:   rif, date [2]
  rif    date     keyword1 keyword2 keyword3
  <chr>  <chr>    <chr>    <chr>    <chr>   
1 text10 20180329 Lucca    Piacenza Milano  
2 text11 20180329 Cascina  Padova   NA      
Karthik S
  • 11,348
  • 2
  • 11
  • 25
  • Thank you for your help! It works perfectly; in my larger datsets do I just add all the other variables in `id_cols`? – Alberto Jun 14 '21 at 13:01
  • 1
    @Edoardo, it's for uniquely identifying rows. If it works for your larger dataset, then go ahead. – Karthik S Jun 14 '21 at 13:05