I am trying to solve this seemingly trivial problem in R, but I can't quite figure it out. I have a data frame with only two columns:

> dput(rrr)
structure(list(row = c(1, 2, 3, 4, 1, 2, 4, 6, 1, 3, 4, 5, 1, 
4, 2, 3, 6, 7, 5, 3, 5, 4, 7, 8, 2, 6, 4, 7, 10, 4, 7, 5, 6, 
10, 9, 8, 5, 8, 7, 9, 7, 9, 8, 10, 6, 10, 7, 9), col = c("1", 
"1", "1", "1", "2", "2", "2", "2", "3", "3", "3", "3", "4", "4", 
"4", "4", "4", "4", "4", "5", "5", "5", "5", "5", "6", "6", "6", 
"6", "6", "7", "7", "7", "7", "7", "7", "7", "8", "8", "8", "8", 
"9", "9", "9", "9", "10", "10", "10", "10")), row.names = c(1L, 
2L, 4L, 6L, 16L, 17L, 19L, 21L, 31L, 32L, 34L, 36L, 46L, 47L, 
48L, 50L, 53L, 55L, 57L, 58L, 59L, 60L, 63L, 65L, 70L, 71L, 72L, 
75L, 77L, 82L, 83L, 84L, 86L, 89L, 91L, 93L, 94L, 95L, 96L, 99L, 
109L, 110L, 111L, 115L, 125L, 126L, 127L, 129L), class = "data.frame")

Ideally, I want to go from this: [image]

To this:

[image]

Where the numbers in col that match each value in row are written next to it in a new column. Ideally, I don't want to write the number in a new column when it is the same as the value in row, as you can see in the example. I have tried to use reshape, but I am lost because I only have two columns, and most examples I have seen have three. I hope someone can help me. Thanks!

1 Answer

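We create a sequence column that numbers the occurrences of each value in 'row', and then pivot to wide format using that column for the new column names: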

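library(dplyr)
library(tidyr)
library(data.table)  # provides rowid()
rrr %>% 
    # rn counts the occurrences of each 'row' value: 1, 2, 3, ...
    mutate(rn = rowid(row)) %>%
    # spread 'col' across new columns named after rn
    pivot_wider(names_from = rn, values_from = col)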
# A tibble: 10 x 8
#     row `1`   `2`   `3`   `4`   `5`   `6`   `7`  
#   <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
# 1     1 1     2     3     4     <NA>  <NA>  <NA> 
# 2     2 1     2     4     6     <NA>  <NA>  <NA> 
# 3     3 1     3     4     5     <NA>  <NA>  <NA> 
# 4     4 1     2     3     4     5     6     7    
# ...
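
For intuition about the sequence column: `rowid(row)` numbers the occurrences of each value in order of appearance. Below is a minimal base-R sketch of the same idea, using `ave()` on a made-up vector (the values are illustrative, not taken from `rrr`):

```r
# Toy vector standing in for rrr$row (hypothetical values, not the real data)
row <- c(1, 2, 1, 3, 2, 1)

# Number each occurrence within its group:
# the first 1 gets 1, the second 1 gets 2, and so on
rn <- ave(seq_along(row), row, FUN = seq_along)
rn
```

Each distinct value of `rn` then becomes one column after the pivot, so the output needs as many new columns as the most frequent `row` value has occurrences.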

Or with data.table

dcast(setDT(rrr), row ~ rowid(row), value.var = 'col')
akrun
  • Perfect! Thanks a lot, akrun. You are always super helpful! – Amaranta_Remedios Sep 16 '20 at 22:32
  • To avoid writing a number next to itself, I first filtered those repetitions out of my data. In case someone has the same question: `a <- subset(rrr, row != col)`, and then I followed akrun's reply. – Amaranta_Remedios Sep 16 '20 at 22:44
  • @Amaranta_Remedios Here, `unique(rrr)` gives the same number of rows as `nrow(rrr)`, but you want the output to have one row per unique element of 'row'. That is why I created the sequence based on that column, so that the newly created values become the column names of the pivoted output. – akrun Sep 16 '20 at 22:48
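
The filtering step from the comments can be combined with the pivot in one pipeline. Below is a hedged sketch on a small made-up data frame (`toy` and its values are hypothetical stand-ins for `rrr`). It assumes, as in the question, that `row` is numeric and `col` is character, so `row != col` compares them after R coerces `row` to character:

```r
library(dplyr)
library(tidyr)
library(data.table)  # provides rowid()

# Small stand-in for rrr (hypothetical values): row is numeric, col is character
toy <- data.frame(row = c(1, 2, 1, 2, 3),
                  col = c("1", "1", "2", "2", "3"))

wide <- toy %>%
    filter(row != col) %>%        # drop self-matches such as row 1 / col "1"
    mutate(rn = rowid(row)) %>%   # number the occurrences after filtering
    pivot_wider(names_from = rn, values_from = col)
wide
```

Note that a `row` value whose only match is itself (3 in this toy data) disappears from the result entirely, because every one of its rows is filtered out before the pivot.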