0

I have a large table with unique characteristics that occur on multiple IDs (table A). Is there a clever workaround in which I could horizontally consolidate the values so that in the second table B I have unique IDs in the rows and in the columns occurring characteristics (which also occur in different numbers per ID)? The fields for missing features in an ID row I want to fill with NA. Since I have a maximum of 22 unique characteristics per ID, the maximum number of columns should be 23 (with ID).

With the loop it works, but it takes forever.

I tried all solutions from How to reshape data from long to wide format without success.

E.g., for reshape, cast, dcast, and other functions the vector is too large giving: Error: cannot allocate vector of size ...

Vertical Table Horizontal Table

user16217248
  • 3,119
  • 19
  • 19
  • 37
freeflight
  • 29
  • 4

1 Answers1

1

If you create a new column in Table A then you can use pivot_wider quite easily:

library(tidyverse)

table_a <- tibble(
  id = c(1, 1, 2, 2, 2, 2, 3, 3, 3), 
  feature = c("df", "ftv", "ed", "wed", "rfc", "dtb", "bes", "xrd", "yws")
)

table_b <- table_a %>%
  group_by(id) %>%
  mutate(feature_name = paste0("feature", row_number())) %>%
  pivot_wider(names_from = feature_name, values_from = feature)
  
table_b
# A tibble: 3 × 5
# Groups:   id [3]
     id feature1 feature2 feature3 feature4
  <dbl> <chr>    <chr>    <chr>    <chr>   
1     1 df       ftv      NA       NA      
2     2 ed       wed      rfc      dtb     
3     3 bes      xrd      yws      NA     
Harrison Jones
  • 2,256
  • 5
  • 27
  • 34