How to efficiently transpose data frames with the tidyverse or data.table?

Question

I have several files ending in .var and I want to combine them.

For this, I used the purrr package:

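library(purrr)
library(readr)

filelist = list.files(pattern = "\\.var$") # pattern is a regex: file names ending in .var
df = filelist %>%
  set_names() %>% # name each element by its file name so .id can pick it up
  map_dfr(
    ~ read_csv(.x, col_types = cols(), col_names = FALSE),
    .id = "file_name"
  )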

That seems to give me the desired output:

# A tibble: 6 x 3
  file_name X1           X2   
  <chr>     <chr>        <chr>
1 CV.var    Chrom_3_793  T    
2 CV.var    Chrom_3_4061 G    
3 CV.var    Chrom_3_4034 G    
4 CV.var    Chrom_3_4035 A    
5 GK.var    Chrom_3_4061 T    
6 CV.var    Chrom_3_4064 T

Now I would like to transform this table into a table of boolean values. Basically, I want the values of column 1 (the file names, 4 in total) to become columns, with X1 and X2 kept as the first two columns. That way I could tell whether, say, Chrom_3_4061 T is in 1, 2, 3, or 4 of my sets. For example:

X1           X2 CV.var GK.var DP.var SK.var
Chrom_3_4061 G  1      0      1      1

That should be a matter of transposing and cutting and pasting. What is the most efficient way of doing it? I feel a bit lost with the different packages and approaches.

Thanks a lot.

Nothing here suggests use of the `data.table` package (other than the [tag:data.table] *tag*), and all of your code uses tidyverse functions. Can you clarify, please? — r2evans, Jul 21 '21 at 13:10
I could use any solutions, I am not tied to tidyverse or data.table. — Axzd, Jul 21 '21 at 13:15
See the linked post, it has many solutions including tidy and datatable. — zx8754, Jul 21 '21 at 13:45

score 3 · Answer 1 · answered Jul 21 '21 at 13:42

You could use pivot_wider:

library(dplyr) # for mutate() and %>%
library(tidyr)

df %>% 
  mutate(value = TRUE) %>% 
  pivot_wider(names_from = file_name, values_from = value, values_fill = FALSE)

I filled it with booleans instead of 0 and 1.
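
If 0/1 columns are preferred, or a data.table route is wanted (the question mentions both), the same reshape can be sketched as follows. This is only an illustration: the toy `df` is rebuilt by hand from the printout in the question, the names `wide_tidy` and `wide_dt` are made up here, and the `distinct()` / `unique()` calls assume that a repeated (file, X1, X2) row should count only once.

```r
library(dplyr)
library(tidyr)
library(data.table)

# Toy data copied from the tibble printed in the question (assumption: real data has the same shape)
df <- data.frame(
  file_name = c("CV.var", "CV.var", "CV.var", "CV.var", "GK.var", "CV.var"),
  X1 = c("Chrom_3_793", "Chrom_3_4061", "Chrom_3_4034",
         "Chrom_3_4035", "Chrom_3_4061", "Chrom_3_4064"),
  X2 = c("T", "G", "G", "A", "T", "T"),
  stringsAsFactors = FALSE
)

# tidyr: 1 where the (X1, X2) pair occurs in a file, 0 elsewhere
wide_tidy <- df %>%
  distinct() %>%                      # drop exact duplicate rows first
  mutate(value = 1L) %>%
  pivot_wider(names_from = file_name, values_from = value, values_fill = 0L)

# data.table: count rows per (X1, X2) and file; after unique() the counts are 0 or 1
dt <- as.data.table(df)
wide_dt <- dcast(unique(dt), X1 + X2 ~ file_name,
                 fun.aggregate = length, value.var = "file_name")

print(wide_tidy)
print(wide_dt)
```

Both results keep X1 and X2 as the first two columns and add one column per file. Files that contribute no rows at all (DP.var and SK.var in the toy data) will not appear as columns.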

Martin Gal