
I have several files ending in *.var and I want to combine them.

For this, I used the package purrr:

library(purrr)
library(readr)

filelist = list.files(pattern = "\\.var$") # make the file list (pattern is a regex, not a glob)
df = filelist %>%
  set_names() %>% 
  map_dfr(
    ~ read_csv(.x, col_types = cols(), col_names = FALSE),
    .id = "file_name"
  )

That seems to give me the desired output:

# A tibble: 6 x 3
  file_name X1           X2   
  <chr>     <chr>        <chr>
1 CV.var    Chrom_3_793  T    
2 CV.var    Chrom_3_4061 G    
3 CV.var    Chrom_3_4034 G    
4 CV.var    Chrom_3_4035 A    
5 GK.var    Chrom_3_4061 T    
6 CV.var    Chrom_3_4064 T  

But now I would like to transform this table into a table of boolean values. Basically, I want the values of column 1 (file_name, 4 distinct values in total) to become column names, with X1 and X2 as the first two columns, so that I can tell whether Chrom_3_4061 T, say, is in 1, 2, 3, or 4 of my sets. For example:

            CV.var GK.var DP.var SK.var  
Chrom_3_4061 G 1       0     1       1

That should be a matter of transposing and some cutting and pasting. What is the most efficient way of doing it? I feel a bit lost with the different packages and approaches.

Thanks a lot.

Axzd
  • Nothing here suggests use of the `data.table` package (other than the [tag:data.table] *tag*), and all of your code uses tidyverse functions. Can you clarify, please? – r2evans Jul 21 '21 at 13:10
  • I could use any solution; I am not tied to tidyverse or data.table. – Axzd Jul 21 '21 at 13:15
  • See the linked post, it has many solutions including tidy and datatable. – zx8754 Jul 21 '21 at 13:45

1 Answer


You could use pivot_wider:

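library(dplyr)  # mutate() and %>%
library(tidyr)  # pivot_wider()

df %>% 
  mutate(value = TRUE) %>%  # mark every observed (file, X1, X2) row
  pivot_wider(names_from = file_name, values_fill = FALSE)  # absent combinations become FALSE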

I filled it with booleans instead of 0 and 1.
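
If you prefer the 0/1 layout from the question, the same idea should work with integers. The following is only a sketch: the toy `df` is copied from the tibble printed in the question, `distinct()` is an added assumption to guard against a variant being listed twice in one file, and `wide` and `n_sets` are names made up for illustration.

```r
library(dplyr)
library(tidyr)

# Toy data copied from the tibble printed in the question
df <- tibble::tribble(
  ~file_name, ~X1,            ~X2,
  "CV.var",   "Chrom_3_793",  "T",
  "CV.var",   "Chrom_3_4061", "G",
  "CV.var",   "Chrom_3_4034", "G",
  "CV.var",   "Chrom_3_4035", "A",
  "GK.var",   "Chrom_3_4061", "T",
  "CV.var",   "Chrom_3_4064", "T"
)

wide <- df %>%
  distinct() %>%            # assumption: drop repeated rows within a file
  mutate(value = 1L) %>%    # 1 = this (X1, X2) pair occurs in the file
  pivot_wider(
    names_from  = file_name,
    values_from = value,
    values_fill = 0L        # 0 = pair absent from that file
  )

# Number of sets each (X1, X2) pair occurs in: sum over the file columns
n_sets <- rowSums(wide[setdiff(names(wide), c("X1", "X2"))])
```

With only the six rows shown in the question, `wide` has just the CV.var and GK.var columns; the other two files would add their own columns once their rows are in `df`.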

Martin Gal