
I have N .tsv files saved in a folder named "data" in my RStudio working directory, and I want to import them all at once as separate data frames. Below is how I read them one by one, but there are too many files for that and I want something faster. The number of files may also differ each time.

#read files into R
f1<-read.table(file = 'a_CompositeSources/In1B1A_WDNdb_DrugTargetInteractions_CompositeDBs_Adhesion.tsv', sep = '\t', header = TRUE)
f2<-read.table(file = 'a_CompositeSources/In1B2A_WDNdb_DrugTargetInteractions_CompositeDBs_Cytochrome.tsv', sep = '\t', header = TRUE)

I have tried:

library(readr)
library(dplyr)
files <- list.files(path = "C:/Users/user/Documents/kate/data", pattern = "*.tsv", full.names = T)
tbl <- sapply(files, read_tsv, simplify=FALSE) %>% 
  bind_rows(.id = "id") 


##List the .tsv files in the data folder
filenames <- list.files(path="C:/Users/user/Documents/kate/data",
                        pattern="*.tsv")

##Create list of data frame names without the ".tsv" part
names <-gsub(".tsv", "", filenames)

###Load all files
for(i in names){
  filepath <- file.path("C:/Users/user/Documents/kate/data",paste(i,".tsv",sep=""))
  assign(i, read.delim(filepath,
                       colClasses=c("factor","character",rep("numeric",2)),
                       sep = "\t"))
}

But only the first file is read.

firmo23

1 Answer


If you have all the .tsv files in one folder, you can read them into a list using lapply or a for loop:

files_to_read <- list.files(path = "a_CompositeSources/", pattern = "\\.tsv$", full.names = TRUE)
all_files <- lapply(files_to_read,function(x) {
   read.table(file = x, 
              sep = '\t', 
              header = TRUE)
})
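The for-loop alternative mentioned above might look like the sketch below. The `data_dir` folder and the two tiny demo files are made up here so the snippet runs on its own; in practice `data_dir` would be your own folder, given either as an absolute path or relative to the working directory.

```r
# Assumption: a throwaway demo folder with two tiny .tsv files, so the sketch runs as-is
data_dir <- file.path(tempdir(), "tsv_loop_demo")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("drug\tscore", "a\t1", "b\t2"), file.path(data_dir, "first.tsv"))
writeLines(c("drug\tscore", "c\t3"), file.path(data_dir, "second.tsv"))

# Find every .tsv file in the folder, however many there are
files_to_read <- list.files(path = data_dir, pattern = "\\.tsv$", full.names = TRUE)

# Pre-allocate the list, then fill one element per file
all_files <- vector("list", length(files_to_read))
for (i in seq_along(files_to_read)) {
  all_files[[i]] <- read.table(file = files_to_read[i], sep = "\t", header = TRUE)
}

length(all_files)  # one data frame per file found
```

The loop and the `lapply` call produce the same list; `lapply` is just the shorter way to write it.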

If you need to reference the data frames by file name, you can set names(all_files) <- files_to_read. You can then combine them into one data frame using bind_rows from the dplyr package, or simply work with the list of data frames.
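As a rough sketch of the naming step, the list elements can be named after the files with the folder and extension stripped, and then (optionally) copied into the global environment as separate data frames. The demo folder and the file names `Adhesion.tsv` and `Cytochrome.tsv` are placeholders invented for this example, and the `list2env` step is one possible way to get separate objects, not the only one.

```r
# Assumption: a throwaway demo folder with two tiny .tsv files, so the sketch runs as-is
data_dir <- file.path(tempdir(), "tsv_names_demo")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("drug\tscore", "a\t1", "b\t2"), file.path(data_dir, "Adhesion.tsv"))
writeLines(c("drug\tscore", "c\t3"), file.path(data_dir, "Cytochrome.tsv"))

files_to_read <- list.files(path = data_dir, pattern = "\\.tsv$", full.names = TRUE)
all_files <- lapply(files_to_read, read.table, sep = "\t", header = TRUE)

# Name each list element after its file, dropping the folder and the ".tsv" extension
names(all_files) <- tools::file_path_sans_ext(basename(files_to_read))

# Look up one data frame by name
all_files[["Adhesion"]]

# Optional: copy every element into the global environment as its own data frame
invisible(list2env(all_files, envir = globalenv()))
```

Once the list has names, `bind_rows(all_files, .id = "id")` uses them to fill the `id` column, so each row records which file it came from. Keeping the data frames in the list is usually easier to work with than many separate objects when the number of files changes from run to run.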

jludewig
  • How is the folder "data", which contains the files, specified here? Also, the names will probably be totally different. I was expecting something like a for loop that would read every file from the data folder. – firmo23 Jul 28 '19 at 20:06
  • Not sure I understand you correctly, but the folder with the .tsv files should be specified as the `path` argument in the `list.files` call (either an absolute path or one relative to your current working directory). Regarding the file names: with `names(all_files) <- files_to_read`, as mentioned in my answer, you should be able to set the names of the list to the file names (you may want to tidy up the names a bit by removing the path prefix). – jludewig Jul 29 '19 at 08:19