
I have 200 TSV files to import.

I use this code:

    list_file <- list.files(pattern = "*.tsv")
    read.all <- lapply(list_file, read.table, header = TRUE, sep = "\t", na.strings = c("", " "), fill = TRUE)

It gives the warning "EOF within quoted string".

At first I thought it was only a warning, so I moved on, and the structure of the data looked fine. But when I checked the number of rows in the last few list elements, I found that not all rows had been imported from those TSV files. There seems to be nothing wrong with my data: when I imported, on its own, one of the files that `lapply` had imported incompletely, it loaded successfully without losing any information.

Since I can't tell which files were not imported properly, I can't trust that I have all the information I need. Can anyone suggest what to do?

Perhaps there is a method that is slower but does not produce errors? Many thanks.

  • 1
    If you get the error "EOF within quoted string to import" then you probably have a mismatched quote in your file making the file invalid. Check for improperly formatted data in your input files. It seems unlikely this would be about speed. It's hard to say what's going on without a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). It would be very unlikely that read.table would perform differently in `lapply` vs calling it directly. – MrFlick Mar 14 '19 at 21:44
  • @MrFlick Thanks. My worry is that I don't know how many files were not imported successfully, and I don't know how to check. Do you know of any command that would monitor the import and report detailed warning information, such as which file has problems, or which line? – drexel star Mar 15 '19 at 03:01
  • `read.table` doesn’t really report that. And files could have embedded newlines, so it’s not easy to tell how many lines an ill-formed file had. If you don’t expect any embedded newlines, then you can use `readLines()` to see how many rows there should be and compare. – MrFlick Mar 15 '19 at 03:04
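
The `readLines()` comparison suggested in the last comment could be sketched roughly as below. This is only an illustration under stated assumptions, not a tested solution for the original data: `check_tsv`, `demo_dir` and the two demo files are hypothetical names made up for this example, and the expected row count is only meaningful if the files contain no embedded newlines, no comment lines, and exactly one header line. The helper reads each file with the same `read.table` arguments as in the question, records any warnings per file instead of letting them scroll past, and compares the number of rows read with the number of non-blank lines.

```r
# Hypothetical helper (not from the original post): read one TSV with the
# question's settings, record any warnings, and compare row counts.
check_tsv <- function(path) {
  msgs <- character(0)
  data <- withCallingHandlers(
    read.table(path, header = TRUE, sep = "\t",
               na.strings = c("", " "), fill = TRUE),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))  # remember the message
      invokeRestart("muffleWarning")         # then carry on reading
    }
  )
  # Non-blank physical lines minus the header line.
  # Assumption: no embedded newlines and no comment lines in the file.
  lines <- readLines(path, warn = FALSE)
  expected <- sum(nzchar(lines)) - 1L
  list(
    data = data,
    report = data.frame(
      file = basename(path),
      rows_read = nrow(data),
      rows_expected = expected,
      warnings = paste(unique(msgs), collapse = "; "),
      stringsAsFactors = FALSE
    )
  )
}

# Demo data in a temporary folder: one clean file, one with an unclosed quote.
demo_dir <- tempfile("tsv_demo")
dir.create(demo_dir)
writeLines(c("id\tname", "1\talpha", "2\tbeta"),
           file.path(demo_dir, "good.tsv"))
writeLines(c("id\tname", "1\talpha", "2\t\"unclosed", "3\tgamma", "4\tdelta"),
           file.path(demo_dir, "bad.tsv"))

list_file <- list.files(demo_dir, pattern = "\\.tsv$", full.names = TRUE)
results <- lapply(list_file, check_tsv)
report <- do.call(rbind, lapply(results, `[[`, "report"))

# Files that need a closer look: row-count mismatch or any recorded warning.
suspect <- report[report$rows_read != report$rows_expected |
                    nzchar(report$warnings), ]
print(suspect)
```

To try this on real files, point `list.files()` at the folder holding the 200 TSV files instead of `demo_dir`; the imported data frames are then available as `lapply(results, `[[`, "data")`. If the quote characters in the data turn out to be literal text rather than field delimiters, passing `quote = ""` to `read.table` turns off quote handling altogether, which may be worth trying on the files that show up in `suspect`.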
