
I need to read a table that is a .tsv file in R.

[Image: picture of the data, not included]

test <- read.table(file='drug_info.tsv')
# Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
#   line 1 did not have 10 elements
test <- read.table(file='drug_info.tsv', )
# Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
#   line 1 did not have 10 elements
scan("drug_info.tsv")
# Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
#   scan() expected 'a real', got 'ChallengeName'
scan(file = "drug_info.tsv")
# Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
#   scan() expected 'a real', got 'ChallengeName'

How should I read it?

Prradep
Andrew Voronkov
  • Please copy/paste the first 5 rows of the file into your question and remove the picture. – Rich Scriven Oct 24 '15 at 19:53
  • pretty much `read.delim` with the default settings – rawr Oct 24 '15 at 20:15
  • `read.table` defaults to whitespace-delimited input (generally meaning spaces or tabs). If your fields contain spaces, you can explicitly set the delimiter to tab with `sep="\t"`. `read.table` works with valid input files, so if there is a problem importing your data, it lies with the file, not the function. To help you, we'd need a sample of the file you are actually trying to import, not a picture of the data in some other program. – MrFlick Oct 24 '15 at 20:23

6 Answers


This should do it:

read.table(file = 'drug_info.tsv', sep = '\t', header = TRUE)
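
To illustrate why the explicit separator matters, here is a minimal self-contained sketch. The file contents are hypothetical (invented column names and values, not the asker's data): the second field contains a space, which the default whitespace splitting of `read.table` would break into two fields.

```r
# Hypothetical sample data written to a temporary file.
# The second field of every line contains a space.
tmp <- tempfile(fileext = ".tsv")
writeLines(c("ChallengeName\tDrug Name\tTarget",
             "AZ_A\tDrug one\tEGFR",
             "AZ_B\tDrug two\tBRAF"), tmp)

# With an explicit tab separator, each line splits into exactly three fields.
test <- read.table(file = tmp, sep = "\t", header = TRUE)
dim(test)
```

With `sep = "\t"` the spaces stay inside their fields, so the header and the data rows agree on the number of columns.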
Robert

Using fread from the data.table package reads the data and avoids the error you are getting with read.table, because fread detects the separator automatically.

library(data.table)

data <- as.data.frame(fread("drug_info.tsv"))
Pang
TBhavnani

You can treat the data like a csv and specify tab as the delimiter.

read.csv("drug_info.tsv", sep = "\t")
Sam Old

Assuming that only the first line has the wrong number of elements, and that it is the line holding the column names, skip the first line:

 d <- read.table('drug_info.tsv', skip=1)

Now read the first line on its own:

 first <- readLines('drug_info.tsv', n=1)

Inspect it, split it into its elements (for example with strsplit(first, '\t')[[1]]), fix it so that its number of elements matches ncol(d), and then

 colnames(d) <- first

If that does not work, you can do

 x <- readLines('drug_info.tsv')  

and run diagnostics like this, which count the tab-separated fields on each line:

 sapply(strsplit(x, '\t'), length)
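
For checking field counts without splitting lines by hand, base R also offers `count.fields()` in utils. A minimal sketch follows, using a hypothetical file (invented contents, with one deliberately malformed row) rather than the asker's data:

```r
# Hypothetical file: the third line carries one field too many.
tmp <- tempfile(fileext = ".tsv")
writeLines(c("id\tname\tdose",
             "1\tdrug a\t10",
             "2\tdrug b\t20\textra",
             "3\tdrug c\t30"), tmp)

# Number of tab-separated fields on each line (quote handling disabled).
n_fields <- count.fields(tmp, sep = "\t", quote = "")
table(n_fields)

# Line numbers whose field count differs from that of the first line.
bad_lines <- which(n_fields != n_fields[1])
bad_lines
```

The lines reported in `bad_lines` are the ones to inspect in a text editor before retrying the import.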
Robert Hijmans

You need to include fill = TRUE.

test <- read.table(file='drug_info.tsv', sep = '\t', header = TRUE, fill = TRUE)
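
As a rough illustration of what `fill = TRUE` does, the sketch below reads a hypothetical file (invented contents, not the asker's data) whose last row is one field short. The missing field is padded rather than raising an error.

```r
# Hypothetical file: the last line has only two of the three fields.
tmp <- tempfile(fileext = ".tsv")
writeLines(c("id\tname\tdose",
             "1\tdrug a\t10",
             "2\tdrug b"), tmp)

# fill = TRUE pads short rows with blank fields, which become NA
# in numeric columns.
test <- read.table(file = tmp, sep = "\t", header = TRUE, fill = TRUE)
is.na(test$dose)
```

Note that padding hides ragged rows instead of fixing them, so it is worth checking afterwards where the NA values ended up.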
woutcault

utils::read.delim() is the most common choice in this case if you don't want to install another package. It defaults to a tab separator and header = TRUE. The code could look like this:

test <- read.delim(file='drug_info.tsv')

Alternatively, the readr package offers friendlier I/O functions, including read_tsv, which reads tab-separated files directly:

test <- readr::read_tsv('drug_info.tsv')
千木郷