16

I have a file like this:

mylist.txt
234984  10354  41175 932711 426928
1693237   13462

Each line of this file has a different number of elements, with a minimum of one element per line. I would like to read it into a list like this:

> print(head(mylist,2))
[[1]]
[1] 234984  10354  41175 932711 426928

[[2]]
[1] 1693237   13462
zx8754
pms
  • Since your example list items show spaces preserved between the numbers, it's not clear if you want each line to be a long string, or a vector of numbers. – J. Win. Jan 30 '11 at 18:02
  • Vector of numbers. I'm not sure why it shows spaces. Anyway, the aL3xa answer seems to work pretty well. – pms Feb 02 '11 at 11:25

3 Answers

20

Assuming that a single space is the delimiter:

fc <- file("mylist.txt")
mylist <- strsplit(readLines(fc), " ")
close(fc)

EDIT:

If the values are delimited by several spaces (and/or in an inconsistent way), you can match the delimiter with a regular expression:

mylist.txt
234984   10354   41175 932711      426928
1693237               13462

fc <- file("mylist.txt")
mylist <- strsplit(readLines(fc), " +")
close(fc)

EDIT #2

And since strsplit returns strings, you need to convert your data to numeric (that's an easy one):

mylist <- lapply(mylist, as.numeric)
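
Putting the steps above together, here is a minimal sketch. It assumes the file may also contain leading or trailing whitespace, which would otherwise produce an empty first field (and an NA after conversion). The temporary file is a stand-in for mylist.txt, and the names path, lines and fields are illustrative only.

```r
# Write sample data to a temporary file (stand-in for mylist.txt);
# the second line has leading spaces on purpose
path <- tempfile(fileext = ".txt")
writeLines(c("234984   10354   41175 932711      426928",
             "  1693237               13462"), path)

lines <- readLines(path)

# trimws() removes leading/trailing whitespace so that strsplit()
# does not return an empty string as the first field
fields <- strsplit(trimws(lines), "[[:space:]]+")

# Convert each character vector to numeric
mylist <- lapply(fields, as.numeric)
print(mylist)
```

Using "[[:space:]]+" instead of " +" also covers tab-separated values, at the cost of a slightly less readable pattern.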
aL3xa
2

A possible answer is to first read the file into a list padded with NAs and then remove them, like this:

l<-as.list( as.data.frame( t(read.table("mylist.txt",fill=TRUE,col.names=1:max(count.fields("mylist.txt"))))) )
l<-lapply(l, function(x) x[!is.na(x)] )

I wonder if there is a simpler way of doing it.
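
One possibly simpler route, sketched here as an alternative rather than a drop-in replacement for the code above, is to parse each line separately with scan(). This assumes every field is numeric and whitespace-separated; the temporary file stands in for mylist.txt, and mylist2 is an illustrative name.

```r
# Write sample data to a temporary file (stand-in for mylist.txt)
path <- tempfile(fileext = ".txt")
writeLines(c("234984  10354  41175 932711 426928",
             "1693237   13462"), path)

# scan(text = ...) parses one line into a numeric vector;
# quiet = TRUE suppresses the "Read N items" message
mylist2 <- lapply(readLines(path),
                  function(line) scan(text = line, quiet = TRUE))
print(mylist2)
```

Because each line is parsed on its own, no NA padding or later removal is needed.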

pms
1

You could simplify the second line by using lapply instead of sapply:

    lapply(l, function(x)x[!is.na(x)])
csgillespie
  • 1. You need it; otherwise read.table takes the number of columns to read from the maximum number of fields in the first 5 lines of the file – pms Jan 30 '11 at 14:21
  • @pms Ahh, my test file had the maximum number of columns in the first line. I've updated my answer. – csgillespie Jan 30 '11 at 14:24
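
The point made in the comments can be checked with a small sketch. The six-line test file below is made up for illustration: its longest line comes after the first five, so read.table cannot see it when guessing the column count.

```r
# Hypothetical test file: the widest line (4 fields) is line 6
path <- tempfile(fileext = ".txt")
writeLines(c("1 2", "3 4", "5 6", "7 8", "9 10", "11 12 13 14"), path)

# Without col.names, the column count is taken from the first 5 lines only
guessed <- read.table(path, fill = TRUE)

# With col.names sized via count.fields(), the whole file is considered
n_max <- max(count.fields(path))
full <- read.table(path, fill = TRUE, col.names = seq_len(n_max))

ncol(guessed)  # 2
ncol(full)     # 4
```

With only two columns available, the extra fields of the last line no longer line up with their row, which is why the col.names argument is kept in the answer above.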