Reading a file from a position in R

Question

I have a large plain-text file to read in R, in which all the data sits on a single line with no spaces (a DNA sequence with no header). I found the following function:

readChar("filename", nchars = n)

which reads only the first n characters of the file and saves a lot of time. Is there another function in R that goes further and reads only from a START position to a STOP position, without loading the whole file?

score 1 · Accepted Answer · answered Oct 21 '20 at 18:12

Basically no. As far as I know, you need to read the file and then discard the characters that you don't want. For example, if you want only the first 10 letters of the line:

substr(readChar("filename", nchars = n), 1, 10)

However, this post (How to efficiently read the first character from each line of a text file?) shows some ways to make that more efficient.
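
The read-then-discard idea above can be sketched as a small helper. This is only an illustration: the name `read_prefix_range` and its `start`/`stop` arguments are hypothetical, not part of base R, and the function still reads everything up to `stop` before trimming.

```r
# Hypothetical helper (not a base R function): read the first `stop`
# characters of the file, then keep only positions start..stop.
read_prefix_range <- function(path, start, stop) {
  stopifnot(start >= 1, stop >= start)
  prefix <- readChar(path, nchars = stop)  # reads from the beginning of the file
  substr(prefix, start, stop)              # discard the unwanted leading part
}

# Small temporary file standing in for the real sequence file.
tmp <- tempfile()
writeChar("ACGTACGTAC", tmp, eos = NULL)   # eos = NULL: no trailing terminator
read_prefix_range(tmp, 3, 6)               # expected: "GTAC"
```

The cost grows with `stop`, so this only helps when the region of interest is near the start of the file.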

answered Oct 21 '20 at 18:12

Ricardo Semião e Castro

Thank you Ricardo, I had not found this post. It was what I was looking for but, unfortunately, it seems that it is not possible to read a file from a position other than the start. Anyway, using readChar instead of scan improves the execution time a lot. On the other hand, I do not see any difference between stri_sub from stringi and substring from base when reading large files. Thanks again! – Tomás Navarro Oct 22 '20 at 10:01

score 1 · Answer 2 · answered Nov 27 '22 at 17:05

You have to create a connection and then use the seek function. Do not forget to close the connection afterwards.

For example, this will read 100 characters starting at byte offset 1000 (seek offsets are zero-based, so reading begins at the 1001st byte).

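cx <- file("filename", "rb")      # open the file in binary read mode
seek(cx, 1000)                    # move the read position to byte offset 1000
d <- readChar(cx, nchars = 100)   # read the next 100 characters
close(cx)                         # release the connection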
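
To answer the START/STOP form of the question directly, the same calls could be wrapped in a helper. This is a minimal sketch under some assumptions: `read_range` is a hypothetical name, positions are 1-based and inclusive, and the file uses a single-byte encoding (true for a plain ASCII DNA sequence), so that character positions and byte offsets coincide.

```r
# Hypothetical helper (not a base R function): read characters
# start..stop (1-based, inclusive) without reading what comes before.
read_range <- function(path, start, stop) {
  stopifnot(start >= 1, stop >= start)
  con <- file(path, "rb")
  on.exit(close(con))                      # close even if an error occurs
  seek(con, where = start - 1, origin = "start")  # seek offsets are zero-based
  # useBytes = TRUE: count nchars in bytes, consistent with the byte offset
  readChar(con, nchars = stop - start + 1, useBytes = TRUE)
}

# Small temporary file standing in for the real sequence file.
tmp <- tempfile()
writeChar("ACGTACGTAC", tmp, eos = NULL)   # eos = NULL: no trailing terminator
read_range(tmp, 3, 6)                      # expected: "GTAC"
```

Using on.exit means the connection is closed on every exit path. Note that the R documentation for seek discourages its use on Windows, so it may be worth checking the result against a known region of the file on that platform.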

answered Nov 27 '22 at 17:05

Sci Prog
