How can I read a vector of lines (not a file) with fwf into a data frame?
Right now, I can think of two ways, but I really feel that there has to be a better way. Any idea is appreciated.
1. Use data.frame() + substring(). It does the job, but I cannot easily generalize it when the data is "ragged" (which it is, in blocks like the one below). I got it from the answer here: Read fixed width text file
2. Use write_lines() and read_fwf() from readr. I'd like to avoid writing an external file. Actually, it seems that read_fwf() should work directly on literal data, but I cannot make it work: it keeps interpreting the string/vector of lines as a path. Something like:

write_lines(literaldata, "fwf_sample.txt")
read_fwf("fwf_sample.txt", fwf_widths(rep(8, 12)))
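For reference, here is a minimal sketch of the first approach (data.frame() + substring()), assuming every field is exactly 8 characters wide. The sample records, the number of fields, and the column names X1 to X4 are made up for illustration and are not my real data:

```r
# Hypothetical fixed-width records: each field is right-padded to 8 characters.
fields <- list(
  c("CHEXA", "278375", "2", "419991"),
  c("CHEXA", "278376", "2", "420028")
)
lines <- vapply(
  fields,
  function(f) paste(sprintf("%8s", f), collapse = ""),
  character(1)
)

width    <- 8  # assumed field width
n_fields <- 4  # assumed number of fields per line

# Start and end positions of each field within a line.
starts <- seq(1, by = width, length.out = n_fields)
ends   <- starts + width - 1

# substring() is vectorised over the lines, so each call yields one column.
cols <- lapply(
  seq_along(starts),
  function(i) trimws(substring(lines, starts[i], ends[i]))
)
names(cols) <- paste0("X", seq_along(cols))

df <- data.frame(cols, stringsAsFactors = FALSE)
print(df)
```

This works while every line has the same layout, but the start and end positions are hard-coded, which is why I cannot easily extend it to ragged blocks.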
A data sample follows below, with the code that leads to the error.
literaldata <- "CHEXA 278375 2 419991 419976 418527 418528 434131 434116+ 420108 420107
CHEXA 278376 2 420028 420029 419994 419997 434168 434169+ 434134 434137
CHEXA 278377 2 419961 418516 418517 419956 434101 420119+ 420118 434096
CHEXA 278378 2 419965 418519 418520 419967 434105 420116+ 420115 434107
CHEXA 278379 2 419965 419984 420025 419971 434105 434124+ 434165 434111
CHEXA 278380 2 418521 419972 419967 418520 420114 434112+ 434107 420115"
library(readr)
lines <- read_lines(literaldata)
# The code above is just to get a reproducible example similar to the one I get in the data cleaning process
read_fwf(lines, fwf_widths(rep(8, 12)))
Error: 'CHEXA 278375 2 419991 419976 418527 418528 434131
434116+ 420108 420107CHEXA 278376 ...
Thanks in advance