I'm reading in many large tab-separated .txt files in R using read.table. However, some lines contain newline breaks (\n) where there should be tabs (\t), which causes an Error in scan(...). How can I deal with this issue robustly? (Is there a way to replace \n with \t every time scan encounters an error?)
Edit:
Here's a simple example:
read.table(text='a1\tb1\tc1\td1\na2\tb2\tc2\td2', sep='\t')
works fine, and returns a data frame. However, suppose there is, by some mistake, a newline (\n) where there should be a tab (\t), e.g. after c1:
read.table(text='a1\tb1\tc1\nd1\na2\tb2\tc2\td2', sep='\t')
This raises an error:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 1 did not have 4 elements
Note: Using fill=TRUE won't help, because it will just push d1 onto a new row instead of keeping it in the fourth column of the first row.
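For what it's worth, here is a rough sketch of the kind of pre-processing I'm imagining (purely illustrative: read_merged is a made-up name, and it assumes every logical row should have exactly n_fields tab-separated fields). It merges any physical line whose predecessor came up short back onto that predecessor, and only then hands the repaired text to read.table:

# Illustrative sketch only; assumes each logical row should have n_fields columns.
read_merged <- function(con, n_fields = 4, sep = '\t') {
  raw <- readLines(con)
  merged <- character(0)
  for (ln in raw) {
    last <- length(merged)
    if (last > 0 &&
        length(strsplit(merged[last], sep, fixed = TRUE)[[1]]) < n_fields) {
      # the previous line was cut short by a stray newline,
      # so glue the current line back on with the missing separator
      merged[last] <- paste(merged[last], ln, sep = sep)
    } else {
      merged <- c(merged, ln)
    }
  }
  read.table(text = paste(merged, collapse = '\n'), sep = sep)
}

# Applied to the broken example above, this recovers the intended 2 x 4 layout:
read_merged(textConnection('a1\tb1\tc1\nd1\na2\tb2\tc2\td2'))

That said, I'd still prefer a more robust or idiomatic solution, especially since the files are large.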