I'm having trouble reading data from a file that contains unusual symbols. There is no error message, but reading stops once it reaches a line containing a specific symbol. The call is:
temp = read.csv(filePaths[i], header=TRUE, sep="\t", comment.char="#")
The last field that is read in is:
Familial Non-VHL Clear Cell Renal Cancer;Birt-Hogg-Dub
When I open the file in Excel, the same field reads:
Familial Non-VHL Clear Cell Renal Cancer;Birt-Hogg-Dub-> Syndrome
The "->" here is a single symbol rather than two characters. I believe the text is actually "Birt–Hogg–Dubé syndrome", and the final character of "Dubé" is probably being interpreted as an EOF character.
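One way to check what really follows "Dub" would be to look at the raw bytes instead of the parsed text. The following is only a minimal sketch: `bytes_after` is a helper name made up for illustration, and the demo file with a 0x1A byte is an assumption about what the symbol might be, not my real data.

```r
# Return the n raw bytes that follow the first occurrence of `marker` in a file.
# `bytes_after` is a hypothetical helper, not part of base R.
bytes_after <- function(path, marker, n = 4L) {
  # readBin() on a file path reads in binary mode, so no text-mode
  # translation is applied to the contents
  bytes <- readBin(path, what = "raw", n = file.size(path))
  target <- charToRaw(marker)
  m <- length(target)
  if (length(bytes) < m) return(raw(0))
  starts <- seq_len(length(bytes) - m + 1L)
  # Keep only the start positions where every byte of the marker matches
  for (k in seq_len(m)) {
    starts <- starts[bytes[starts + k - 1L] == target[k]]
  }
  if (length(starts) == 0L) return(raw(0))
  from <- starts[1L] + m
  to <- min(from + n - 1L, length(bytes))
  if (from > to) return(raw(0))
  bytes[from:to]
}

# Demo on a throwaway file. The 0x1A byte is an ASSUMPTION standing in for
# whatever character actually follows "Dub" in the real data.
demo_path <- tempfile(fileext = ".txt")
writeBin(c(charToRaw("Disease\nBirt-Hogg-Dub"), as.raw(0x1A),
           charToRaw(" Syndrome\n")), demo_path)
found <- bytes_after(demo_path, "Hogg-Dub", n = 2L)
print(found)
```

On the real data the call would be something like bytes_after(filePaths[i], "Hogg-Dub"), which would show the byte value of the problem character directly.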
I only have this problem on Windows.
I've tried different encoding settings (encoding = "UTF-8" and encoding = "bytes", fileEncoding = "UTF-8"), but none of them made any difference. I've looked at the question "Cannot read unicode .csv into R" and searched further, but I can't find an answer. Note that I probably can't rely on a language-specific encoding. Thanks!
-- Update -- I created a test file with one column: a header and 3 entries (the problematic entry is #2). It can be found here: https://www.dropbox.com/s/3m2wak8rhyab6j2/test.txt?dl=0