
I have tried my best to read a CSV file in R but failed. I have provided a sample of the file at the following Google Drive link.

Data

Opening the file in a text editor showed me that it is tab-delimited. Excel reads it without issues, but reading it in R fails with both the "readr" package and the base R functions, and I am not sure why. I have tried different encodings: UTF-8, UTF-16, and UTF-16LE. Could you please help me write the correct script to read this file? Currently, I convert the file to comma-delimited in Excel before reading it in R, but I am sure I must be doing something wrong. Any help would be appreciated.

Thanks, Amal

PS: What I don't understand is how Excel reads the file without being given any parameters. Can we build the same logic in R to read any file?

Amal Joy

1 Answer


This is a Windows-related encoding problem.

When I open your file in Notepad++ it tells me it is encoded as UCS-2 LE BOM. There is a trick to reading files with unusual encodings into R: wrap the file in a file() connection that declares the encoding, and pass that connection to the reader. In your case this seems to work:

read.delim(file("temp.csv", encoding = "UCS-2LE"))

(adapted from R: can't read unicode text files even when specifying the encoding).
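To make the trick above easy to try without the original file, here is a minimal, self-contained sketch. It assumes a small made-up table (the columns `id` and `name` and the temporary path are hypothetical, not taken from the asker's data), writes it as UTF-16LE with a byte order mark, and reads it back through a `file()` connection that declares the encoding.

```r
# Hypothetical sample data; the real file's columns are not known here.
txt <- "id\tname\n1\talpha\n2\tbeta\n"

# Encode the text as UTF-16LE and prepend the BOM bytes (FF FE),
# mimicking what Notepad++ reports as "UCS-2 LE BOM".
path  <- tempfile(fileext = ".txt")
bytes <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xff, 0xfe)), bytes), path)

# Declare the encoding on the connection; read.delim() already
# defaults to sep = "\t" and header = TRUE.
df <- read.delim(file(path, encoding = "UTF-16LE"),
                 stringsAsFactors = FALSE)
print(df)
```

Because `read.delim()` opens the connection itself, it also closes it when it finishes, so no explicit `close()` is needed in this form.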

BTW "CSV" stands for "comma separated values". This file has tab-separated values, so you should give it either a .tsv or .txt suffix, not .csv, to avoid confusion.

As for your second question, whether R could use the same logic to guess the encoding and delimiter and read many types of file without being told either explicitly: yes, this would certainly be possible. Whether it is desirable, I'm not sure.
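As a rough illustration of what such guessing could look like, here is a sketch in base R. All three function names are hypothetical helpers, not part of any package, and the heuristics are deliberately simple assumptions: the encoding is inferred only from a byte order mark, and the delimiter is whichever of tab, comma, or semicolon occurs most often in the first line. It is not a claim about how Excel actually does it.

```r
# Hypothetical helper: infer the encoding from a byte order mark (BOM).
# Only UTF-8 and UTF-16LE BOMs are recognised; anything else falls
# back to the native encoding.
guess_bom_encoding <- function(path) {
  b <- readBin(path, what = "raw", n = 3L)
  if (length(b) >= 3L && all(b[1:3] == as.raw(c(0xef, 0xbb, 0xbf)))) {
    return("UTF-8-BOM")
  }
  if (length(b) >= 2L && all(b[1:2] == as.raw(c(0xff, 0xfe)))) {
    return("UTF-16LE")
  }
  "native.enc"
}

# Hypothetical helper: pick the candidate delimiter that appears most
# often in the first line of the file.
guess_delimiter <- function(path, encoding) {
  con <- file(path, encoding = encoding)
  on.exit(close(con))
  first <- readLines(con, n = 1L, warn = FALSE)
  if (length(first) == 0L) return(",")  # empty file: arbitrary default
  candidates <- c("\t", ",", ";")
  counts <- vapply(candidates, function(d) {
    lengths(regmatches(first, gregexpr(d, first, fixed = TRUE)))
  }, integer(1))
  candidates[[which.max(counts)]]
}

# Hypothetical wrapper combining both guesses.
read_guessing <- function(path) {
  enc <- guess_bom_encoding(path)
  sep <- guess_delimiter(path, enc)
  read.table(file(path, encoding = enc), sep = sep, header = TRUE,
             stringsAsFactors = FALSE)
}
```

A call such as `read_guessing("temp.csv")` would then try both guesses in turn. The weak spots are easy to see: a file without a BOM gives no encoding hint at all, and a first line containing commas inside quoted text can fool the delimiter count, which is part of why guessing may not be desirable.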

Peter Ellis
    Thanks, @Peter, for your reply. Your solution worked perfectly. The issue was that the file encoding is UTF-16LE, which read_delim cannot read at present. I used the base read.delim and file() to specify the encoding: `read.delim(file("temp.csv", encoding = "UTF-16LE"), sep = "\t")` This did the trick for me. Thanks again for pointing me in the right direction. – Amal Joy May 11 '20 at 07:40