I want to read a CSV file with 5,000 observations into RStudio. If I set the encoding to UTF-8, only 3,500 observations are imported and I get two warning messages:

# Example code
options(encoding = "UTF-8")
df <- read.csv("path/data.csv", dec = ".", sep = ",")

1: invalid input found on input connection

2: EOF within quoted string

Following this thread, I found some encodings that can read the whole CSV file, e.g. windows-1258. With that encoding, however, special characters such as ä, ü or ß are not read properly.

My guess is that UTF-8 is the right encoding, but that something is wrong with the character/factor variables in the CSV file. For that reason I need a way to read the whole CSV file as UTF-8. Any help is highly appreciated!
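One way to test that guess, as a minimal sketch in base R: read the file line by line without any re-encoding and flag the lines whose bytes are not valid UTF-8. The temporary file and its contents below are made up for illustration (they stand in for `path/data.csv`); `validUTF8()` is in base R from version 3.3.0.

```r
# Hypothetical stand-in for path/data.csv: one UTF-8 line and one Latin-1 line.
tmp <- tempfile(fileext = ".csv")
utf8_line   <- "1,M\u00fcller"                                        # UTF-8 bytes
latin1_line <- iconv("2,Stra\u00dfe", from = "UTF-8", to = "latin1")  # Latin-1 bytes
writeLines(c("id,name", utf8_line, latin1_line), tmp, useBytes = TRUE)

# Read the raw lines (no re-encoding) and find those that are not valid UTF-8.
raw_lines <- readLines(tmp, warn = FALSE)
invalid   <- which(!validUTF8(raw_lines))
invalid
```

If `invalid` is non-empty, the file is not (entirely) UTF-8, and the flagged line numbers show where to look.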

Joachim Schork
  • Can you post your code and an example CSV? Are you setting `encoding` to `UTF-8` or `fileEncoding` to `UTF-8` or both? – Dan Jun 01 '17 at 14:42
  • I updated my post with some example code. I also tried `fileEncoding = "UTF-8"`, but that didn't solve my problem. The file is a normal csv file. – Joachim Schork Jun 01 '17 at 14:56
  • Try to read the file via `data.table::fread("path/data.csv",verbose=T)`. It will be more enlightening – amonk Jun 01 '17 at 15:15
  • Thank you very much @ agerom! That solved the problem. – Joachim Schork Jun 02 '17 at 10:14
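
For anyone who lands here with the same two warnings: the thread does not record what the root cause was, so the following is only a sketch of one common combination, assuming the file is actually Latin-1 encoded and contains a stray double quote in a text field. The temporary file and its contents are hypothetical. Declaring the real encoding via `fileEncoding` and disabling quote handling with `quote = ""` lets base `read.csv` see every row in this constructed case.

```r
# Hypothetical Latin-1 file with an unbalanced double quote in one field.
tmp <- tempfile(fileext = ".csv")
lines <- c("id,name", "1,M\u00fcller", "2,5\" pipe", "3,Stra\u00dfe")
writeLines(iconv(lines, from = "UTF-8", to = "latin1"), tmp, useBytes = TRUE)

# Declare the file's actual encoding and turn off quote processing,
# so the stray quote is read as an ordinary character.
df <- read.csv(tmp, fileEncoding = "latin1", quote = "",
               stringsAsFactors = FALSE)
nrow(df)

# The comment's suggestion, if data.table is installed (not run here);
# note that fread's `encoding` marks strings rather than re-encoding them:
# data.table::fread(tmp, encoding = "Latin-1", verbose = TRUE)
```

Note that `quote = ""` is only appropriate when no field legitimately contains the separator inside quotes; otherwise the offending lines need to be fixed in the file itself.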
