I'm trying to read in a .csv file using readr::read_csv:

readr::read_csv("my_file.csv")

But I got the following error:

Parsed with column specification:
cols(
  col_character()
)
Error in read_tokens_(data, tokenizer, col_specs, col_names, locale_,  : 
  Evaluation error: Column 1 must be named.

What is going on exactly?

The .csv file can be found here: https://drive.google.com/file/d/1W_ZetpOfWDuSVhiIVAa0sEcRE4ujCSXB/view?usp=sharing

divibisan
  • @Gautam Yes, originally I thought the same, but the linked post is about `read.csv` not `readr::read_csv`. – zx8754 Aug 21 '18 at 20:03

1 Answer

The issue is the file's encoding. The linked post shows how to handle this with read.csv:

read.csv("BRA_females-45q15.csv", fileEncoding="UTF-16LE")

To achieve the same with readr::read_csv, first find out the encoding:

guess_encoding(file = "BRA_females-45q15.csv")
# # A tibble: 3 x 2
#   encoding   confidence
#   <chr>           <dbl>
# 1 UTF-16LE         1   
# 2 ISO-8859-1       0.8 
# 3 ISO-8859-2       0.51
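
As a cross-check on the guess, one can look at the first bytes of the file: UTF-16LE files often start with the byte-order mark FF FE. This is only a sketch; has_utf16le_bom is a hypothetical helper name, not part of readr, and a file without a BOM can still be UTF-16LE.

```r
# Hypothetical helper: TRUE if the file starts with the bytes FF FE.
# Caveat: a UTF-32LE BOM (FF FE 00 00) starts with the same two bytes.
has_utf16le_bom <- function(path) {
  first <- readBin(path, what = "raw", n = 2)
  identical(first, as.raw(c(0xff, 0xfe)))
}
```

For example, has_utf16le_bom("BRA_females-45q15.csv") returning TRUE would be consistent with the UTF-16LE guess above.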

Then use read_csv with locale:

read_csv("BRA_females-45q15.csv", locale = locale(encoding = "UTF-16LE"))

# Error in guess_header_(datasource, tokenizer, locale) : 
#   Incomplete multibyte sequence

But this again gives us an error, and it looks like a known issue.

Hadley: "Yeah, this is a big issue that will need some thought. In general, readr currently assumes that it can read byte-by-byte, and anything else will require quite of lot of work/thought."
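
Until that is resolved, one possible workaround is to let base R do the decoding and hand the decoded text to read_csv as literal data. The sketch below assumes the file really is UTF-16LE and that the R session uses UTF-8; read_utf16_csv is a hypothetical helper name, not a readr function.

```r
library(readr)

# Hypothetical helper: decode the file with a base R connection,
# then let read_csv parse the decoded text.
read_utf16_csv <- function(path, encoding = "UTF-16LE", ...) {
  # The connection re-encodes the bytes while reading.
  con <- file(path, open = "r", encoding = encoding)
  on.exit(close(con))
  lines <- readLines(con, warn = FALSE)

  # Drop a byte-order mark if it survived as a character on line 1.
  if (length(lines) > 0) {
    lines[1] <- sub("^\ufeff", "", lines[1])
  }

  # A string containing a newline is treated by read_csv as literal data.
  read_csv(paste0(paste(lines, collapse = "\n"), "\n"), ...)
}
```

Usage would then be read_utf16_csv("BRA_females-45q15.csv"). This reads the whole file into memory first, so it is best suited to small and medium files.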

zx8754