I'm trying to read a .csv file into R. The file was created in Excel, and it contains "long" dashes (en dashes), which are the result of Excel auto-correcting the sequence space-dash-space. Sample entries that contain these dashes:
US – California – LA
US – Washington – Seattle
I've experimented with different encoding, including the following three options:
x <- read.csv(filename, encoding="windows-1252") # Motivated by http://www.perlmonks.org/?node_id=551123
x <- read.csv(filename, encoding="latin1")
x <- read.csv(filename, encoding="UFT-8")
But the long dashes either show up as � (first and second options) or as <U+0096> (third option).
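
In case it helps to reproduce this: the raw bytes behind the dashes can be inspected directly, which should confirm whether the file really is windows-1252 (where byte 0x96 is the en dash). A minimal sketch, assuming the file is small enough to read in one go and that filename is the same variable as above:

# Read the whole file as raw bytes and look for 0x96, the windows-1252 en dash
raw_bytes <- readBin(filename, what = "raw", n = file.size(filename))
any(raw_bytes == as.raw(0x96))  # TRUE would confirm windows-1252-style bytes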
I realize that I can save the file in a different format or use different software (e.g., export from Excel to CSV with UTF-8 encoding), but that's not the point.
Has anyone figured out what encoding option in R works in such cases?
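
For completeness, the documentation for read.csv (see ?read.table) also lists a fileEncoding argument, which, unlike encoding, actually re-encodes the file through a connection rather than just tagging the strings. A minimal sketch of that approach, assuming the file really is windows-1252:

# fileEncoding converts the file contents; encoding only marks strings as latin1/UTF-8
x <- read.csv(filename, fileEncoding = "windows-1252")
x[1:2, ]  # the en dashes should now display correctly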