
I have a CSV file with 4 columns.

I called `ses <- read.csv("ses_data", header = TRUE)` to read the CSV file in RStudio. However, the name of the first column came out changed (shown below):

> names(ses)
[1] "ï..ncdsid" "n1685"     "n1225"     "n1230"   

Strangely, this only occurs on Windows; it is not reproducible on macOS. Does anyone have any idea what is happening?

Thank you in advance.

karyn-h
  • Does it still occur if you use `read.csv(..., check.names = FALSE)`? – G. Grothendieck Jun 21 '21 at 18:54
  • When I added `check.name=FALSE`, the output became `> names(ses) [1] "ncdsid" "n1685" "n1225" "n1230"` – karyn-h Jun 21 '21 at 18:57
  • Based on this post, those look like they could be the UTF-8 byte order mark characters: https://stackoverflow.com/a/10786781/1581658 Not sure why they show up on Windows and not on macOS (did you possibly open and close the file in Excel?). You can open it in Notepad and delete the characters, or get read.csv to use UTF-8 encoding (not sure if it takes an encoding parameter; I don't know R so well) – SamBob Jun 21 '21 at 18:57
  • I think read.table can take an `encoding="UTF-8"` parameter but not read.csv, as that expects a raw file. So you'd need the full read.table function call instead of read.csv – SamBob Jun 21 '21 at 19:00
  • That is the Byte Order Mark. See https://stackoverflow.com/questions/3255993/how-do-i-remove-%C3%AF-from-the-beginning-of-a-file – G. Grothendieck Jun 21 '21 at 19:00
  • For R to handle its presence, I think you might need: `ses<-read.table("ses_data",header = TRUE, sep=",", encoding="UTF-8")` – SamBob Jun 21 '21 at 19:02
  • Try `read.csv(..., fileEncoding = "UTF-8-BOM")` – G. Grothendieck Jun 21 '21 at 19:02
  • @G.Grothendieck Thank you, it is working now! – karyn-h Jun 21 '21 at 19:04
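
For anyone landing here later, the fix from the comments can be reproduced with a small self-contained sketch. The temporary file and its single data row are made up for illustration (they are not the original `ses_data`); only the column names come from the question. The file is written with the three UTF-8 byte order mark bytes (EF BB BF) at the start, then read back with `fileEncoding = "UTF-8-BOM"` as suggested above.

```r
# Hypothetical stand-in for the real file: a CSV that begins with the
# UTF-8 byte order mark (bytes EF BB BF), as some Windows tools write it.
path <- tempfile(fileext = ".csv")
con <- file(path, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)  # the BOM itself
writeBin(charToRaw("ncdsid,n1685,n1225,n1230\nA1,1,2,3\n"), con)  # made-up row
close(con)

# Declaring the encoding as "UTF-8-BOM" lets R drop the mark while reading,
# so the first column name is not prefixed with stray characters.
ses <- read.csv(path, header = TRUE, fileEncoding = "UTF-8-BOM")
names(ses)
```

With the mark removed on read, `names(ses)` should start with a plain `ncdsid`, and `check.names` can stay at its default.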
