I am having a problem with read.csv in R. I am trying to import a file that was saved as a .csv file from Excel. Missing values are blank, but one entry in one column looks blank and is in fact a space. The command I normally use for similar files produces this error:

raw.data <- read.csv("DATA_FILE.csv", header=TRUE, na.strings="", encoding="latin1")

Error in type.convert(data[[i]], as.is = as.is[i], dec = dec, na.strings = character(0L)) : invalid multibyte string at ' floo'

I have tried a few variations, adding arguments to the read.csv() command such as na.strings=c(""," ") and strip.white=TRUE, but these result in the exact same error.

It is a similar error to what you get when you use the wrong encoding option, but I am pretty sure this shouldn't be a problem here. I have of course tried manually removing the space (in Excel), and this works, but as I'm trying to write generic code for a Shiny tool, this is not really optimal.

  • Have you tried using `fileEncoding` instead of `encoding`? As per [this answer](http://stackoverflow.com/questions/14363085/invalid-multibyte-string-in-read-csv). – slamballais Mar 16 '16 at 17:29
  • With these sorts of problems and Excel, the answer is almost always that Excel really has inserted some strange character that you don't want. You just have to find it and remove it. – joran Mar 16 '16 at 17:34
  • Try `na.strings = c('', '\\s')`; there's a chance it's actually a tab or some other whitespace character. Also, `header = TRUE` is the default in `read.csv`, so you don't need to specify. – alistaire Mar 16 '16 at 17:37
  • Otherwise, just clean it after the fact with something like `df[sapply(df, function(x){grepl('^\\s$', x)})] <- NA`, indexing to avoid type conversion by `sub` or `gsub`. – alistaire Mar 16 '16 at 17:53
  • Thanks for the suggestions. I think the key was using `fileEncoding`, so the full command is now: `raw.recruit <- read.csv("DATA_FILE.csv", na.strings="", encoding="latin1", fileEncoding="latin1")` – Amy Spencer Mar 30 '16 at 09:53
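
Putting the comments above together, here is a minimal sketch of the two ideas: read with `fileEncoding` so the input is actually re-encoded (`encoding` only marks strings as being in a given encoding), then turn whitespace-only cells into `NA` after the fact. The temporary file, the `place` column, and the sample values are all made up for illustration, and it assumes the real file is Latin-1 and that R is running in a UTF-8 locale. The cleanup uses `lapply` over character columns rather than the matrix-indexing one-liner from the comments, which is a swapped-in variant and not the commenter's exact code.

```r
# Build a tiny Latin-1 CSV in a temp file (hypothetical data, not the real file).
# Row 1 has an accented character, row 2 has a lone space, row 3 is truly blank.
tmp <- tempfile(fileext = ".csv")
writeBin(c(charToRaw("id,place\n1,caf"),
           as.raw(0xE9),                      # "e" with acute accent as one Latin-1 byte
           charToRaw("\n2, \n3,\n")),
         tmp)

# fileEncoding converts the bytes while reading; na.strings = "" handles true blanks.
raw.data <- read.csv(tmp, na.strings = "", fileEncoding = "latin1",
                     stringsAsFactors = FALSE)

# Replace whitespace-only entries with NA, touching character columns only
# so that numeric columns keep their type.
raw.data[] <- lapply(raw.data, function(x) {
  if (is.character(x)) x[grepl("^\\s+$", x)] <- NA
  x
})

unlink(tmp)
```

If the stray character turns out to be something other than ordinary whitespace (a non-breaking space, for example), the `\\s` pattern may not match it, and the pattern would need adjusting once the character is identified.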
