
I am reading CSV data with R. Column 11 contains values in the formats "1,022.00" and "516.00", and they must be imported as "numeric" (double).

dados201702 <- read.csv("dataset.csv", 
                        header = TRUE, 
                        sep = "\t", 
                        dec = ".",
                        colClasses = c("character", 
                                       "character", 
                                       "character", 
                                       "character", 
                                       "character", 
                                       "character", 
                                       "character", 
                                       "character", 
                                       "character", 
                                       "character", 
                                       "numeric",
                                       "character"))

I want to import column 11 as numeric (double), but this error occurs:

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, scan() expected 'a real', got '1,022.00'

  • Show us some example data to work with – ekstroem Apr 21 '17 at 22:11
  • Please clarify what you are trying to ask. You may want to check out the [asking guidelines](http://stackoverflow.com/help/how-to-ask) and specifically the section on [minimal, complete, and verifiable examples](http://stackoverflow.com/help/mcve). – Luke C Apr 21 '17 at 22:11
  • I want to import column 11 as numeric or double however error occurs: "Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, scan() expected 'a real', got '1,022.00'" – Leonardo Gregianin Apr 21 '17 at 22:28
  • Would there be a large downside to reading that column as character then converting that column to numeric in the same step? i.e. `dados201702$column11 %<>% as.numeric` – cgage1 Apr 21 '17 at 22:37

1 Answer


It looks as if your data contains a thousands separator, and the comma is what is giving you problems: scan() cannot parse "1,022.00" as a number. You can either read in the data frame with that column as character and convert it using gsub, or you can define a new class whose coercion method strips the commas during import.

Here we define a new class whose conversion from character removes the commas (the thousands separator).

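# Declare a placeholder class to use as a colClasses target.
setClass("MyNum")
# Coercion from character to MyNum: drop the commas, then convert to numeric.
setAs("character", "MyNum", 
      function(from) as.numeric(gsub(",", "", from)))
indata <- read.csv("tst.txt", 
                   header = TRUE, 
                   sep = "\t", 
                   dec = ".", 
                   colClasses = c(rep("character", 10), "MyNum", "character"))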

Alternatively, read the column as character and then just use as.numeric(gsub(",", "", from)), where from is the vector containing the thousands separator.
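
As a minimal sketch of that second approach: the snippet below reads everything as character and converts afterwards. The inline two-column sample and the column name valor are hypothetical stand-ins for your real 12-column file, so adjust the names to match your data.

```r
# Hypothetical tab-separated sample standing in for the real file.
txt <- "id\tvalor\nA\t1,022.00\nB\t516.00\n"

# Read every column as character so read.csv() never tries to parse "1,022.00".
dados <- read.csv(text = txt, header = TRUE, sep = "\t",
                  colClasses = "character")

# Strip the thousands separator, then convert to numeric (double).
dados$valor <- as.numeric(gsub(",", "", dados$valor, fixed = TRUE))

str(dados$valor)
```

With the real file you would replace text = txt with the file name and apply the same gsub step to column 11 only.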

ekstroem