
I want to read a CSV file containing numbers such as 123.456, where the . is a thousands separator. However, R treats the . as a decimal separator, so I get 123.456 instead of 123456. Is there an easier solution than the ones outlined in "R - read csv with numeric columns containing thousands separator"?

write("123.456", "number.csv")
read.csv("number.csv", header = FALSE, stringsAsFactors = FALSE)
Joe
  • Maybe use `x$V1 = as.numeric(gsub('\\.','',x$V1))`. If your data also have another character that acts as a decimal separator (perhaps a comma?), then you would also need to change those. – dww Apr 09 '18 at 15:39
  • The `read.csv2` variant is used in countries that use a comma as the decimal point and a semicolon as the field separator. – Dave2e Apr 09 '18 at 16:30
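
The `gsub` suggestion in the comments can be sketched in base R as follows. This is a minimal sketch, assuming the file contains no decimal separator at all; the temporary file path is a hypothetical stand-in for `number.csv`, and `colClasses = "character"` is an addition here so that R never interprets the dots as decimal points.

```r
# Hypothetical temp file holding the value from the question
path <- tempfile(fileext = ".csv")
write("123.456", path)

# Read everything as text so "." is not treated as a decimal point
x <- read.csv(path, header = FALSE, colClasses = "character")

# Remove the grouping dots literally (fixed = TRUE), then convert to numeric
x$V1 <- as.numeric(gsub(".", "", x$V1, fixed = TRUE))
x$V1
#> [1] 123456
```

If the data also used a comma as the decimal mark, the commas would have to be replaced with dots before calling `as.numeric`.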

1 Answer


The world is full of crazy people from crazy places who write numbers in crazy ways. In computing all this craziness is called "locale".

The readr package allows you to specify locale options. You are interested in what is variously called the "grouping mark" or "thousands separator".

For example:

library(readr)
read_csv('x,y\n1.234,5.678') # Default is not what you want
#> # A tibble: 1 x 2
#>       x     y
#>   <dbl> <dbl>
#> 1  1.23  5.68

# This is what you want
read_csv('x,y\n1.234,5.678', locale = locale(grouping_mark = "."))
#> # A tibble: 1 x 2
#>       x     y
#>   <dbl> <dbl>
#> 1  1234  5678
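
If you would rather not depend on column type guessing, the same locale can be combined with an explicit number parser. This is a hedged sketch rather than the only way to do it: the names `dot_grouping`, `parsed` and `df` are made up for illustration, and forcing every column to `col_number()` is an assumption that only suits files where all columns are numeric.

```r
library(readr)

# Locale in which "." groups thousands (illustrative name)
dot_grouping <- locale(grouping_mark = ".")

# Parse a single string, e.g. the value from the question
parsed <- parse_number("123.456", locale = dot_grouping)
parsed
#> [1] 123456

# Read the example data, forcing every column through the number parser
df <- read_csv(
  "x,y\n1.234,5.678",
  col_types = cols(.default = col_number()),
  locale = dot_grouping
)
```

`parse_number()` is also handy for cleaning a column that has already been read in as character.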
ngm