This is partly related to reading in files in the so-called European format; see "How to read in numbers with a comma as decimal separator?" for more. I have data with rows such as "Invoice","1324","Name","John","Age","10","Height","143,5","Products","1;2;3;4","ProductIDs","01;02;03;04" where a comma separates the field values and, inside the double-quoted field values, a comma acts as the decimal separator.

Inside field values, the semicolon also acts as a further separator, but we can set that aside for now and concentrate on first reading the file in correctly, given that commas mean different things in different places.

How can I read in numbers in R when a comma serves as both the decimal separator and the field separator?

hhh
  • You don't have any double quotes, nor commas within quotes, in your example. – Eric Watt Jun 30 '17 at 22:08
  • @EricWatt fixed the 143.5 to 143,5 (what I meant), thank you for the notice. By double quotes, I mean the quotes "..." as written there. – hhh Jun 30 '17 at 22:10
  • Did you mean like using `sep = ", "` (comma and a space) to read in the fields? – kangaroo_cliff Jun 30 '17 at 22:14
  • @DiscoSuperfly no, I clarified the example, by removal of the spaces (makes the writing a bit less readable but hopefully technically less confusing). – hhh Jun 30 '17 at 22:19
  • What exactly is your question? `fread('"a","b"\n"1","123,45"', dec = ',')` works as expected for me. – eddi Jun 30 '17 at 22:29
  • Can you replace `,` with `.` only in the numbers and then read the file into R? – S Rivero Jun 30 '17 at 22:32

2 Answers

It might be possible to do this with the dec parameter, depending on how you're reading the file in. Here is how I would do it using data.table:

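library(data.table)
# fread leaves "1,2" as character because of the decimal comma
dat <- fread('"Name", "Age"
              "Joe", "1,2"')
# Swap the decimal comma for a dot, then convert the column to numeric
dat[, Age := as.numeric(gsub(",", ".", Age))]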

#    Name Age
# 1:  Joe 1.2
Eric Watt
  • I think this could work in some cases +1, like the Height field values, but not in cases such as web addresses in the column values (where dot does not specify a decimal separation). How would you originally read in the values in a more general case? `fread(data.csv, colClasses = list(character=1:31))`? – hhh Jun 30 '17 at 22:27
  • fread usually does a pretty good job on its own, so I start with fread(data.csv). It converted "1,2" to a character on its own. Sometimes there may be issues, so you can specify the column class if necessary. If it's a web address, you would just leave it as a character and not do anything to that column. – Eric Watt Jun 30 '17 at 22:31
  • I am inclined to read everything in as characters and convert individual columns afterwards, as shown here. Thank you for helping. I will accept this until someone comes up with a more clever alternative. – hhh Jun 30 '17 at 22:42
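
The read-everything-as-character approach discussed in these comments can be sketched in base R roughly as follows. This is only an illustration under assumptions: the sample row from the question is inlined as text, the file has no header (so columns get the default names V1, V2, ...), and `comma_to_numeric` is a hypothetical helper name, not part of any package.

```r
# Sample row from the question, inlined so the sketch is self-contained.
csv_text <- '"Invoice","1324","Name","John","Age","10","Height","143,5","Products","1;2;3;4","ProductIDs","01;02;03;04"'

# Read every field as character; no type conversion happens at this stage.
raw <- read.csv(text = csv_text, header = FALSE, colClasses = "character")

# Hypothetical helper: replace the decimal comma with a dot, then convert.
comma_to_numeric <- function(x) as.numeric(sub(",", ".", x, fixed = TRUE))

# Convert only the column known to hold decimal-comma numbers
# (V8 is the Height value in this sample row).
raw$V8 <- comma_to_numeric(raw$V8)
```

Columns that are left alone, such as web addresses or the semicolon-separated IDs, stay exactly as they were read, so dots and leading zeros in them are not affected.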
How about this?

read.table("file.name", sep=",", quote = "\"", dec=",")
kangaroo_cliff
  • When I try this with fread, I still get the error `"The two arguments to fread 'dec' and 'sep' are equal (',')."`. data.table may work but `library(data.table); fread("file.name", ...)` not. – hhh Jun 30 '17 at 22:36
  • I actually didn't see the tag for `data.table`. Apologies. With data.table, just using the `fread("file.name")` worked for me, as others mentioned. – kangaroo_cliff Jun 30 '17 at 22:40
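
For the base-R route suggested in this answer, a minimal sketch on the sample row from the question might look like the following. The inlined `csv_text` and the default column names V1, V2, ... are assumptions made for illustration; with a real file, the `text =` argument would be replaced by the file name.

```r
# Sample row from the question, inlined so the sketch is self-contained.
csv_text <- '"Invoice","1324","Name","John","Age","10","Height","143,5","Products","1;2;3;4","ProductIDs","01;02;03;04"'

# sep splits fields on commas outside quotes; dec tells the type conversion
# that a comma inside a field is the decimal mark.
dat <- read.table(text = csv_text, sep = ",", quote = "\"", dec = ",",
                  stringsAsFactors = FALSE)

# Inspect the column types that were inferred.
str(dat)
```

Because the quoted fields are split first and converted afterwards, having the same character for `sep` and `dec` does not clash here. Note that automatic conversion would also turn an ID field such as "01" into a number, so `colClasses` may be needed for columns that must stay character.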