I have a column in my dataset which looks like this (not the exact numbers):

Cost
50
75
$ 1,789,456
$ 1,200,923
690.3490200

The type of this column is character.

In order to do my computations, I want to remove the "," and the "$" and convert the column to numeric format.

df$cost<-gsub(",","",as.character(df$cost))

This one worked: I now have 1789456 instead of 1,789,456, etc. However, neither of my attempts to remove the $ works:

df$cost<-gsub("$","",as.character(df$cost))

df$cost<-gsub("$ ","",as.character(df$cost))

There is no error message, but here's the output:

Cost
50
75
$ 1789456
$ 1200923
690.3490200

Here's what dput() gives me:

structure(list(head.df.cost..31. = structure(c(NA, 
NA, NA, NA, NA, NA, NA, NA, 15L, 14L, 14L, 14L, 14L, 14L, 13L, 
4L, 1L, 9L, 12L, 8L, 7L, 10L, 10L, 7L, 2L, 5L, 6L, 6L, 3L, 11L
), .Label = c("$ 1062498", "115.11", "236.49", "275.87", "30", 
"40", "49", "50", "575.64", "60", "631.19200000000001", "75", 
"SPONSORED", "$ 2542196",
"ND", "USD 2300"), class = "factor")), class = "data.frame", row.names = c(NA, 
-30L))
user438383
katdataecon

2 Answers

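In a regex, $ is an anchor that matches the end of the string, so the pattern "$" matches an empty position and removes nothing, and "$ " can never match because nothing can follow the end of the string. You need to escape it to use it as a literal. I'm not at a computer, but this should get you what you're looking for: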

gsub("[ ,$]+", "", df$cost, perl = TRUE)

This removes every run of one or more commas, spaces, or $ characters. You don't have to escape $ inside square brackets. If you only wanted to remove the $ signs, you could use the pattern "\\$".
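
To make the difference concrete, here is a minimal base R sketch using the sample values from the question. The vector name cost and the intermediate variable names are only illustrative, and the final as.numeric() step is an assumption about how the conversion would be finished, since the pattern above only removes the symbols.

```r
# Sample values taken from the question
cost <- c("50", "75", "$ 1,789,456", "$ 1,200,923", "690.3490200")

# Unescaped, "$" is the end-of-string anchor: it matches an empty position,
# so the strings come back unchanged
unescaped <- gsub("$", "", cost)

# Three ways to match a literal dollar sign
escaped  <- gsub("\\$", "", cost)              # backslash-escaped
fixed    <- gsub("$", "", cost, fixed = TRUE)  # pattern treated as plain text
in_class <- gsub("[ ,$]+", "", cost)           # literal inside a character class

# Only the character-class version also drops spaces and commas,
# so that is the one converted to numeric here
cost_num <- as.numeric(in_class)
print(cost_num)
```

The escaped and fixed versions still leave the space and the commas behind, which is why the character class is the more convenient choice for this column.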

ngwalton

You can use parse_number from readr:

df = data.frame(cost = c("50", "75", "$ 1,789,456", "$ 1,200,923", "690.3490200"))
df$cost = readr::parse_number(df$cost)

Output:

df

         cost
1      50.000
2      75.000
3 1789456.000
4 1200923.000
5     690.349
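
The dput() in the question suggests the real column is a factor and also holds labels such as "ND", "SPONSORED" and "USD 2300". If readr is not available, a rough base R counterpart might look like the sketch below. clean_cost() is a hypothetical helper written for this illustration, not a function from base R or readr, and it assumes the values are non-negative with at most one decimal point.

```r
# Values modelled on the labels in the question's dput(), stored as a factor
cost <- factor(c("50", "$ 1062498", "631.19200000000001",
                 "ND", "SPONSORED", "USD 2300"))

# clean_cost() is a hypothetical helper, not part of base R or readr
clean_cost <- function(x) {
  x <- as.character(x)          # factor -> text, not the integer level codes
  x <- gsub("[^0-9.]", "", x)   # keep only digits and the decimal point
  x[x == ""] <- NA_character_   # nothing numeric left, e.g. "ND"
  as.numeric(x)
}

cost_num <- clean_cost(cost)
print(cost_num)
```

Converting the factor with as.character() first matters: calling as.numeric() directly on a factor returns the level codes rather than the values shown in the column.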
bird