I'm trying to fix a dataset in which some decimal numbers were typed incorrectly. For example, some entries were typed as ".15" instead of "0.15". The column is currently chr, but I need to convert it to numeric later.

I'm trying to select all the entries that start with a period "." and replace the period with "0.", but the "^" that anchors the start of the string doesn't seem to work well with the period.

I tried with:

dataIMN$precip <- str_replace(dataIMN$precip, "^.", "0.")

But it puts a 0 at the beginning of all the entries, including the ones that are correctly typed (those that don't start with a period).

lmo
Guillermo.D
  • 5
    Just do `as.numeric(".15")` (OR `as.numeric(dataIMN$precip)`) and it will convert to numeric and add a `0` if necessary – d.b Sep 02 '17 at 21:27
  • 2
    You need `"^\\."` – G5W Sep 02 '17 at 21:30
  • 2
    Also, the correct regular expression would be `"^\\."`, you need to escape the special character. – Rui Barradas Sep 02 '17 at 21:31
  • 2
    d.b's suggestion is the best. Convert it directly. The reason that your conversion failed is that "." is a special character in regex, the "wildcard" that matches any single character (much like "?" in some MS products). As the recent comments suggest, if you were to perform this action, which you shouldn't as it's unnecessary, you need to escape the dot to match on the literal rather than on the special character. – lmo Sep 02 '17 at 21:33

1 Answer

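If you do need the replacement as you've stated it, there are two ways to match a literal period. Inside brackets [] (a regex character class) the period loses its special meaning, or you can escape it with a backslash, which is written as "\\." in an R string: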

Option 1:

gsub("^[.]","0.",".54")
[1] "0.54"

Option 2:

gsub("^\\.","0.",".54")
[1] "0.54"
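
To see the difference on a whole column rather than a single string, here is a small base R sketch. The vector `precip` is made-up sample data standing in for dataIMN$precip, and `sub` is used instead of `gsub` only because an anchored pattern can match at most once per string.

```r
# Hypothetical sample values standing in for dataIMN$precip
precip <- c(".15", "0.15", "2.5", ".7")

# Unescaped dot: "." matches any single character, so the first
# character of every entry is replaced, correct or not
broken <- sub("^.", "0.", precip)

# Escaped dot: only a literal leading period is replaced
fixed <- sub("^\\.", "0.", precip)

print(broken)
print(fixed)
```

The unescaped version should mangle entries such as "0.15", while the escaped version leaves them untouched.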

Otherwise, as.numeric parses ".54" as 0.54 on its own, so you can skip the replacement and convert the column directly.
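
A minimal sketch of the direct conversion, again on made-up values (`precip`, `precip_num` and `bad` are illustrative names, not from the question). It also shows that entries which can't be parsed come back as NA, which is worth checking for before trusting the result.

```r
# Hypothetical sample values standing in for dataIMN$precip
precip <- c(".15", "0.15", "2.5", ".7")

# as.numeric accepts a leading period without a zero in front
precip_num <- as.numeric(precip)
print(precip_num)

# Strings that are not numbers become NA (normally with a warning,
# suppressed here to keep the example quiet)
bad <- suppressWarnings(as.numeric(c(".15", "n/a")))
print(is.na(bad))
```

On the real data this would be something like dataIMN$precip <- as.numeric(dataIMN$precip), followed by a check with is.na for any values that failed to convert.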

www