I have a few columns whose values look like 525K or 1.1M. I want to convert those values to plain numerics in thousands or millions (525000 and 1100000) without using any R package other than base R and tidyr.

Can anyone help me with some code or a function that does this in a simple and quick way?

I have tried to do it by hand, removing the 'M' or 'K' and the '.':

players_set$Value <- gsub(pattern = "M", replacement = "000000 ", 
                           x = players_set$Value, fixed = TRUE)
Rui Barradas
Swinks95
  • Okay, what is the issue with `gsub`? – akrun Jun 24 '19 at 15:28
  • Several previous questions on this general issue [here](https://stackoverflow.com/q/45972571/324364). – joran Jun 24 '19 at 15:32
  • The issue is that some values have 'M', some have 'K', and others also have a '.' in their value, so it is difficult to combine it all together to make the code work – Swinks95 Jun 24 '19 at 15:32

2 Answers

For a base R option, we can use `sub` to turn each value into an arithmetic expression based on its K or M unit, then use `eval` with `parse` to compute the final number:

getValue <- function(input) {
    output <- sub("M", "*1000000", sub("K", "*1000", input))
    eval(parse(text=output))
}

getValue("525K")
getValue("1.1M")

[1] 525000
[1] 1100000
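
One caveat when applying this to a whole column: `eval` on several parsed expressions returns only the value of the last one, so the function as written handles one string at a time. A minimal sketch of one way around that, assuming a hypothetical character vector `values` standing in for `players_set$Value`, is to call the function once per element with `vapply`:

```r
# The function from this answer, unchanged.
getValue <- function(input) {
    output <- sub("M", "*1000000", sub("K", "*1000", input))
    eval(parse(text=output))
}

# Hypothetical sample data standing in for players_set$Value.
values <- c("525K", "1.1M", "60K")

# Called on the whole vector, only the last expression's value comes back.
last_only <- getValue(values)

# Calling it once per element returns one numeric per input string.
converted <- vapply(values, getValue, numeric(1), USE.NAMES = FALSE)
converted
```

`vapply` is used rather than `sapply` so that the result is guaranteed to be a numeric vector with one entry per input.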
Tim Biegeleisen
  • What is the safe alternative to `eval(parse(....))`? I know there are several questions on the same but it seems there really isn't a safe alternative. – NelsonGon Jun 24 '19 at 16:02
  • @NelsonGon Are you asking regarding the possibility that the input text might not be a valid arithmetic expression, or something else? – Tim Biegeleisen Jun 24 '19 at 16:05
  • Not really, I have read in some old post that `eval(parse(..))` is dangerous when used, for instance, in a security-conscious program. I've also read the same about Python's `eval(repr())`, although I really don't know an R alternative. Like [here](https://stackoverflow.com/questions/13649979/what-specifically-are-the-dangers-of-evalparse). – NelsonGon Jun 24 '19 at 16:09
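
On the question raised in these comments, one possible way to avoid evaluating text as code at all is to validate each string with a regular expression and look the multiplier up from its suffix. The sketch below is an illustration only: `parseValue` is a hypothetical helper that appears in neither answer, and it assumes the values are plain digits with an optional decimal part and an optional trailing K or M.

```r
# Hypothetical helper: converts "525K" / "1.1M" style strings to numbers
# without ever passing the text to eval() or parse().
parseValue <- function(input) {
    input <- as.character(input)
    # Accept only digits, an optional decimal part, and an optional K or M suffix.
    ok <- grepl("^[0-9]+(\\.[0-9]+)?[KM]?$", input)
    # Anything that fails validation stays NA.
    result <- rep(NA_real_, length(input))
    valid <- input[ok]
    # Multiplier chosen from the suffix; no suffix means multiply by 1.
    multiplier <- ifelse(grepl("M$", valid), 1e6,
                  ifelse(grepl("K$", valid), 1e3, 1))
    # Strip the suffix, convert the remaining digits, and scale.
    result[ok] <- as.numeric(sub("[KM]$", "", valid)) * multiplier
    result
}

parsed <- parseValue(c("525K", "1.1M", "300", "system('ls')"))
parsed
```

Because malformed input is never executed, a string such as `"system('ls')"` simply yields `NA`.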

Here is another option, matching the unit against a named vector:

getValue <- function(input) {
    # keep only the unit letter: strip digits, "." and "€"
    v1 <- gsub("[0-9.€]+", "", input)
    # keep only the number: strip letters and "€"
    v2 <- gsub("[A-Za-z€]+", "", input)
    # named vector mapping each unit to its multiplier
    keyval <- setNames(c(1e6, 1e3), c("M", "K"))
    # look up the multiplier for each unit (v1) and
    # multiply it by the number (v2)
    unname(as.numeric(v2) * keyval[v1])
}



getValue("525K")
#[1] 525000
getValue("1.1M")
#[1] 1100000

getValue("€525K")
#[1] 525000

getValue("€1.1M")
#[1] 1100000
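
Since the question mentions several such columns, here is a rough sketch of how a vectorised lookup like this might be applied to more than one column at once. It restates the function compactly without the "€" handling so that the block stands on its own. The data frame contents and the `Wage` column are made up for illustration; only `players_set$Value` is named in the question.

```r
# Compact restatement of the named-vector lookup (no "€" handling).
getValue <- function(input) {
    unit   <- gsub("[0-9.]+", "", input)                 # "K" or "M"
    number <- as.numeric(gsub("[A-Za-z]+", "", input))   # numeric part
    keyval <- c(M = 1e6, K = 1e3)                        # unit -> multiplier
    unname(number * keyval[unit])
}

# Hypothetical data; the `Wage` column and all values are invented.
players_set <- data.frame(
    Name  = c("A", "B"),
    Value = c("1.1M", "525K"),
    Wage  = c("60K", "2.5M"),
    stringsAsFactors = FALSE
)

# Columns to convert; the function is vectorised, so each column
# is handled in a single call.
cols <- c("Value", "Wage")
players_set[cols] <- lapply(players_set[cols], getValue)
str(players_set)
```

Note that a value with no K or M suffix has no entry in the lookup vector and comes back as `NA` with this approach.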
akrun