3

I would like to replace k in a string with 000. For instance, I want to make "£50000" from "£50k". Note that the function can be applied to cases like "£50k king", which should result in "£50000 king".

Here's what I have so far:

replace_k = function(data) {
  data = gsub("^[0-9]k", "[0-9]000", data)
  return(data)
}
Ronak Shah
Misha
  • Somewhat related: [Changing Million/Billion abbreviations into actual numbers? ie. 5.12M -> 5,120,000](https://stackoverflow.com/questions/45972571/changing-million-billion-abbreviations-into-actual-numbers-ie-5-12m-5-120-0) – Henrik Jun 17 '18 at 16:24

2 Answers

6

How about

data = gsub("([0-9]+)k", "\\1000", data)
PhilMasteG
  • Thanks, it worked! I also tried to do the same with "£1.5 mil" with `gsub("([0-9]+) mil", "\\1000000", data)` but it replaces to "£1.5000000". Any suggestions? – Misha Jun 17 '18 at 15:27
  • You'd have to use `gsub("([0-9]+)\\.([0-9]) mil", "\\1\\200000", data)`, but that only covers millions with exactly one decimal place. If you have more possible cases, you have to determine the right pattern to use; maybe parse the number first, multiply it by the value of the suffix, and then replace. – PhilMasteG Jun 17 '18 at 15:30
  • Thank you! Helped a lot! – Misha Jun 17 '18 at 15:33
  • What does the `\\1000` mean here? The code works but I don't understand how – stevec Feb 04 '19 at 15:12
  • `\\1` is the first submatch (which is `[0-9]+`, so the digits before the `k`). `\\1000` then means take those digits and append `000` to them, changing `5k` to `5000`. – PhilMasteG Apr 24 '19 at 13:43
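The "parse the number first, multiply by the suffix, then replace" idea from the comments can be sketched in base R roughly as follows. This is only an illustration, not code from the thread: the helper name `expand_suffix`, the suffix table `mult`, and the choice to require a word boundary after the suffix are all assumptions.

```r
# Hypothetical helper: expand "<number>k" and "<number> mil" into plain digits.
expand_suffix <- function(x) {
  # Assumed suffix table; extend it if other suffixes are needed.
  mult <- c(k = 1e3, mil = 1e6)
  # Group 1: the number (optional decimal part); group 2: the suffix.
  # \\b keeps the "k" in words such as "kg" from being treated as a suffix.
  pattern <- "([0-9]*\\.?[0-9]+) ?(k|mil)\\b"
  m <- gregexpr(pattern, x, perl = TRUE)
  regmatches(x, m) <- lapply(regmatches(x, m), function(hits) {
    num <- as.numeric(sub(pattern, "\\1", hits, perl = TRUE))
    suf <- sub(pattern, "\\2", hits, perl = TRUE)
    # Format each value separately, in fixed (non-scientific) notation.
    vapply(num * mult[suf], function(v) format(v, scientific = FALSE),
           character(1), USE.NAMES = FALSE)
  })
  x
}

expand_suffix(c("£50k king", "£1.5 mil"))
```

Because the number is parsed rather than patched with extra zeros, the same pattern should cover "1.5 mil", "1.25 mil" and "50k" without a separate regex for each number of decimal places.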
1

You may use the following solution to handle K, M and G (and more if you want, just adjust the ToDigits function):

> library(gsubfn)
> x <- "0.56K 50K 1.5M 56.56G"
> ToDigits <- function(s) {ifelse(s=="K", 1000, ifelse(s=="M", 1000000, 1000000000)) }
> gsubfn("(\\d*\\.?\\d+)([KMG])", function(y,z) as.numeric(y) * ToDigits(z), x)
[1] "560 50000 1500000 5.656e+10"

Here, `(\\d*\\.?\\d+)([KMG])` captures zero or more digits, an optional `.`, and one or more digits into Group 1, and then K, M or G into Group 2. gsubfn then processes each match by multiplying the captured number by the value returned from the simple helper function ToDigits (if K is in Group 2, multiply by 1000, and so on). Note that ToDigits treats any suffix other than K or M as G.

To make it case-insensitive, you may tweak the code above as follows:

> ToDigits <- function(s) {ifelse(tolower(s)=="k", 1000, ifelse(tolower(s)=="m", 1000000, 1000000000)) }
> gsubfn("(\\d*\\.?\\d+)([KMGkmg])", function(y,z) as.numeric(y) * ToDigits(z), x)
[1] "560 50000 1500000 5.656e+10"
Wiktor Stribiżew
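The last value in the output above is printed as 5.656e+10 because the number is converted to text with R's default formatting. A minimal sketch of one way to get plain digits instead, assuming fixed notation is what is wanted: the names `multipliers` and `scale_plain` are hypothetical and not part of the answer, and the named lookup vector is just an alternative to the nested ifelse calls.

```r
# Assumed lookup table, keyed by lower-case suffix.
multipliers <- c(k = 1e3, m = 1e6, g = 1e9)

# Hypothetical helper: scale a captured number by its suffix and
# return the result as text in fixed (non-scientific) notation.
scale_plain <- function(number, suffix) {
  value <- as.numeric(number) * multipliers[[tolower(suffix)]]
  format(value, scientific = FALSE)
}

scale_plain("56.56", "G")  # "56560000000"
scale_plain("0.56", "k")   # "560"

# With the gsubfn package loaded, the helper could presumably be used as the
# replacement function in place of the anonymous one shown above:
# gsubfn("(\\d*\\.?\\d+)([KMGkmg])", scale_plain, x)
```

Unlike the ifelse version, an unknown suffix raises an error here rather than silently being treated as G, which may or may not be the behaviour you want.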