3

I would like to format numbers so that each group of thousands is separated by a space.

What I've tried:

library(magrittr)
addSpaceSep <- function(x) {
  x %>% 
    as.character %>% 
    strsplit(split = NULL) %>% 
    unlist %>% 
    rev %>% 
    split(ceiling(seq_along(.) / 3)) %>% 
    lapply(paste, collapse = "") %>% 
    paste(collapse = " ") %>% 
    strsplit(split = NULL) %>% 
    unlist %>% 
    rev %>% 
    paste(collapse = "")
}

> sapply(c(1, 12, 123, 1234, 12345, 123456, 123456, 1234567), addSpaceSep)
[1] "1"         "12"        "123"       "1 234"     "12 345"    "123 456"   "123 456"  
[8] "1 234 567"
> sapply(c(1, 10, 100, 1000, 10000, 100000, 1000000), addSpaceSep)
[1] "1"      "10"     "100"    "1 000"  "10 000" "1e +05" "1e +06"

I feel bad about having written this makeshift function, but since I haven't mastered regular expressions, it's the only way I found to do it. And of course it doesn't work when the number is converted to scientific notation.

josliber
Julien Navarre
  • Can I ask why you are trying to do this? Is it mostly for _display_ or mostly for _manipulation_ (i.e. are you trying to make a pretty report with spaces between each thousand, or are you trying to do some unusual mathematical operations in R?)? – TARehman Jun 04 '15 at 15:43
  • It's only for aesthetic purposes! – Julien Navarre Jun 04 '15 at 15:44
  • Indeed, it's a duplicate. But I don't think this should be closed. I'm not a very good English communicator, and for this reason it's sometimes hard to find what you are looking for on SO (I only found a C solution..). This question, with a different title, could allow other users to find a solution if they use different search terms! (And save them from posting their own ugly makeshift function ;)) – Julien Navarre Jun 04 '15 at 16:07
  • Other users will find a solution even easier if it's closed as a duplicate, since that links the two questions together. There's really no harm in doing it. – Frank Jun 04 '15 at 19:17

3 Answers

12

This seems like a much better fit for the format() function than for regular expressions. format() exists to format numbers: its big.mark argument sets the thousands separator, and scientific=FALSE prevents scientific notation.

format(c(1, 12, 123, 1234, 12345, 123456, 123456, 1234567), big.mark=" ", trim=TRUE)
# [1] "1"         "12"        "123"       "1 234"     "12 345"    "123 456"  
# [7] "123 456"   "1 234 567"
format(c(1, 10, 100, 1000, 10000, 100000, 1000000), big.mark=" ", scientific=FALSE, trim=TRUE)
# [1] "1"         "10"        "100"       "1 000"     "10 000"    "100 000"  
# [7] "1 000 000"
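If this formatting is needed in several places, the call above could be wrapped in a small helper. This is only a sketch: the name add_space_sep is hypothetical (it is not part of base R), and it simply forwards any extra arguments to format().

```r
# Hypothetical helper (the name is an assumption, not a base R function).
# It wraps base R's format() with the arguments used in the answer above.
add_space_sep <- function(x, ...) {
  format(x, big.mark = " ", scientific = FALSE, trim = TRUE, ...)
}

add_space_sep(c(1, 10, 100, 1000, 10000, 100000, 1000000))
# "1" "10" "100" "1 000" "10 000" "100 000" "1 000 000"

# Extra arguments are passed through to format(), e.g. nsmall to keep
# a minimum number of decimal places.
add_space_sep(1234.5, nsmall = 2)
```

Because trim=TRUE is set, the results are not padded to a common width, which is usually what you want when pasting the values into text.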
MrFlick
10
x <- 100000000
prettyNum(x, big.mark = " ", scientific = FALSE)
[1] "100 000 000"
user227710
7

I agree with the other answers that using other tools (such as format) is the best approach. But if you really want to use a regular expression and substitution, here is an approach that works using Perl-style lookahead.

> test <- c(1, 12, 123, 1234, 12345, 123456, 1234567, 12345678)
> 
> gsub('(\\d)(?=(\\d{3})+(\\D|$))', '\\1 ', 
+      as.character(test), perl=TRUE)
[1] "1"          "12"         "123"        "1 234"     
[5] "12 345"     "123 456"    "1 234 567"  "12 345 678"

Basically, it looks for a digit that is followed by one or more groups of three digits (which are in turn followed by a non-digit or the end of the string) and replaces that digit with itself plus a space. The lookahead part does not appear in the replacement because it is not part of the match; it is only a condition on the match.
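One caveat is that as.character() can return scientific notation for values such as 100000, which is what broke the function in the question. A minimal sketch that avoids this, assuming whole-number input, converts with format(scientific = FALSE) before applying the same pattern. The helper name add_space_regex is hypothetical.

```r
# Hypothetical helper (the name is an assumption, not a base R function).
# It applies the lookahead pattern above to plain-digit strings.
add_space_regex <- function(x) {
  # format() with scientific = FALSE yields plain digits such as "100000",
  # so the pattern never sees an exponent; trim = TRUE drops the padding.
  digits <- format(x, scientific = FALSE, trim = TRUE)
  # Insert a space after every digit that is followed by one or more
  # complete groups of three digits.
  gsub("(\\d)(?=(\\d{3})+(\\D|$))", "\\1 ", digits, perl = TRUE)
}

add_space_regex(c(1, 1000, 100000, 1234567))
# "1" "1 000" "100 000" "1 234 567"
```

This sketch is meant for whole numbers only: the pattern does not distinguish the fractional part, so digits after a decimal point would be grouped as well.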

Greg Snow