28

Suppose we have the number 1.000633. I want to count the zeros after the decimal point up to the first nonzero digit of the fractional part, so the answer should be 3. For 0.002 the answer should be 2.

I have not found an R function that does this. I looked at the Ndec function in the DescTools package, but it does not do the job.

zx8754
Annie

7 Answers

25

Using regexpr and the match.length attribute of its result:

attr(regexpr("(?<=\\.)0+", x, perl = TRUE), "match.length")
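As a rough illustration of what this call returns, here is a minimal sketch. The input vector is made up for the example and is given as strings, so that no number-to-text conversion is involved.

```r
# Hypothetical sample inputs (strings, not numbers)
x <- c("1.000633", "0.002", "10.2")

# regexpr() finds the first match in each string; the length of that match
# is stored in the "match.length" attribute (-1 where nothing matched)
m <- regexpr("(?<=\\.)0+", x, perl = TRUE)
zeros <- attr(m, "match.length")
print(zeros)
# [1]  3  2 -1
```

The lookbehind `(?<=\\.)` requires the run of zeros to start directly after the decimal point, so zeros elsewhere in the string are not counted.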
David Arenburg
  • @DavidArenburg Yes, I didn't think about the possible options where it went wrong. Thanks for the feedback. – akrun Feb 22 '16 at 15:57
  • @Annie see edit, I used `regexpr` instead of `gregexpr` in order to avoid `sapply`. Now it is fully vectorized and much faster. – David Arenburg Feb 22 '16 at 16:16
  • 1
    For `x <- 10.2` this returns -1 instead of 0. I had to insert an `ifelse` statement in my solution to capture a case where it would fail without it. This might be a reason why you consider my implementation complicated. On the other hand, maybe you could consider capturing such cases, too, so that your solution also works for any number. – RHertel Feb 22 '16 at 17:59
  • 1
    @RHertel it always returns `-1` for no match. That is the `regexpr` notation for no match. My solution works for *any* number. – David Arenburg Feb 22 '16 at 18:05
  • 2
    Alright, I understand that - just as I understood why my original post needed a correction. I'm just not sure if this corresponds to the OP's requested output ("..count number of zeros after the decimal point until first nonzero digit..."). A negative count of zeroes does not seem to make much sense to me. – RHertel Feb 22 '16 at 18:06
  • 2
    @RHertel this can be easily fixed in a vectorized way if the OP desires, but in this case -1 or 0 seem equally fine to me for no match. – David Arenburg Feb 22 '16 at 18:12
  • Fair enough. After all, the OP decides. – RHertel Feb 22 '16 at 18:13
  • 2
    @RHertel Simply using `(?<=\\.)0+|$` as the regex should do if you want to get 0 instead of -1. – maaartinus Feb 22 '16 at 22:41
  • `attr(regexpr("(?<=\\.)0+", 0.0001, perl = TRUE), "match.length")` returns `-1` instead of `3`. That's not supposed to happen, is it? – LuckyPal Dec 01 '21 at 17:44
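The two issues raised in the comments (-1 for no match, and -1 for `0.0001`, which R converts to the string `"1e-04"`) can probably both be worked around. The following is only a sketch: `count_leading_zeros` is a hypothetical helper name, and using `sprintf` with 15 decimals is an assumption of this example, not part of the original answer. The `|$` alternative is the one suggested by maaartinus above.

```r
# count_leading_zeros is a hypothetical helper, not from any package
count_leading_zeros <- function(x, digits = 15) {
  # Force fixed notation, so 0.0001 becomes "0.000100000000000" instead of "1e-04"
  s <- sprintf(paste0("%.", digits, "f"), x)
  # "|$" adds an empty match at the end of the string,
  # which yields a match length of 0 instead of -1
  attr(regexpr("(?<=\\.)0+|$", s, perl = TRUE), "match.length")
}

count_leading_zeros(c(1.000633, 0.002, 0.0001, 10.25))
# [1] 3 2 3 0
```

Note that whole numbers such as 5 are printed with all-zero decimals by `sprintf`, so this sketch would report `digits` zeros for them; they would need separate handling.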
16

Here's another possibility:

zeros_after_period <- function(x) {
  if (isTRUE(all.equal(round(x), x))) return(0) # y would be -Inf for integer values
  y <- log10(abs(x) - floor(abs(x)))
  ifelse(isTRUE(all.equal(round(y), y)), -y - 1, -ceiling(y)) # corrects case ending with ..01
}

Example:

x <- c(1.000633, 0.002, -10.01, 7.00010001, 62.01)
sapply(x,zeros_after_period)
#[1] 3 2 1 3 1
RHertel
  • @zx8754 better now..? – RHertel Feb 22 '16 at 12:57
  • 2
    I liked this solution even with 0.001 issue. – zx8754 Feb 22 '16 at 12:59
  • I think you forgot to vectorize it, as it now works only on a length-one vector... Maybe this should be `ifelse(round(y) == y, -y-1, -ceiling(y))`? – David Arenburg Feb 22 '16 at 16:26
  • Not columns, just several values, such as `x <- c(0.1, 1.0, 1.001)` – David Arenburg Feb 22 '16 at 17:24
  • I wonder why I have two comments below my answer with the content "it doesn't work". As matter of fact, it does work. – RHertel Feb 22 '16 at 17:24
  • Not sure why you are being so aggressive. My second comment was about my previous comment. I was trying to help you vectorize it. I am not sure what the point is of writing such a complicated implementation if it is only meant for one value. – David Arenburg Feb 22 '16 at 17:28
  • It is easy to misinterpret things in writing. If you read carefully what I posted, I don't think you'll find something aggressive. And if you do, it was not my intention. Now, emotions aside, concerning your criticism that this answer is complicated: it may be in the eye of the beholder whether a mathematical expression is more complicated than some regex formulas. I prefer maths, especially when the object under consideration is a number. It is a function that spans two lines. Complicated? I don't think so. – RHertel Feb 22 '16 at 17:32
  • `y = log10(abs(x) %% 1)` also seems to work. To make it vectorized, `y = -log10(abs(x) %% 1); ceiling(y) - ( (y %% 1) < 10^-options()$digits )` or using some other threshold, I guess. There is probably still an edge case or two – Frank Feb 22 '16 at 18:57
  • http://stackoverflow.com/questions/35553244/count-leading-zeros-between-the-decimal-point-and-first-nonzero-digit/35553571#comment58831988_35559346 – Roland Feb 23 '16 at 08:39
  • @Roland Thanks for the link to your comment. That answer certainly deserves an upvote. – RHertel Feb 23 '16 at 08:48
  • You can increase the number of detectable digits in this solution by changing the default value of the `tolerance` argument of all.equal() to, for example, .Machine$double.eps – Scott Kaiser Mar 23 '20 at 22:37
9

We can use sub

ifelse(grepl("\\.0", str1), 
    nchar(sub("[^\\.]+\\.(0+)[^0]+.*", "\\1", str1)), NA)
#[1] 3 2 3 3 2

Or using stringi

library(stringi)
r1 <- stri_extract(str1, regex="(?<=\\.)0+")
ifelse(is.na(r1), NA, nchar(r1))
#[1] 3 2 3 3 2

Just to check if it works with any strange cases

str2 <- "0.00A-Z"
nchar(sub("[^\\.]+\\.(0+)[^0]+.*", "\\1", str2))
#[1] 2

data

str1 <- as.character(c(1.000633, 0.002, 0.000633,
                                  10.000633, 3.0069006))
akrun
  • @Annie Can you please check it again. Based on the example I posted, it is not counting – akrun Feb 22 '16 at 12:22
  • 2
    Thanks again, try with str1 <- as.character(10.000633). – Annie Feb 22 '16 at 12:25
  • BTW, `library(stringi); nchar(stri_extract(str1, regex="(?<=\\.)0+"))` works like magic :). Thanks much. – Annie Feb 22 '16 at 12:26
  • 1
    You probably need to edit your first solution as it is wrong. – David Arenburg Feb 22 '16 at 12:31
  • @DavidArenburg Yes, I forgot about `+` – akrun Feb 22 '16 at 12:32
  • @DavidArenburg I am getting 3 as the result – akrun Feb 22 '16 at 12:33
  • 2
    @akrun there could be any number there, and this should work for all numbers. Almost everyone has a comment under their answer with possible issues, not just you. See [here](http://stackoverflow.com/questions/35553244/count-leading-zeros-between-the-decimal-point-and-first-nonzero-digit/35553571#comment58794809_35553373) and [here](http://stackoverflow.com/questions/35553244/count-leading-zeros-between-the-decimal-point-and-first-nonzero-digit/35553571#comment58795433_35553441) for instance – David Arenburg Feb 22 '16 at 12:38
  • 8
    What do you mean with: *"Okay, Jaap is also online"*? – Jaap Feb 22 '16 at 12:38
  • 2
    just allow for digits other than 0 in the rest of the number, like `"[^\\.]+\\.(0+)[^0]{1}.*"`, and it'll be fine (though I still prefer the `numeric` approach of RHertel). It's a matter of an accurate solution, not of upvotes – Cath Feb 22 '16 at 12:38
  • @Jaap It means `you are online`. Did I say anything bad? – akrun Feb 22 '16 at 12:41
  • No, I just found it to be an odd statement in a comment addressing @Cath – Jaap Feb 22 '16 at 12:48
  • You should do the same with the `stringi` solution as it also counts `NA`s as two characters... Sorry for being annoying :) – David Arenburg Feb 22 '16 at 16:51
  • @DavidArenburg In your solution, it returns `-1` – akrun Feb 22 '16 at 16:55
  • Yes, when there is no match it gives -1 – David Arenburg Feb 22 '16 at 16:57
7

Using rle function:

#test values
x <- c(0.000633,0.003,0.1,0.001,0.00633044,10.25,111.00012,-0.02)

#result
sapply(x, function(i){
  myNum <- unlist(strsplit(as.character(i), ".", fixed = TRUE))[2]
  myNumRle <- rle(unlist(strsplit(myNum, "")))
  if(myNumRle$values[1] == 0) myNumRle$lengths[1] else 0
})

#output
# [1] 3 2 0 2 2 0 3 1
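To see what `rle` contributes here, a small sketch on a single fractional part may help. The string `"000633"` is an illustrative stand-in for the digits after the decimal point of 0.000633.

```r
# "000633" stands in for the fractional part of 0.000633 (illustrative input)
digits <- strsplit("000633", "")[[1]]

# rle() collapses consecutive equal values into runs
runs <- rle(digits)
runs$values   # "0" "6" "3"  (the value of each run, in order)
runs$lengths  # 3 1 2        (how long each run is)

# Leading zeros: length of the first run if it consists of "0", otherwise 0
leading_zeros <- if (runs$values[1] == "0") runs$lengths[1] else 0L
print(leading_zeros)
# [1] 3
```

Only the first run matters, which is why the answer checks `myNumRle$values[1]` and ignores later runs of zeros.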
zx8754
7

Another way, using str_count from the stringr package:

 library(stringr)
 x <- as.character(1.000633)
 str_count(gsub(".*[.]","",x), "0")
 #[1] 3

EDIT: This counts the zeros after the decimal point, up to the first nonzero digit:

y <- c(1.00215, 1.010001, 50.000809058, 0.1)
str_count(gsub(".*[.]","",gsub("(?:(0+))[1-9].*","\\1",as.character(y))),"0")
#[1] 2 1 3 0
Sotos
7
floor( -log10( eps + abs(x) - floor( abs( x ) ) ) )
Toby Speight
  • 4
    Welcome to Stack Overflow, and thanks for answering this question. Because code with no comments tends not to be very educational, we'd like you to add some explanation of how this answers the question. Thanks! – Toby Speight Feb 22 '16 at 17:06
  • 3
    Yes, this is the best solution here. However, you should account for integer log values like this: `count0 <- function(x, tol = .Machine$double.eps ^ 0.5) { x <- abs(x); y <- -log10(x - floor(x)); floor(y) - (y %% 1 < tol) }` – Roland Feb 23 '16 at 08:35
  • great answer, I love how it's the more mathematical approach and not all grep, thanks! – Antoni Oct 02 '22 at 16:19
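For readability, Roland's one-liner from the comment above can be written out as follows. The code is his; the formatting, comments and test values are added here as a sketch.

```r
# Roland's count0 from the comment above, reformatted and commented
count0 <- function(x, tol = .Machine$double.eps^0.5) {
  x <- abs(x)
  y <- -log10(x - floor(x))  # minus log10 of the fractional part
  # floor(y) is the zero count, except when the fraction is a power of ten
  # (0.1, 0.01, ...): there y is a whole number and 1 has to be subtracted
  floor(y) - (y %% 1 < tol)
}

count0(c(1.000633, 0.002, 0.01, 10.25))
# [1] 3 2 1 0
```

The function is vectorized because it uses only arithmetic and comparison operators. Whole numbers have a fractional part of 0, whose logarithm is not finite, so they are not handled by this sketch and would need a separate check.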
0

You can use sub here, since only a single replacement is needed; there is no need for gsub:

 nchar(sub(".*\\.(0*).*","\\1",str1))
 #[1] 3 2 3 3 2

where

str1 <- as.character(c(1.000633, 0.002, 0.000633,
                   10.000633, 3.0069006))
Onyambu