I am having trouble finding the frequency of the most frequently repeated digit within a number. I want output like the following:

Number         Output
1111125436     5
9999266613     4
2346275210     3
1234567890     1

And so on.

I have tried frequency and Biostrings, but couldn't get it to work. I'd appreciate your help.

  • @thelatemail Thanks for noticing that. – Sourav Sarkar Feb 18 '16 at 05:23
  • This problem can be done in O(n) time using Moore's Linear Time Majority Vote Algorithm http://www.cs.utexas.edu/~moore/best-ideas/mjrty/. Python implementation here: http://stackoverflow.com/questions/27652492/python-find-majority-number-in-on-time-and-o1-memory – kilojoules Feb 18 '16 at 05:47
  • @kilojoules R question. –  Feb 18 '16 at 06:20

3 Answers

There could well be better ways than this, but splitting the number as a character string and table-ing seems like a possibility:

vapply(strsplit(as.character(dat$Number),""), function(x) max(table(x)), FUN.VALUE=1L)
#[1] 5 4 3 1
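To make the one-liner easier to follow, here is a minimal self-contained sketch that breaks it into steps. The data frame name `dat` is assumed from the call above (the answer does not define it), and the values are the four numbers from the question.

```r
# Assumed input: the four numbers from the question, stored as `dat`
dat <- data.frame(Number = c(1111125436, 9999266613, 2346275210, 1234567890))

# Step 1: convert each number to a string and split it into single digits
digits <- strsplit(as.character(dat$Number), "")

# Step 2: tabulate the digits of each number and keep the largest count
dat$Output <- vapply(digits, function(x) max(table(x)), FUN.VALUE = 1L)
dat$Output
```

One caveat: as.character() can switch to scientific notation for some numeric values (for example, 1e5 becomes "1e+05"), so storing the numbers as character strings in the first place is probably the safer choice.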
thelatemail

A possible base R solution:

df <- data.frame(Number = c(1111125436, 9999266613, 2346275210, 1234567890))
df$Output <- sapply(df$Number, function(x) tail(sort(table(strsplit(as.character(x), ''))), 1))
df
#       Number Output
# 1 1111125436      5
# 2 9999266613      4
# 3 2346275210      3
# 4 1234567890      1
daroczig
  • `max(...)` is shorter to write and probably runs faster than `tail(sort(...),1)` – Adam Hoelscher Feb 18 '16 at 05:43
  • @Adam yeah, I have seen (and upvoted) the other answer using `max`, which will probably get more upvotes than mine -- but I'm leaving this alternative solution here as is. – daroczig Feb 18 '16 at 17:52

Here is another option with stri_count_fixed and pmax:

library(stringi)
do.call(pmax,lapply(0:9, stri_count_fixed, str=df1$Number))
#[1] 5 4 3 1

Or with rowMaxs and stri_count_fixed:

library(matrixStats)
rowMaxs(sapply(0:9, stri_count_fixed, str=df1$Number))
#[1] 5 4 3 1
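The idea behind both variants is to count each digit 0-9 separately across all numbers, then take the per-number maximum of those ten counts. As a rough illustration of that logic without extra packages, here is a base R sketch. It is a stand-in for the stringi calls, not the same implementation, and `df1` is assumed to hold the numbers from the question since the answer does not define it.

```r
# Assumed input, matching the question; `df1` is not defined in the answer
df1 <- data.frame(Number = c(1111125436, 9999266613, 2346275210, 1234567890))
s <- as.character(df1$Number)

# For each digit 0-9, count its occurrences in every string:
# deleting the digit and measuring the drop in length gives its count
counts <- lapply(0:9, function(d) {
  nchar(s) - nchar(gsub(as.character(d), "", s, fixed = TRUE))
})

# Element-wise maximum across the ten count vectors
res <- do.call(pmax, counts)
res
```

Here `counts` plays the role of the list returned by `lapply(0:9, stri_count_fixed, ...)`, and `do.call(pmax, ...)` is used exactly as in the first variant.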
akrun