6

So I'm trying to write a python function that takes in two arguments, n and num, and counts the occurrences of 'n' between 0 and num. For example,

countOccurrences(15,5) should be 2.

countOccurrences(100,5) should be 20.

I made a simple iterative solution to this problem:

def countOccurrences(num,n):
  count=0
  for x in range(0,num+1):
    count += countHelper(str(x),n)
  return count

def countHelper(number,n):
  count=0
  for digit in number:
    if digit==n:
      count += 1
  return count

This ran into obvious problems if I tried to call countOccurrences(100000000000,5). What my question is is how can I make this more efficient? I want to be able to handle the problem "fairly" fast, and avoid out of memory errors. Here is my first pass at a recursive solution trying to do this:

def countOccurence(num, n):
  if num[0]==n:
    return 1
  else:
    if len(num) > 1:
      return countOccurence(num[1:],n) + countOccurence(str((int(num)-1)),n)
    else:
      return 0
Cœur
  • 37,241
  • 25
  • 195
  • 267
catagon87
  • 548
  • 5
  • 21
  • If this is Python 2.x, use `xrange`. Recursion is just going to mean you hit the system recursion limit. – jonrsharpe Jun 29 '15 at 15:44
  • 1
    I believe the proper solution is probably much cleverer than this. I imagine this can be done in O(log n) if not O(1). – Kevin Jun 29 '15 at 15:50
  • 1
    I agree with other Kevin. I recall seeing this question on a programming challenge site (Project Euler???), and the solution was recursive and logarithmic. – Kevin Jun 29 '15 at 15:55
  • possible duplicate of [How to count each digit in a range of integers?](http://stackoverflow.com/questions/2059680/how-to-count-each-digit-in-a-range-of-integers) – Pham Trung Jun 29 '15 at 16:13
  • 2
    You can use a solution very similar to this to get O(log n): http://stackoverflow.com/questions/22394257/how-to-count-integers-between-large-a-and-b-with-a-certain-property/22394258#22394258 – Niklas B. Jun 29 '15 at 16:14
  • is this the inverse question http://stackoverflow.com/questions/25751327/given-a-stock-of-integers-0-9-what-is-the-last-number-i-can-write-before-i-run ? – גלעד ברקן Jun 29 '15 at 21:57

3 Answers3

2

This won't hit into any memory problems, until max_num is small enough to fit in a C long. Basically it's still a brute-force algorithm, though significantly optimized for Python.

def count_digit(max_num, digit):
    str_digit = str(digit)
    return sum(str(num).count(str_digit) for num in xrange(max_num+1))
Eli Korvigo
  • 10,265
  • 6
  • 47
  • 73
0

I have fixed my solution, and hopefully it fits your specifications. Let's go through the first helper function:

def splitNumber(num):
    temp = str(number)
    nums = list(temp)
    return nums

This function creates a string listing of all the individual numbers in the number input. For example,

splitNumber(100)

would return:

['1', '0', '0']

From here, you call the main function and test each individual number with this main function:

def countOccurences(num, n):
    count = 0
    for x in range(0, (num + 1)):
        temp = splitNumber(x)
        for x in range(len(temp)):
            if (temp[x] == str(n)):
                count = count + 1
    return count

Which should give the desired output. Let me know if this works for you!

  • 1
    The memory problem is still there in your solution. Above that, you don't get rid of extra Python function calls and `for` loops. In other words, your solution is as inefficient as the OP's one. – Eli Korvigo Jun 29 '15 at 21:05
  • This is just a slower version of the solution above which itself is slow – arjunsiva Nov 16 '21 at 12:14
0

See: https://math.stackexchange.com/a/1229292/974150

in python:

def counts_of_i_bf(n, i):
    """Counts the number of occurences in a range [0 .. n] of
        the digit i [0...9]

    Args:
        n ([int]): upper value of range [0 ... n]
        i ([type]): digit looking for [0.. 9]

    Returns:
        [int]: occurences of i in the range [0...n]
    """
    return sum(str(d).count(str(i)) for d in range(n + 1))

def counts_of_i_dp(n, i):
    """Counts the number of occurences in a range [1 .. n] of
        the digit i [1...9] by implementing the recurrence
        relation:
                        | ak.10^(k-1) + fi(b)           if a < i
        fi(a.10^k +b) = | ak.10^(k-1) + 1 + fi(b) + b   if a == i
                        | (ak + 10).10^(k-1) + fi(b)    if a > i

    see: https://math.stackexchange.com/a/1229292/974150
    Args:
        n ([int]): upper value of range [1 ... n]
        i ([type]): digit looking for [1.. 9]

    Returns:
        [int]: occurences of i in the range [0...n]
    """
    som = 0
    while n > 0:
        k = int(log10(n))
        a = n // 10**k
        b = n - a * 10**k
        if a < i:
            som += a*k*10**(k-1)
        elif a == i:
            som += a*k*10**(k-1) + 1 + b
        else:
            som += (a*k + 10)*10**(k-1)
        n = b
        
    return int(som)

def counts_of_0(n):
    """Counts the number of occurences in a range [1 .. n] of
        the digit0 by implementing:
        Tn = (k + 1)*(b + 1 + (a - 1)10^k) + ∑ 9*s*10(s - 1) for s=1.. k\
        f0(n) = Tn -∑ 9s.10^(s-1) for s=1..9
  
        see: https://math.stackexchange.com/a/1229292/974150
    Args:
        n ([int]): upper value of range [1 ... n]

    Returns:
        [int]: occurences of 0 in the range [1...n]
    """
    k = int(log10(n))
    a = n // 10**k
    b = n - a * 10**k
    Tn = (k + 1)*(b + 1 + (a - 1)*10**k) + sum(9*s*10**(s - 1) for s in range(1, k + 1))
    return Tn - sum(counts_of_i_dp(n, i) for i in range(1, 10)) + 1 # was one of


def counts_of_i(n, i):
    """Counts the number of occurences in a range [0 .. n] of
        the digit i [0...9]
        

    Args:
        n ([int]): upper value of range [0 ... n]
        i ([type]): digit looking for [0.. 9]

    Returns:
        [int]: occurences of i in the range [0...n]
    """
    if n == 0: return 1 if i == 0 else 0
    if i == 0: return counts_of_0(n)
    return counts_of_i_dp(n, i)

assert all(counts_of_i_bf(i, d) == counts_of_i(i, d) for i in range(1_001) for d in range(10))
  • 2
    This looks quite promising, but some explanation of the algorithm would be useful. Also, `counts_of_i_dp(10,0)` returns a value of 11, which can't be right. – r3mainer Sep 29 '21 at 09:16