5

For a given number N find total possible ordered pair (x,y) such that x and y are less than or equal to n and sum of digits of x is less than sum of digits of y

for example n=6: there are 21 possible ordered pair which are [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 3), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6), (4, 5), (4, 6), (5, 6)]

here x is always less than y and sum of digits of x is also less than sum of digits of y and both x and y are equal or less than N.Here is my naive approach but this is pretty slow and works fine till N=10000 after that it performs badly.

from itertools import permutations
n=100
lis=list(range(n+1))
y=list(i for i in permutations(lis,2) if i[0]<i[1] and sum(list(map(int,
(list(str(i[0]))))))<sum(list(map(int,(list(str(i[1])))))))
print(len(y))

One using generators

from itertools import permutations
for _ in range(int(input())):
    n=1000
    lis=range(n+1)
    y=(i for i in permutations(lis,2) if i[0]<i[1] and sum(list(map(int,
     (list(str(i[0]))))))<sum(list(map(int,(list(str(i[1])))))))
    print (sum(1 for _ in y))

A better improved version:

from itertools import permutations
for _ in range(int(input())):
    n=1000
    lis=range(n+1)
    y=(i for i in permutations(lis,2) if i[0]<i[1] and sum(map(int,(str(i[0]))))<sum(map(int,(list(str(i[1]))))))
    print (sum(1 for _ in y))

Is there a better approach to tackle this problem?

Om Sharma
  • 335
  • 1
  • 11

3 Answers3

2

How the code works

This is almost exclusively algorithmic improvements over your method. It may be faster to use generators or list comprehensions, but you'd have to profile it to check. The algorithm works as follows:

  1. Precompute the digit sums of 1 - N.
  2. Group the numbers 1 - N by their digit sum. We have an object that looks like this. Thus if we want to get numbers with digit sum >2, we only need to count the numbers after the third row.

1: 1, 10
2: 2, 11, 20
3: 3, 12, 21, 30 ...

  1. Observe that the numbers within each row are in sorted order. If our number is 12, we only need to look at numbers after 12. We can find the 12s in each row with a binary search.

Overall, this is a ~20x improvement over your algorithm, with O(N) memory cost

The code

import time
import bisect
import itertools

N = 6

def sum_digits(n):
    # stolen from here: https://stackoverflow.com/questions/14939953/sum-the-digits-of-a-number-python
    # there may be a faster way of doing this based on the fact that you're doing this over 1 .. N
    r = 0
    while n:
        r, n = r + n % 10, n // 10
    return r        

t = time.time()
# trick 1: precompute all of the digit sums. This cuts the time to ~0.3s on N = 1000
digit_sums = [sum_digits(i) for i in range(N+1)]
digit_sum_map = {}

# trick 2: group the numbers by the digit sum, so we can iterate over all the numbers with a given digit sum very quickly
for i, key in enumerate(digit_sums):
    try:
        digit_sum_map[key].append(i)
    except KeyError:
        digit_sum_map[key] = [i]
max_digit_sum = max(digit_sum_map.keys())

# trick 3: note that we insert elements into the digit_sum_map in order. thus we can binary search within the map to find
# where to start counting from. 
result = []
for i in range(N):
    for ds in range(digit_sums[i] + 1, max_digit_sum + 1):
        result.extend(zip(itertools.repeat(i), digit_sum_map[ds][bisect.bisect_left(digit_sum_map[ds], i):]))

print('took {} s, answer is {} for N = {}'.format(time.time() - t, len(result), N))
# took 0.0 s, answer is 21 for N = 6
# took 0.11658287048339844 s, answer is 348658 for N = 1000
# took 8.137377977371216 s, answer is 33289081 for N = 10000

# for reference, your last method takes 2.45 s on N = 1000 on my machine
c2huc2hu
  • 2,447
  • 17
  • 26
0
def find_total_possible(n):
    x = [i for i in range(n + 1)]
    y = [i + 1 for i in range(n + 1)
    z = list(zip(x,y))
    return z

Is this homework?

It smells like homework.

Quentin
  • 700
  • 4
  • 10
  • and your script produces wrong output and at the same time like i said this will not work for huge N but i appreciate your effort :) – Om Sharma Nov 12 '17 at 02:08
  • Hm... A database table may be the best approach if you want speed. Also, could you add more details in terms of accuracy? x < y & sum(x) < sum(y) &Xn < Yn < N What is missing? – Quentin Nov 12 '17 at 02:14
  • what you are doing is zipping for a particular value of x and y say (1,2) but the solution can range from (1,2) to (1,n) excluding (1,10) because there sum is same and this will continue and more exclusions will follow. – Om Sharma Nov 12 '17 at 02:20
0

One problem is that you're still generating all permutations and then removing the entries where x is greater than or equal to y. The other issue is you're recalculating the sum of the digits of y for every iteration when this could be stored and compared later. There might be a more elegant solution where you can essentially break out of the nested loop if you know all future x entries will fail the criteria.

from itertools import permutations
from time import time

def time_profile(f):

    def time_func(*args, **kwargs):
        start_time = time()
        r = f(*args, **kwargs)
        end_time = time()
        print "{} Time: {}".format(f, end_time - start_time)
        return r

    return time_func

@time_profile
def calc1(n):
    lis=list(range(n+1))
    y=list(i for i in permutations(lis,2) if i[0]<i[1] and sum(list(map(int,
    (list(str(i[0]))))))<sum(list(map(int,(list(str(i[1])))))))
    return y

@time_profile
def calc2(n):
    l = []
    for y in xrange(n+1):
        y_sum = sum(map(int, str(y)))
        for x in xrange(y):
            # May be possible to use x_digits to break
            x_digits = map(int, str(x))
            if sum(x_digits) >= y_sum: continue
            l.append((x, y))

    return l

if __name__ == '__main__':
    N = 10000
    if len(calc1(N)) != len(calc2(N)): print 'fail'

< function calc1 at 0xfff25cdc > Time: 233.378999949

< function calc2 at 0xfff2c09c > Time: 84.9670000076

Some other points that are not related to the question. Some of your calls to list are redundant. The map function already returns a list. In Python 3, range returns a generator which returns a value as you iterate through it. It's more memory efficient and will work the same.