3

Given an n integer (n >= 1), and a number k, return all possible ways to write n as kth power different integers. For instance, if n = 100 and k = 2:

100 = 1**2 + 3**2 + 4**2 + 5**2 + 7**2
    = 6**2 + 8**2
    = 10**2

or if k = 3:

100 = 1**3 + 2**3 + 3**3 + 4**3

So program(100,2) returns something like [(2, [1, 3, 4, 5, 7]), (2, [6, 8]), (2, [10])], and program(100,3) [(3, [1, 2, 3, 4])].

Everything works fine, as long as the input n is small, or k is "big" (>=3). My approach was to first get a list of all integers, whose kth power is <= n.

def possible_powers(n,k):
    arr,i = [],1
    while i**k <= n:
        arr.append(i)
        i += 1
    return arr

Then (and here's the mistake), I created all subsets of this list (as lists):

def subsets(L):
    subsets = [[]]
    for i in L:
        subsets += [subset+[i] for subset in subsets]
    return subsets

And finally I cycled through all these subsets, raising each element to the kth power and adding them together, selecting only the ones that add up to n.

def kth_power_sum(arr,k):
    return sum([i**k for i in arr])

def solutions(n,k):
    return [(k,i) for i in subsets(possible_powers(n,k)) if kth_power_sum(i,k) == n]

I know the problem is with the subset-creation, but I have no clue how to optimize this. Like, if I try solutions(1000,2), it creates a large set, which takes up more than 4GB memory. My guess would be to sieve out a few subsets, but that wouldn't help much, unless I have a very efficient sieve.

Any help is greatly appreciated. If something isn't clear, or I made a mistake in posting this, please tell me.

President James K. Polk
  • 40,516
  • 21
  • 95
  • 125
  • 1
    I don't have time to dig into this right now, but have a look at this: https://stackoverflow.com/questions/1482308/how-to-get-all-subsets-of-a-set-powerset – Swifty May 05 '23 at 09:41
  • In [`itertools` documentation](https://docs.python.org/3/library/itertools.html#itertools-recipes) you have how to write the function `powerset` lazily evaluated: only one subset is stored in memory at a time. – Jorge Luis May 05 '23 at 09:47
  • Thanks for the suggestion @Swifty. I skimmed through that post, but I don't think my powerset function is bad (am I wrong?), but the fact that I create an entire powerset, which in case of subsets(1000,2) if a 2**31 long list (not very optimal). – Reggie Floarde May 05 '23 at 09:51
  • I think the way to go here is with some DP algorithm – Ohad Sharet May 05 '23 at 09:55
  • Take a look at this answer: [Finding all possible combinations of numbers to reach a given sum](https://stackoverflow.com/a/4633515) modify it to use powers seems pretty straightforward. – Jorge Luis May 05 '23 at 10:00
  • @JorgeLuis so you're saying that my program stores the entire powerset, messing up the memory, but the `itertools` powerset stores only one of the subsets and cycles through them? – Reggie Floarde May 05 '23 at 10:02
  • @ReggieFloarde Yes. Only stores one at a time. – Jorge Luis May 05 '23 at 10:03
  • @JorgeLuis Thank you. My only concern is the time now. Is the `itertools` version slower or faster, or is it the same as mine? – Reggie Floarde May 05 '23 at 10:08
  • @ReggieFloarde Quite faster. For a list of size 6 on my machine yours is almost 60% slower than `itertools`. Which will get worse in bigger lists. But is not fast enough for `solutions(1000, 2)`. – Jorge Luis May 05 '23 at 10:20

5 Answers5

5

If you implement this as a recursive generator, you won't need to store a large number of values (not even the results):

def powersum(n,k,b=1):
    bk = b**k
    while bk < n:
        for bases in powersum(n-bk,k,b+1):
            yield [b]+bases
        b += 1
        bk = b**k
    if bk == n :
        yield [b]
        
print(*powersum(100,2)) # [1, 3, 4, 5, 7] [6, 8] [10]
print(*powersum(100,3)) # [1, 2, 3, 4]
print(sum(1 for _ in powersum(1000,2))) # 1269 solutions
print(sum(1 for _ in powersum(2000,2))) # 27526 solutions (6 seconds)     

Note that this still has an exponential time complexity so it will be much slower for slightly larger values of n

print(sum(1 for _ in powersum(2200,2))) # 44930 solutions (12 seconds)
print(sum(1 for _ in powersum(2500,2))) # 91021 solutions (25 seconds)
print(sum(1 for _ in powersum(2800,2))) # 175625 solutions (55 seconds)
print(sum(1 for _ in powersum(3100,2))) # 325067 solutions (110 seconds)

[EDIT] For future reference, here is Kelly Bundy's cached version which runs considerably faster. Posting it here in case his demo link gets broken:

from functools import cache
from time import time

@cache                  
def powersum(n,k,b=1):
    bk = b**k
    res = []
    while bk < n:
        for bases in powersum(n-bk,k,b+1):
            res.append([b]+bases)
        b += 1
        bk = b**k
    if bk == n :
        res.append([b])
    return res

Note: although the cache will consume some memory, it will be nowhere near the size the full powerset of all possible combinations.

Alain T.
  • 40,517
  • 4
  • 31
  • 51
  • Nice answer. 2000,2 in 6 seconds is very impressive – Amit May 05 '23 at 14:54
  • You could speed it up by returning lists instead of iterators and decorating with `functools.cache`. – Kelly Bundy May 06 '23 at 15:29
  • I did try lru_cache with list outputs but, but it didn't seem to make a meaningful difference (either I didn't do it right or there isn't enough reuse of parameter patterns). In any case, I was aiming at minimal memory usage for this particular answer. – Alain T. May 06 '23 at 19:48
  • What limit did you use? With @cache, I get 2000,2 in 0.4 seconds and 3100,2 in 3 seconds. [Demo](https://ato.pxeger.com/run?1=VVFBbsMgEFSvvGKPJrajuL1UUZH6jyiqjAMyIl4QYEV9Sy-5pI_KawLYbuo9MTuzo9nl59d-h97g9Xobg6zf7y9aOjOAHLELxpw9qMEaF6Bru16QzAU1iKWd3oR8TuxJSLDmIpwfhwIrXXHW0D2BWFwDA77Z6Iyc8BEejhlcenUWSfABOIlTSeOAtz4KFf4zrblOvuXiu1R03LbWCjwVB34s8yT9U3AoGTRPuAqjZG4wQHh6rv3onDqMDhNFiA9t3J7l_QtKrFMYipSwycm_Vqnfmt2ueqW0muVQQ56n083n0y9f8AA). – Kelly Bundy May 06 '23 at 20:24
  • I'm using python 3.7 and forgot to specify a number in `@lru_cache()` so I didn't see an improvement and came to the wrong conclusion. Giving it a million (instead of the default 128) does provide the expected performance boost. – Alain T. May 08 '23 at 15:36
  • Ok :-). Or use `lru_cache(None)`. Some statistics: for 2000,2 the uncached function gets called 8,944,901 times. When cached, it gets called only 31,840 times, and `powersum.cache_info()` reports`CacheInfo(hits=182585, misses=31840, maxsize=None, currsize=31840)`. – Kelly Bundy May 08 '23 at 16:05
1

your problem is that you are keeping too many irrelevant combinations in subsets, I wrote here some code that saves only the relevant lists, so that should do the trick, (I wrote it here in multiple lines so it will be easier to understand )

def my_solutions(n,k):
    subsets = []
    kth_root = int(n**(1/k))# kth_root of n is the upper bound
    relevant_list = range(1,kth_root+1)
    for comb_size in range(1,kth_root+1):
        for comb in combinations(relevant_list, comb_size):
            sum_comb = sum([x**k for x in comb])
            if (sum_comb == n):
                subsets.append(list(comb))
    return subsets

here is the same code but written in one-liner to maximize efficiency

def my_solutions_one_linner(n,k):
    subsets = []
    kth_root = int(n**(1/k))# kth_root of n is the upprt bound
    relevant_list = range(1,kth_root+1)
    ans =  [[list(comb) for comb in combinations(relevant_list, comb_size) if sum([x**k for x in comb]) == n] for comb_size in range(1,kth_root+1)]
    return [lst for lst in ans if lst!=[]]
Ohad Sharet
  • 1,097
  • 9
  • 15
  • would you say that the optimal solution would be to use map , or do you have another idea ? edited: I just tried using map instead and got slightly worse results, so if you know a better way to solve this problem I would like to know :) – Ohad Sharet May 05 '23 at 10:38
  • thanks for the tip, that was a good one :) – Ohad Sharet May 05 '23 at 11:10
  • 1
    I just tried it and for some reason, it was also slightly slower (but I also thought it would be faster ), anyway, thank you for the tip , I am trying to write a more efficient python code and i really appreciate any help i can get :) – Ohad Sharet May 05 '23 at 11:26
  • I posted a faster code. – Amit May 05 '23 at 14:35
1

My solution is based on this answer to the question Finding all possible combinations of numbers to reach a given sum. The goal in the linked question is reaching a value target by adding elements in a list numbers. In our case, numbers will be the integers from 1 to target**(1/k) raised to k.

This is the function from the other answer. The only modification is that we know the numbers are in increasing order. This means we can stop the iteration earlier. If numbers[0] is too big, so will be the rest of the following powers.

def subset_sum(numbers: list[int], target: int, partial: list[int]=[], partial_sum: int=0):
    if partial_sum == target:
        yield partial
        return
    if numbers and (partial_sum + numbers[0]) > target:
        return
    for i, n in enumerate(numbers):
        remaining = numbers[i + 1:]
        yield from subset_sum(remaining, target, partial + [n], partial_sum + n)

Now lets call this function properly. The important point here is that subset_sum needs the powers, but our solution will use the numbers without rising them. To get so, let's define a dictionary where the keys are the powers and the values are the numbers that will be in our answer. After so, we just have to give subset_sum a list with our dictionary keys.

def solutions(n, k):
    kth_root = next(i for i in itertools.count() if i**k>n) - 1
    powers_dict = {i**k: i for i in range(1, kth_root+1)}
    powers = list(powers_dict.keys())
    return [[powers_dict[i] for i in answer] for answer in subset_sum(powers, n)]

On my machine this code took 9 seconds to complete solutions(2000, 2).

Jorge Luis
  • 813
  • 6
  • 21
1

@Reggie-florade and @Ohad-sharet, you can make the code run faster. Please see the below code. My solution is very much similar to @Ohad-sharet but runs significantly faster. My function name is "kth_power_and_n".

from itertools import combinations
import time
import math
import pandas as pd

def my_solutions_one_linner(n,k):
    subsets = []
    kth_root = int(n**(1/k))# kth_root of n is the upprt bound
    relevant_list = range(1,kth_root+1)
    ans =  [[list(comb) for comb in combinations(relevant_list, comb_size) if sum([x**k for x in comb]) == n] for comb_size in range(1,kth_root+1)]
    return [lst for lst in ans if lst!=[]]  

def kth_power_and_n(n, k):
    UpperLimit = int(math.floor(n**(1/k)))
    Matrix = list(map(lambda x: x**k,range(1,UpperLimit+1)))
    Final_Solution_List = []
    for i in range(1,len(Matrix)+1):
        Generated_Combinations = combinations(Matrix,i)
        for eachCombination in Generated_Combinations:
            if sum(eachCombination) == n:
                Final_Solution_List.append(sorted(list(map(lambda x: int(round(x**(1/k))), eachCombination))))
    return Final_Solution_List

Solution_Dict = {"n":[],"k":[],"time taken 1":[],"time taken 2":[],"Solution Found 1":[],"Solution Found 2":[]}

                
for n in [50,100,250,500]:
    for k in [2,3,4]:
        ## Previous Solution time taken
        Start = time.time()
        Solution = my_solutions_one_linner(n,k)
        Time_Taken = time.time()-Start
        Solution_Dict["n"].append(n)
        Solution_Dict["k"].append(k)
        Solution_Dict["time taken 1"].append(Time_Taken)        
        if Solution:
            Solution_Dict["Solution Found 1"].append(True)
        else:
            Solution_Dict["Solution Found 1"].append(False)

        ## Significantly faster code
        Start = time.time()
        Solution = kth_power_and_n(n,k)
        Time_Taken = time.time()-Start
        Solution_Dict["time taken 2"].append(Time_Taken)        
        if Solution:
            Solution_Dict["Solution Found 2"].append(True)
        else:
            Solution_Dict["Solution Found 2"].append(False)


DF_One_Liner = pd.DataFrame(data=Solution_Dict)
DF_One_Liner["Time Gain Ratio"] = DF_One_Liner["time taken 1"]/DF_One_Liner["time taken 2"]
print(DF_One_Liner)

Below is a time comparison of two methods.

      n  k  time taken 1  time taken 2  Solution Found 1  Solution Found 2  Time Gain Ratio
0    50  2      0.000170      0.000047              True              True         3.627551
1    50  3      0.000012      0.000007             False             False         1.758621
2    50  4      0.000007      0.000005             False             False         1.333333
3   100  2      0.001503      0.000185              True              True         8.142119
4   100  3      0.000019      0.000011              True              True         1.723404
5   100  4      0.000010      0.000006             False             False         1.833333
6   250  2      0.067290      0.006028              True              True        11.163476
7   250  3      0.000073      0.000018             False             False         4.135135
8   250  4      0.000010      0.000006             False             False         1.720000
9   500  2     11.933419      0.870651              True              True        13.706314
10  500  3      0.000185      0.000036             False             False         5.132450
11  500  4      0.000019      0.000009             False             False         2.135135
Amit
  • 2,018
  • 1
  • 8
  • 12
0

Ok, just for the fun I tried to write a recursive function to do this; here's the result (power_sum(2000, 2) answers in about 10s on my machine, so though it's probably not optimal, it's still not too bad).

from math import inf
def power_sums(total, p, max_allowed=inf):
    nmax = min(int(total**(1/p)), max_allowed)
    for i in range(nmax, 0, -1):
        if (ip := i**p) == total:
            yield [i]
        else:
            for x in power_sums(total - ip, p, i-1):
                yield [i] + x
Swifty
  • 2,630
  • 2
  • 3
  • 21