1

I have two columns one has 22 Names and another column with respective credit against each name. I need to know all the combination of 11 names which sums up to 100.

i am attaching image of data for example.

I searched and returned with itertool which gets you all the combination. But I was not sure of how this can be used to get all the combination.

import pandas as pd
import itertools
from itertools import combinations
from itertools import permutations
data=pd.read_excel(r'C:\Users\anup.kumar.dutta\Documents\Test\Test.xlsx',index=False)
def f(length,total_sum):
    if length == 1:
        yield (total_sum,)
    else:
        for value in range(total_sum + 1):
            for permutation in sum(length - 1,total_sum - value):
                yield (value,) + permutation

L = list(f(5,100))
print('total permutations:',len(L))

It should return all combination of names which sums up to credit 100.

Anup Dutta
  • 65
  • 6
  • 1
    [Don't post images](https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-on-so-when-asking-a-question) of your data, we cannot copy and paste it to answer your question. Try to add a small example dataframe which represents your actual data. See [this](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) thread on how to make a good pandas question in specific – Erfan May 12 '19 at 20:16

1 Answers1

1

Doing combinations calculations of this sort is not quick. I'm timing it using time.time.

starttime = time.time()

We will use combintations from itertools.

from itertools import combinations

First I will recreate your data (please include next time for responders so we don't have to recreate). I'm using letters for names. I will use pd.Series as this fits your data and this will be quickest next to using numpy directly.

s = pd.Series(np.random.randint(8,11,22), index = list('ABCDEFGHIJKLMNOPQRSTUV'))

Next set the combination parameters.

# set combination parameters
r =  11
n = len(s)

Create a list for tracking the names whose scores == 100.

name_list = []   

Finally, loop through the combinations storing all combinations summing to 100.

for p in combinations(s.index, r):
    if s.loc[list(p)].sum() == 100:
        name_list.append(p)

endtime = time.time()



print('Time to run', endtime - starttime)
Time to run 350.4899344444275

name_list (fist ten rows)

[('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'K', 'M'),
 ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'K', 'T'),
 ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'K', 'V'),
 ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'M', 'T'),
 ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'M', 'V'),
 ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'T', 'V'),
 ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'K', 'L', 'M'),
 ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'K', 'L', 'T'),
 ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'K', 'L', 'V'),
 ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'K', 'M', 'O'),...

How many combinations are there?

print(len(name_list))
86724
run-out
  • 3,114
  • 1
  • 9
  • 25