I am not sure exactly what input you want, because it is unclear how the lists you posted fit the expected input of an Apriori algorithm.
The input should be a list of transactions, an item from those transactions, and a number representing how many transactions must contain another item together with the specified item.
The output is the list of items that have been sold together with the specified item the wanted number of times. For example, given the transactions below, the item 'eggs' and the count 2, the output is {'bacon'}, because bacon is the only item that appears together with eggs in two transactions.
There are a couple of libraries for this kind of problem. The user null already pointed out a good one: https://github.com/tommyod/Efficient-Apriori. There is also Apyori: https://github.com/ymoch/apyori.
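If you just want association rules, efficient-apriori is probably the easier route. Here is a minimal sketch of how it is typically called, based on the project's README; check the README of your installed version, since the exact keyword arguments are an assumption here:
from efficient_apriori import apriori  # pip install efficient-apriori

transactions = [
    ('eggs', 'bacon', 'soup'),
    ('eggs', 'bacon', 'apple'),
    ('soup', 'bacon', 'banana'),
]
# keep itemsets that appear in at least half of the transactions and
# rules that hold in every transaction containing their left-hand side
itemsets, rules = apriori(transactions, min_support=0.5, min_confidence=1)
print(rules)  # e.g. [{eggs} -> {bacon}, {soup} -> {bacon}]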
Here is a simple attempt at an Apriori-style solution. It can be copied to a file and executed with Python:
# list of transactions
sales = [
    ('eggs', 'bacon', 'soup'),
    ('eggs', 'bacon', 'apple'),
    ('soup', 'bacon', 'banana'),
]
# generate match dictionary of type {item: {count: {item, ...}, ...}, ...},
# i.e. for every item i, group every other item by the number of
# transactions that contain both of them
matches = {
    i: {
        sum((i in z and j in z) for z in sales): set(
            k for t in sales for k in t
            if i != k and
            sum((i in z and j in z) for z in sales) == sum((i in z and k in z) for z in sales)
        )
        for t in sales for j in t if i in t and j != i
    }
    for t in sales for i in t
}
#print("match counts: %s\n" % matches)
# the best matches for an item are the ones with the highest co-occurrence
# count, so the inner dictionary is looked up with max() over its keys
print("best match(es) for eggs:", matches['eggs'][max(matches['eggs'])])
# output: {'bacon'}
print("best match(es) for bacon:", matches['bacon'][max(matches['bacon'])])
# output: {'eggs', 'soup'}
basket = ('soup', 'apple', 'banana')  # consumer basket
# calculate the best matches for a new sale: the union of the best matches
# of every basket item, minus the items already in the basket
best = set(sum([list(matches[i][max(matches[i])]) for i in basket], [])) - set(basket)
print("basket: %s, best matches: %s" % (basket, best))
# output: {'bacon', 'eggs'}
The above code generates, for every item, a dictionary that groups the other items by the number of transactions containing both items. Building this dictionary might be slow for huge transaction lists, but you don't have to recalculate it for every new transaction. Instead, I would recalculate the match counts periodically, e.g. once per day.
The item names can be replaced with item indices to handle larger item datasets. In this example, the strings are clearer than plain numbers.
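If you do switch to indices, here is a minimal sketch of the translation, reusing the sales list from above (the variable names are made up for illustration):
# assign a stable integer index to every distinct item name
items = sorted({k for t in sales for k in t})
index = {name: i for i, name in enumerate(items)}
# translate the transactions into tuples of indices
indexed_sales = [tuple(index[name] for name in t) for t in sales]
print(indexed_sales)
# output: [(3, 1, 4), (3, 1, 0), (4, 1, 2)]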
In general, turning slow functions into nested dictionaries of precalculated results is a good way to speed up code. A slow function of the form:
result = function(parameter, parameter, ...)
can be turned into a nested dictionary plus a small piece of code that recalculates the dictionary after a longer period of time:
if time >= refresh:
    dictionary = precalc()
    refresh = time + rate
...
result = dictionary[parameter][parameter][...]
This solution requires more memory, of course.
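Here is a runnable sketch of that pattern in Python; the names precalc and lookup, the toy calculation and the daily rate are only illustrative assumptions:
import time

REFRESH_RATE = 24 * 60 * 60  # recalculate once per day
refresh = 0.0                # next time the cache has to be rebuilt
dictionary = {}

def precalc():
    # placeholder for the expensive calculation, e.g. building `matches` above
    return {a: {b: a * b for b in range(100)} for a in range(100)}

def lookup(a, b):
    global refresh, dictionary
    if time.time() >= refresh:  # the period expired: rebuild the cache
        dictionary = precalc()
        refresh = time.time() + REFRESH_RATE
    return dictionary[a][b]

print(lookup(6, 7))  # output: 42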
In order to get solid answers, you should not downvote posts but instead provide a bigger chunk of code that can be copied to a file and executed. You should also provide clear input values for your function. What is Lk and what is k?
Based on your question, I assumed the following program, which does not produce the error you posted:
import itertools as it

def generateItemsets(Lk, k):
    # flatten the keys of Lk into one tuple and build all k-item combinations
    comb = sum(Lk.keys(), tuple())
    Ck = set(it.combinations(comb, k))
    return Ck

# the input of an Apriori algorithm should be a list of transactions,
# so it is unclear how this dictionary is meant to fit in
Lk = {(150,): 2, (160,): 3, (170,): 3, (180,): 3}
missing_input_value = 1234567890
print(generateItemsets(Lk, missing_input_value))
# output: set()
for i in range(0, 999999):
    generateItemsets(Lk, i)  # does not error out
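For small values of k the function does return candidate itemsets; appended to the program above, for example:
print(generateItemsets(Lk, 2))
# six 2-item combinations, set order may vary:
# {(150, 160), (150, 170), (150, 180), (160, 170), (160, 180), (170, 180)}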
So either your Python installation is broken, I misunderstood your question, or the input you provided does not reproduce the faulty situation in your program.
I would recommend that you update your question with a bigger piece of code, not just a three-line function without any working input.
When you are using a Jupyter notebook, the error you get might have something to do with your output data rate. Try executing
jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10
in a console, which comes from this post: How to solve "IOPub data rate exceeded." in Jupyter Notebook
or this video: https://www.youtube.com/watch?v=B_YlLf6fa5A