I am trying to run an Apriori analysis in Python (in JupyterLab) on a series of hashtags scraped from Twitter, and I need a way to time out a function after a certain period. The function runs a while loop that incrementally lowers the support value and is meant to stop once ten seconds have passed.
import time

from apyori import apriori  # assumption: the call below matches the apyori package


def association_rules(hashtags_list):
    # Convert the list of hashtags into a list of transactions
    transactions = [hashtags for hashtags in hashtags_list]
    # Initialize the support
    min_support = 1
    # Initialize the confidence
    min_confidence = 0.1
    # Initialize the lowest support
    lowest_support = 1
    # Initialize the result so the return works even if the first call fails
    association_rules_list = []
    # Start the timer
    start_time = time.time()
    while True:
        try:
            # Find the association rules
            rules = apriori(transactions, min_confidence=min_confidence, min_support=min_support)
            # Convert the association rules into a list
            association_rules_list = list(rules)
            # End the timer
            end_time = time.time()
            # Calculate the running time
            running_time = end_time - start_time
            # Check if the running time is over the maximum time
            if running_time >= 10:
                break
            lowest_support = min_support
            # Lower the support for the next pass
            if min_support > 0.01:
                min_support = min_support - 0.01
            else:
                min_support = min_support - 0.005
            if min_support <= 0:
                min_support = 0.01
        except Exception as e:
            print("An error occurred:", e)
            break
    return association_rules_list, round(lowest_support, 3)
The problem is that the elapsed time is only checked inside the loop, after each apriori call has returned. If the support value gets too low before the 10 seconds are up, which often happens with small datasets, a single apriori call can hang and the check is never reached. So I need something outside the loop that can stop the function.
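To make the failure mode concrete, here is a minimal sketch with a stand-in for the apriori call. `slow_step` and `run_with_inloop_check` are hypothetical names, and `time.sleep` only simulates a long-running call. Because the check runs between calls, one long call overshoots the limit before the loop notices:

```python
import time


def slow_step(seconds):
    # Hypothetical stand-in for one apriori call; sleeps to simulate work.
    time.sleep(seconds)


def run_with_inloop_check(step_durations, limit):
    """Run the steps in order, checking the elapsed time after each one."""
    start = time.monotonic()
    completed = 0
    for seconds in step_durations:
        slow_step(seconds)  # cannot be interrupted from inside this loop
        completed += 1
        if time.monotonic() - start >= limit:
            break
    return completed, time.monotonic() - start


# The limit is 0.2 s, but the second step alone takes 0.5 s,
# so the overrun is only detected after that step returns.
completed, elapsed = run_with_inloop_check([0.05, 0.5, 0.05], limit=0.2)
print(completed, round(elapsed, 2))
```

The loop stops after the second step, but only once that step has run to completion, well past the limit.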
I've been looking into parallel processing with no success, and I still can't determine whether it can even be done in JupyterLab.
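The closest thing I can picture is the sketch below, which uses only the standard library's `threading` module. `run_with_timeout` is a hypothetical helper of my own, not part of any apriori library. It hands control back after the limit, but it only stops waiting: Python threads cannot be killed from outside, so the work keeps running in the background, which is why I don't consider it a real fix:

```python
import threading
import time


def run_with_timeout(func, args=(), timeout=10.0):
    """Run func(*args) in a daemon thread and wait at most `timeout` seconds.

    Returns (finished, result). If the limit is hit, finished is False and
    result is None; the worker thread is NOT stopped and keeps running.
    If func raises, finished is True and result is None.
    """
    result = {}

    def worker():
        result["value"] = func(*args)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)  # block here for at most `timeout` seconds
    if thread.is_alive():
        return False, None
    return True, result.get("value")


# A fast call finishes in time; a slow one is abandoned after 0.1 s.
fast = run_with_timeout(sum, args=([1, 2, 3],), timeout=1.0)
slow = run_with_timeout(time.sleep, args=(0.5,), timeout=0.1)
print(fast, slow)
```

The caller gets control back on time, but an abandoned apriori call would still be using CPU and memory until it finishes on its own.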
Any ideas on how to stop a function would be appreciated.
Edited to add that I am running on Windows 10, which may affect some options.