I have a dataframe (called df) with ids and transactions. Each row represents a single transaction. The transactions column is a column of sets, so for transaction 1 the value might be {a,b,c} and for transaction 2 it could be {a,d,e,b,f}. I have another list of unique sets (called set_list). I'm trying to count the number of times each set in set_list is a subset of a transaction in my dataframe (df). I can iterate through the df using issubset to check whether a set_list value is a subset, but I'm having trouble getting the count for each set. I can append each matching subset to a list, but that doesn't give me the count of how many transactions it appears in. Any thoughts? Here's what I have so far.

final = {}
for item in set_list:
    for i in df.transactions:
        if item.issubset(i):
            pass  # stuck here: how do I keep a running count for each item?
Golffan27

1 Answer


Here's my approach, based on [a related question][1]:

set_list = [{"a", "b", "c"},{"a", "d", "e", "b", "f"}]

u = set.intersection(*set_list)
print(u)

Output: {'a', 'b'}

[1]: Best way to find the intersection of multiple sets?

  • This isn't quite what I'm looking for. Let's use the intersection here, {'a','b'}. What I need to do is iterate through set_list and see how many of the transaction sets contain {'a','b'}. I would need to repeat this process for many other sets, such as {'a'} and {'b','c'}, and return a dictionary mapping each set to its count. – Golffan27 Apr 12 '21 at 23:54
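For what it's worth, the counting described in that comment could be sketched roughly as below. This is only a minimal illustration, not a tested answer from the thread: the sample `transactions` list is made up and stands in for `df.transactions`, the contents of `set_list` are invented, and `count_subsets` is a hypothetical helper name. One thing to watch is that plain sets are unhashable, so they can't be used as dictionary keys directly; the sketch converts each one to a `frozenset` first.

```python
# Made-up sample data; in the question this would be df.transactions,
# which can be passed in the same way since iterating a Series yields its values.
transactions = [
    {"a", "b", "c"},
    {"a", "d", "e", "b", "f"},
    {"b", "c"},
]

# Made-up candidate sets standing in for the asker's set_list.
set_list = [{"a", "b"}, {"b"}, {"b", "c"}, {"z"}]


def count_subsets(candidates, transactions):
    """Map each candidate set to the number of transactions that contain it."""
    counts = {}
    for candidate in candidates:
        # A plain set cannot be a dict key, so use an immutable frozenset.
        key = frozenset(candidate)
        # True counts as 1, so summing the checks gives the number of matches.
        counts[key] = sum(key.issubset(t) for t in transactions)
    return counts


final = count_subsets(set_list, transactions)
for subset, n in final.items():
    print(sorted(subset), n)
```

With this sample data, {'a','b'} is found in 2 transactions, {'b'} in 3, {'b','c'} in 2, and {'z'} in 0. The nested loop is the same shape as the one in the question; the only real changes are the `frozenset` key and summing the `issubset` results instead of appending to a list.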