I'm working with association_rules from the mlxtend package in Python.
The output is a DataFrame, and the antecedants and consequents columns can each hold several items separated by commas.
Here is the code used to generate the output:
from mlxtend.frequent_patterns import association_rules
rules = association_rules(frequent_items, metric='lift', min_threshold=0.5)
This produces a DataFrame with the following column headers:
antecedants
consequents
antecedent support
consequent support
support
confidence
lift
leverage
conviction
The antecedants column (and likewise the consequents column) can hold more than one item per row, as seen below:
Antecedants
(SKU1, SKU2, SKU3)
(SKU1, SKU2)
(SKU1)
(SKU1, SKU2, SKU3, SKU4)
However, I want to split each value on the commas and extend the existing DataFrame with one column per item.
Desired output:
antecedants  antecedants2  antecedants3  antecedants4
SKU1         SKU2          SKU3
SKU1         SKU2
SKU1
SKU1         SKU2          SKU3          SKU4
I've tried this line of code:
rules['antecedants'].str.split(',', expand=True)
but the result is a bunch of NaNs.
Any help or guidance would be appreciated; I'm new to Python.
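
A possible explanation for the NaNs, offered as a sketch rather than a confirmed diagnosis: mlxtend most likely stores each antecedants value as a frozenset rather than a comma-separated string, and pandas string methods such as .str.split return NaN for any value that is not a string. Under that assumption, the sets can be expanded into columns directly. The small DataFrame below is a made-up stand-in for the real rules output (the lift values are invented for illustration), and the names expanded and result are hypothetical.

```python
import pandas as pd

# Made-up stand-in for the mlxtend output. Assumption: the antecedants
# column holds frozensets, not comma-separated strings.
rules = pd.DataFrame({
    "antecedants": [
        frozenset({"SKU1", "SKU2", "SKU3"}),
        frozenset({"SKU1", "SKU2"}),
        frozenset({"SKU1"}),
        frozenset({"SKU1", "SKU2", "SKU3", "SKU4"}),
    ],
    "lift": [1.2, 1.1, 1.0, 1.3],  # illustrative values only
})

# Sets are unordered, so sort each one to get a stable column order,
# then build one column per item; shorter rows are padded with missing values.
expanded = pd.DataFrame(
    [sorted(items) for items in rules["antecedants"]],
    index=rules.index,
).fillna("")  # show padding as blanks, as in the desired output

# Name the columns antecedants, antecedants2, antecedants3, ...
expanded.columns = ["antecedants"] + [
    f"antecedants{i}" for i in range(2, expanded.shape[1] + 1)
]

# Swap the set-valued column for the expanded columns.
result = pd.concat([expanded, rules.drop(columns="antecedants")], axis=1)
print(result)
```

The same steps should apply to the consequents column. Sorting is a design choice here: without it, the item that lands in each column could differ from row to row.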