The data I'm working with (https://mushroom.mathematik.uni-marburg.de/files/PrimaryData/primary_data_edited.csv) has lists in some entries, and I'd like to expand each row's lists into the cartesian product of their elements. The image below shows two columns from the head of the DataFrame.

data sample

The output I'd like is something like:

cap-diameter | cap-shape
10 | x
10 | f
20 | x
20 | f
5 | p
5 | x
10 | p
10 | x
...

So I don't want the cartesian product of all the entries in each column, just the product of the lists within each row. I think DataFrame.explode() might be a good place to start, but I'm not sure how to accomplish this. Thanks in advance.

Susemiehlian
  • Does this answer your question? [Efficient way to unnest (explode) multiple list columns in a pandas DataFrame](https://stackoverflow.com/questions/45846765/efficient-way-to-unnest-explode-multiple-list-columns-in-a-pandas-dataframe) – Ynjxsjmh Apr 06 '22 at 03:03

1 Answer

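Explode each column in turn. Each call to explode repeats the values of the other columns for every element of the exploded list, so exploding the two columns one after the other gives the cartesian product within each row: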

df = df.explode('cap-diameter')
df = df.explode('cap-shape')
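
As a quick check, here is a minimal sketch on a made-up two-row frame that mirrors the sample in the question (the values are assumptions for illustration, not taken from the real CSV). It also passes ignore_index=True (available from pandas 1.1) so the result gets a fresh 0..n-1 index instead of repeated row labels:

```python
import pandas as pd

# Hypothetical sample mirroring the two columns shown in the question;
# the real data would come from the CSV instead.
df = pd.DataFrame({
    "cap-diameter": [[10, 20], [5, 10]],
    "cap-shape": [["x", "f"], ["p", "x"]],
})

# Explode one list column at a time. After the first call every new row
# still carries the full cap-shape list, so the second call completes the
# per-row cartesian product.
result = (
    df.explode("cap-diameter", ignore_index=True)
      .explode("cap-shape", ignore_index=True)
)

print(result)
```

Note that explode only expands list-like values. If the columns were read from a CSV and hold strings such as "[10, 20]" rather than actual lists, they would probably need to be parsed into lists first.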
im_vutu