Python pandas: can't explode columns, getting error "columns must have matching element counts"

Question

I have a dataframe like this:

        product_title   variation type  product_price       attribute
    0   Chauvet DJ      [Black, White]  [899.99, 949.99]    ['<p>apple,banana<p>', '<p>yewllow,orange,blue</p>']

My expected dataframe looks like this:

product_title  variation type        product_price            attribute
Chauvet DJ     Black                  899.99               <p>apple,banana<p>
Chauvet DJ     White                  949.99               <p>yewllow,orange,blue</p>

I tried this code:

        data["variation type"] = data["variation type"].apply(str).str.strip('[]').apply(str).str.replace("'","").apply(str).str.split(',')
        data["product_price"] = data["product_price"].apply(str).str.strip('[]').apply(str).str.replace("'","").apply(str).str.split(',')
        data["attribute"] = data["attribute"].apply(str).str.strip('[]').apply(str).str.replace("'","").apply(str).str.split(r",(?=')",expand=True)
        data = data.explode(['variation type', 'product_price','attribute'])

But I am getting this error:

ValueError: columns must have matching element counts

ArchAngelPwn · Answer 1 · 2022-05-31T12:43:33.143

This should be what you need to get the results you expect:

    import pandas as pd

    data = {
        'product_title' : ['Chauvet DJ '],
        'variation_type' : [['Black', 'White']],
        'product_price' : [[899.99, 949.99]],
        'attribute' : ['[<p>apple,banana<p>, <p>yewllow,orange,blue</p>]']
    }

    df = pd.DataFrame(data)
    # 'attribute' is a string that looks like a list: drop the brackets,
    # then split once on ', ' so the commas inside each item are kept
    df['attribute'] = df['attribute'].apply(lambda x : x.replace('[','').replace(']','').split(', ', 1))
    # Explode all list columns together; the lists in a row must have equal length
    df = df.set_index(['product_title']).apply(pd.Series.explode).reset_index()

You can read more about that here: Efficient way to unnest (explode) multiple list columns in a pandas DataFrame
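If some columns hold real lists and others hold strings that only look like lists, one option is to normalize every cell first and then explode. This is a minimal sketch, not a drop-in fix: it assumes the string cells are valid Python list literals with the quotes intact (as in the question's `attribute` column), so that `ast.literal_eval` can parse them. `to_list` is a hypothetical helper name, and passing a list of columns to `DataFrame.explode` needs pandas 1.3 or newer.

```python
import ast

import pandas as pd

df = pd.DataFrame({
    "product_title": ["Chauvet DJ"],
    "variation type": [["Black", "White"]],
    "product_price": [[899.99, 949.99]],
    # Assumption: this cell is a string that merely looks like a list
    "attribute": ["['<p>apple,banana<p>', '<p>yewllow,orange,blue</p>']"],
})


def to_list(cell):
    """Return the cell as a real list (hypothetical helper)."""
    if isinstance(cell, str):
        # Parses the literal, so commas inside quoted items are not split on
        return ast.literal_eval(cell)
    return list(cell)


list_cols = ["variation type", "product_price", "attribute"]
for col in list_cols:
    df[col] = df[col].apply(to_list)

# Multi-column explode requires equal-length lists within each row
result = df.explode(list_cols, ignore_index=True)
print(result)
```

Parsing the literal instead of splitting on commas avoids breaking items such as `'<p>yewllow,orange,blue</p>'` into several pieces, which is what leads to mismatched element counts.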

edited May 31 '22 at 12:43

answered May 31 '22 at 00:56

ArchAngelPwn I tried your code on my actual dataframe, but it is not working – boyenec May 31 '22 at 01:04
Are you sure that your 'lists' are actually lists? Could they simply be strings pretending to be lists? – ArchAngelPwn May 31 '22 at 01:08
They are actual lists, except for a few columns – boyenec May 31 '22 at 01:14
Ok this technique would require them to actually be lists (of equal size) to work. Would you be able to turn those that aren't into a list and try again? If not can you update your original post to have which columns are not lists and I'll work with it to help make them into one? – ArchAngelPwn May 31 '22 at 01:18
If you look at my attribute column, you can see `['<p>apple,banana<p>', '<p>yewllow,orange,blue</p>']`. I have two items in my list, but the problem is that I split by comma, so it is also counting the commas inside my quotes, as in `'<p>yewllow,orange,blue</p>']`. We need to count only the commas that come after a closing quote. – boyenec May 31 '22 at 11:16
I updated the code to handle the case where the attribute column is a string representation of a list – ArchAngelPwn May 31 '22 at 12:44
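
The two checks raised in the comments (are the cells really lists, and do the lists in a row have the same length) can be sketched as below. The data is made up to mimic what happens when `attribute` is split on every comma; `cell_types`, `lengths` and `mismatched` are illustrative names only.

```python
import pandas as pd

df = pd.DataFrame({
    "variation type": [["Black", "White"]],
    "product_price": [[899.99, 949.99]],
    # Assumed outcome of splitting on every comma: 5 pieces instead of 2
    "attribute": [["<p>apple", "banana<p>", "<p>yewllow", "orange", "blue</p>"]],
})
list_cols = ["variation type", "product_price", "attribute"]

# Which Python types does each column actually hold? A str here means the
# column is a string pretending to be a list.
cell_types = {col: set(df[col].map(type)) for col in list_cols}
print(cell_types)

# Element count per cell (for str cells this would count characters,
# so check the types first)
lengths = df[list_cols].apply(lambda s: s.map(len))

# Rows whose list columns disagree on length are the ones that make
# explode raise "columns must have matching element counts"
mismatched = lengths[lengths.nunique(axis=1) > 1]
print(mismatched)
```

Any row that shows up in `mismatched` needs its lists repaired before a multi-column explode can succeed.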
