I have a dataframe like this:

        product_title   variation type  product_price       attribute
    0   Chauvet DJ      [Black, White]  [899.99, 949.99]    ['<p>apple,banana<p>', '<p>yewllow,orange,blue</p>']

My expected dataframe looks like this:

product_title  variation type        product_price            attribute
Chauvet DJ     Black                  899.99               <p>apple,banana<p>
Chauvet DJ     White                  949.99               <p>yewllow,orange,blue</p>

I tried this code:

        data["variation type"] = data["variation type"].apply(str).str.strip('[]').apply(str).str.replace("'","").apply(str).str.split(',')
        data["product_price"] = data["product_price"].apply(str).str.strip('[]').apply(str).str.replace("'","").apply(str).str.split(',')
        data["attribute"] = data["attribute"].apply(str).str.strip('[]').apply(str).str.replace("'","").apply(str).str.split(r",(?=')",expand=True)
        data = data.explode(['variation type', 'product_price','attribute'])

I am getting this error:

ValueError: columns must have matching element counts

boyenec

1 Answer

This should be what you need to get the results you expect:

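import pandas as pd

# Sample data: 'attribute' is a string that looks like a list; the other two columns hold real lists
data = {
    'product_title' : ['Chauvet DJ'],
    'variation_type' : [['Black', 'White']],
    'product_price' : [[899.99, 949.99]],
    'attribute' : ['[<p>apple,banana<p>, <p>yewllow,orange,blue</p>]']
}

df = pd.DataFrame(data)
# Drop the brackets, then split once on ', ' so the commas inside each item are left alone
# (maxsplit=1 means this only handles rows with exactly two items)
df['attribute'] = df['attribute'].apply(lambda x : x.replace('[','').replace(']','').split(', ', 1))
# Explode every list column in parallel, keeping product_title on each resulting row
df = df.set_index(['product_title']).apply(pd.Series.explode).reset_index()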

You can read more about that here: Efficient way to unnest (explode) multiple list columns in a pandas DataFrame
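If every column already holds real lists of equal length, a shorter route may be available. The following is a minimal sketch, assuming pandas 1.3 or later (where `DataFrame.explode` accepts a list of columns); the sample frame and the `list_cols` name are only illustrative:

```python
import pandas as pd

# Illustrative frame in which all three columns are genuine Python lists
df = pd.DataFrame({
    "product_title": ["Chauvet DJ"],
    "variation type": [["Black", "White"]],
    "product_price": [[899.99, 949.99]],
    "attribute": [["<p>apple,banana<p>", "<p>yewllow,orange,blue</p>"]],
})

list_cols = ["variation type", "product_price", "attribute"]

# Multi-column explode requires the lists in each row to have the same length;
# otherwise pandas raises "columns must have matching element counts"
lengths = df[list_cols].apply(lambda col: col.map(len))
assert (lengths.nunique(axis=1) == 1).all()

# Explode all list columns in parallel, one output row per list position
result = df.explode(list_cols, ignore_index=True)
print(result)
```

The length check is there because a mismatch in per-row element counts is exactly what triggers the `ValueError` from the question.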

ArchAngelPwn
  • @ArchAngelPwn I tried your code on my actual dataframe, but it is not working. – boyenec May 31 '22 at 01:04
  • Are you sure that your 'lists' are actually lists? Could they simply be strings pretending to be lists? – ArchAngelPwn May 31 '22 at 01:08
  • They are actual lists, except in a few columns. – boyenec May 31 '22 at 01:14
  • OK, this technique requires them to actually be lists (of equal size) to work. Would you be able to turn those that aren't into lists and try again? If not, can you update your original post to say which columns are not lists, and I'll work with it to help make them into lists? – ArchAngelPwn May 31 '22 at 01:18
  • If you look at my attribute column, you can see `['<p>apple,banana<p>', '<p>yewllow,orange,blue</p>']`. I have two items in my list, but the problem is that I split by comma, so it is also counting the commas inside the quotes, as in `'<p>yewllow,orange,blue</p>']`. We need to count only the comma after the closing quote. – boyenec May 31 '22 at 11:16
  • I updated the code to reflect if the attribute column is a string of a list – ArchAngelPwn May 31 '22 at 12:44
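
For columns whose cells are the quoted text of a list, as described in the comments above, one option is to parse them with `ast.literal_eval` rather than splitting on commas, since it follows Python's own quoting rules and so leaves commas inside quotes alone. This is a sketch under assumptions: the helper name `to_list` is hypothetical, and it only works when the strings are valid Python literals with quoted items (it would not parse the unquoted string used in the answer's sample data).

```python
import ast
import pandas as pd

def to_list(value):
    """Return value unchanged if it is already a list, else parse its string form."""
    if isinstance(value, list):
        return value
    return ast.literal_eval(value)  # raises ValueError/SyntaxError on malformed text

# Illustrative frame: one real list column and two string-encoded list columns
df = pd.DataFrame({
    "product_title": ["Chauvet DJ"],
    "variation type": [["Black", "White"]],
    "product_price": ["[899.99, 949.99]"],
    "attribute": ["['<p>apple,banana<p>', '<p>yewllow,orange,blue</p>']"],
})

list_cols = ["variation type", "product_price", "attribute"]
for col in list_cols:
    df[col] = df[col].map(to_list)

# Each row now holds equal-length lists, so a parallel explode is possible
# (list-of-columns explode assumes pandas 1.3 or later)
result = df.explode(list_cols, ignore_index=True)
print(result)
```

Because the parser respects the quotes, `'<p>yewllow,orange,blue</p>'` stays a single item even though it contains two commas.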