pandas: how to split one row into multiple rows for a variable product?

Question

I have multiple variable products in my CSV. Suppose I have a product titled "Car model145", and this "Car model145" has three different prices and colors. I want to expand the price and color values into separate rows, each carrying the title. Here is my data frame:

     title             price                       color                  image

  0  Car model145      2,54.00,852.00,2532.00      black,white,blue        car iamge url 
                       #three different price

I also have a problem in the price column: how do I remove the first comma, the one after the 2, so that I can split the price values properly? I also don't want to expand the image column. The result should look like this:

  title             price                       color                  image
0  Car model145      254.00                     black               car iamge url 
1  Car model145      852.00                     white  
2  Car model145      2532.00                    blue

You can explore this pandas functionality: [df.explode()](https://stackoverflow.com/questions/12680754/split-explode-pandas-dataframe-string-entry-to-separate-rows) – Agnij, Sep 29 '21 at 12:58
Agnij I applied df.explode(), but the title row is not expanding properly, and I also have a problem in the price column because I can't remove the comma after the 2: `2,45.00` – boyenec, Sep 29 '21 at 13:00
Is the comma issue recurring across the whole column, always in the same pattern? If not, manual removal can be considered. – Agnij, Sep 29 '21 at 13:05
Every price row looks like this: `2,54.00,852.00,2532.00`. There is a comma after the first digit in every row, and I want to remove that comma. – boyenec, Sep 29 '21 at 13:14

mozway · Accepted Answer · 2021-09-29T13:43:48.130


The extra leading `2,` in the price is confusing. Does every price have it? You first need to get rid of it.

Then you can simply apply str.split and explode:

(df.assign(price=df['price'].str.replace(',', '', n=1))  # remove only the first comma
   .apply(lambda s: s.str.split(',').explode())  # split each column on ',' and put one value per row
   .assign(image=lambda d: d['image'].mask(d['image'].duplicated(), ''))  # blank the repeated image values
   .reset_index(drop=True)  # renumber the rows 0..n-1
 #  .to_csv('filename.csv')  # uncomment to save output as csv
)

output:

          title    price  color          image
0  Car model145   254.00  black  car iamge url
1  Car model145   852.00  white               
2  Car model145  2532.00   blue
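
For reference, here is a self-contained sketch of the same idea. It rebuilds the sample frame from the question and, instead of the column-wise `apply`, uses `DataFrame.explode` with a list of columns. This is a variation rather than the code above: it assumes pandas 1.3 or later and assumes `price` and `color` hold the same number of items in every row. The names `df`, `result`, and `csv_text` are only illustrative.

```python
import pandas as pd

# Sample frame rebuilt from the question (one product, three variants).
df = pd.DataFrame({
    "title": ["Car model145"],
    "price": ["2,54.00,852.00,2532.00"],
    "color": ["black,white,blue"],
    "image": ["car iamge url"],
})

result = (
    df.assign(
        # Drop only the first comma ("2,54.00" -> "254.00"), then split into lists.
        price=df["price"].str.replace(",", "", n=1, regex=False).str.split(","),
        color=df["color"].str.split(","),
    )
    # Multi-column explode (assumes pandas >= 1.3); title and image are repeated per row.
    .explode(["price", "color"])
    # Keep the image only on the first row that came from each original row.
    .assign(image=lambda d: d["image"].where(~d.index.duplicated(), ""))
    .reset_index(drop=True)
)

print(result)

# Export the transformed frame held in `result`, not the original `df`.
csv_text = result.to_csv(index=False)  # pass a file path instead to write a file
```

Because none of these methods modify `df` in place, the transformed data only exists in whatever the chain is assigned to, which is why the export has to be called on `result`. The prices are still strings at this point; converting them with `astype(float)` would be a separate step.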

edited Sep 29 '21 at 13:43

answered Sep 29 '21 at 13:18


mozway Thanks. I see the result in my Jupyter notebook, but the exported CSV file still contains the original data. Should I put `inplace=True` anywhere? – boyenec Sep 29 '21 at 13:38
mozway I am trying to export the CSV with `data.to_csv('my_path/test1.csv')`, but the file does not contain the result shown in the terminal. – boyenec Sep 29 '21 at 13:40
Add `.to_csv('filename.csv')` before the last `)` (see update) – mozway Sep 29 '21 at 13:43
mozway Can you please explain a little about `.mask`? What is `.mask` doing here? – boyenec Sep 29 '21 at 13:46
You can check the docs for [`mask`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.mask.html). In summary, it replaces the matched rows with another value (here the empty string `''`) – mozway Sep 29 '21 at 13:55
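
To illustrate the `mask` call discussed in the comments above, here is a small sketch on a standalone Series. The Series `image` is made up for the example and stands in for the image column after the rows have been expanded. `duplicated()` flags every repeat of an earlier value, and `mask` replaces the flagged positions with the given value.

```python
import pandas as pd

# Hypothetical column as it looks after exploding: the image value is repeated.
image = pd.Series(["car iamge url", "car iamge url", "car iamge url"])

# True for every value that already appeared earlier in the Series.
is_repeat = image.duplicated()
print(is_repeat.tolist())  # [False, True, True]

# mask() replaces the entries where the condition is True, here with ''.
masked = image.mask(is_repeat, "")
print(masked.tolist())  # ['car iamge url', '', '']
```

`Series.where` is the mirror image of `mask`: it keeps the values where the condition is True and replaces the rest.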
