-4

if u see the screenshot in that f4,f5, and f9 columns values are separated by commas, i want to split that values into different rows, and f9 is a total number of products, so I need to split the revenue as well based on quantity, for example total number of products according to f9 is 5, so total revenue is 211.32, so each row it should be (211.32/5)*2(quantity value)

enter image description here

I tried this:

df3['Product_Name']=df3.Product_Name.str.split(",").explode('Product_Name')
df3['Product_ID']=df3.Product_ID.str.split(",").explode('Product_ID')
df3['Number_of_Products']=df3.Number_of_Products.str.split(",").explode('Number_of_Products')

but didn't work..

Laurent LAPORTE
  • 21,958
  • 6
  • 58
  • 103
  • What have you tried and why is it not working? – thornejosh Apr 02 '21 at 19:19
  • Welcome to Stack Overflow. Please read how to ask good [questions](https://stackoverflow.com/help/how-to-ask). Make sure your question covers these 3 elements: 1. Problem Statement 2. Your Code (it should be [Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example) 3. Error Message (preferably full Traceback to help others review and provide feedback). Find out if it is a [duplicate](https://stackoverflow.com/help/duplicates). Please read [how to share pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – Joe Ferndz Apr 02 '21 at 19:26
  • Please do not share images. Instead share as regular pandas example. See the link above for more details – Joe Ferndz Apr 02 '21 at 19:27
  • This is already addressed here: https://stackoverflow.com/questions/64814532/splitting-strings-in-a-column-into-multiple-rows I would vote to close this question. – Wiliam Apr 02 '21 at 19:43

1 Answers1

0

You have to develop a specific function to do this (with or without pandas).

Let's consider you have a classic Python list:

row = [
    "10/4/2017 3:31",  # f1
    112,  # f2
    "220.239.97",  # f3
    "x,y,z",  # f4
    "x1,y1,z1",  # f5
    "0",  # f6
    "female",  # f7
    "cc@braint",  # f8
    "2,2,1",  # f9
    "East Lindf",  # f10
    "NSW",  # f11
    211.32,  # Revenue
]

you can create a small function which reads the fields and "split" the columns. Here is an example:

def split_columns(r):
    f4 = r[3].split(",")
    f5 = r[4].split(",")
    f9 = list(map(int, r[8].split(",")))
    revenue = r[11]
    amount = revenue / sum(f9)
    for v4, v5, v9 in zip(f4, f5, f9):
        r[3] = v4
        r[4] = v5
        r[8] = v9
        r[11] = amount * v9
        yield r

This generator can be used like this:

for line in split_columns(row):
    print(line)

You get:

['10/4/2017 3:31', 112, '220.239.97', 'x', 'x1', '0', 'female', 'cc@braint', 2, 'East Lindf', 'NSW', 84.52799999999999]
['10/4/2017 3:31', 112, '220.239.97', 'y', 'y1', '0', 'female', 'cc@braint', 2, 'East Lindf', 'NSW', 84.52799999999999]
['10/4/2017 3:31', 112, '220.239.97', 'z', 'z1', '0', 'female', 'cc@braint', 1, 'East Lindf', 'NSW', 42.263999999999996]
Laurent LAPORTE
  • 21,958
  • 6
  • 58
  • 103