I've joined 3 Excel tab data sets to give me my base dataframe, and then for each line I want to count the int values in the comma separated values in DUAlloc and divide Amount by the DUAlloc Count then loop through the DuAlloc list and assign individual lines e.g.
Base Data:
Description | DuAlloc | Amount |
---|---|---|
Blah | 1,2,3,4,5 | 1000 |
Yada | 30,15,3,4,5 | 200 |
Processed Data:
Description | DuAlloc | Amount |
---|---|---|
Blah | 1 | 200 |
Blah | 2 | 200 |
Blah | 3 | 200 |
Yada | 3 | 40 |
Blah | 4 | 200 |
Yada | 4 | 40 |
Blah | 5 | 200 |
Yada | 5 | 40 |
Yada | 15 | 40 |
Yada | 30 | 40 |
I've tried numerous ways to convert to a list: list(), tolist(), but either get the same number for all the counts, or the nearest I've come is [len(str(c)) for c in df3['DUAlloc']]
which counts all the characters which I don't want.
How would I go about achieving this, and is Pandas the best route to take?