0

I got a dataframe like the below:

| DATETIME STR     | VALUE |
|------------------|-------|
| 23/04/2019 05:00 | 500   |
| 23/04/2019 05:00 | 300   |
| 23/04/2019 05:00 | 150   |
| ................ | ...   |
| 23/04/2019 23:55 | ...   |

I want to go through the dataframe once and add a new column that has the percentage of each cell of the total one. The total one is different for different times.

For example the above percentages for the 23/04/2019 05:00 would be:

| DATETIME STR     | VALUE | PERCENTAGE |
|------------------|-------|------------|
| 23/04/2019 05:00 | 500   | 52.63 %    |
| 23/04/2019 05:00 | 300   | 31.58 %    |
| 23/04/2019 05:00 | 150   | 15.79 %    |
| 23/04/2019 10:00 | 600   | ..... %    |
| ................ | ...   | ..... %    |
| 23/04/2019 23:55 | ...   | ..... %    |

How can I do this?

I think I am looking at something like:

dataframe["PERCENTAGE"] = (100 * dataframe["VALUE"])/sum_of_same_date

I can't figure how to calculate the sum_of_same_date for each cell that is getting filled at the same time that I am going through the dataframe filling the new column.

Any help would be appreciated.

Lev
  • 673
  • 1
  • 12
  • 29
  • Possible duplicate of [Pandas percentage of total with groupby](https://stackoverflow.com/questions/23377108/pandas-percentage-of-total-with-groupby) – Erfan Apr 24 '19 at 14:00
  • You can't do this in one pass. For each date, you need to see all values for that date then go back and fill the percentages in. There are fast ways to do it and it will still be linear time but technically speaking you can't do it in one pass. Both of the provided answers as of the time of this comment do this. – piRSquared Apr 24 '19 at 14:00

1 Answers1

1

Try with transform

dataframe["PERCENTAGE"] = (100 * dataframe["VALUE"])/dataframe.groupby('DATETIME STR')['VALUE'].transform('sum')
BENY
  • 317,841
  • 20
  • 164
  • 234
  • Wow! Thanks. Did exactly what I needed. Gonna read some about it, since I wasn't familiar with transform. – Lev Apr 24 '19 at 14:54