1

This is my main dataframe:

              DB_FULL_DATE               NAME    PRODS    CONS
0      2020-01-01 00:00:00                pp1   0.0000  0.0378
1      2020-01-01 00:00:00                pp2   0.0000  0.0170
2      2020-01-01 00:00:00                pp3   0.0000  0.0240
3      2020-01-01 00:00:00                pp4   0.0000  0.0400
4      2020-01-01 00:00:00                pp5   0.5294  0.0016


276739 2020-08-31 23:00:00               pp60  47.5700  0.0000
276740 2020-08-31 23:00:00               pp61  40.4700  0.0000
276741 2020-08-31 23:00:00               pp62  29.4600  0.0000

I am trying to split my main dataframe and change its structure. I want to create two dataframes which are the same structure. First dataframe has to include 'PRODS' values and second one has to include 'CONS' values. I will only show example of PRODS dataframe. If there is any solution, I will apply again for CONS dataframe.

This is the example of the PRODS dataframe:


              DB_FULL_DATE                pp1      pp2     pp3   . . .  pp62
0      2020-01-01 00:00:00             0.0000   0.0000  0.0000            .
1      2020-01-01 01:00:00             0.0113   0.1872  0.2819            .
2      2020-01-01 02:00:00             0.0218   0.5632  0.3542            .
3      2020-01-01 03:00:00             0.4560   0.8274  0.2576            .
4      2020-01-01 04:00:00             0.9657   0.1726  0.3491            .


276739 2020-08-31 21:00:00            11.1100  33.5400 18.1400            .
276740 2020-08-31 22:00:00            43.6800  40.4900 16.0400            .
276741 2020-08-31 23:00:00            52.3500  24.3600 13.0600            .

To summarize, I want to get dataframe that is like my example. How can I have this structure? Is this possible?

JuniorESE
  • 267
  • 1
  • 7

1 Answers1

1

What you are looking for is pandas' pivot function. You read more about it in the documentation. Or check the following: How to pivot a dataframe?

You may want to try it yourself before looking at the solution:

For the following small example:

import pandas as pd
df = pd.DataFrame({"Name": ["pp1", "pp2", "pp1", "pp2"], "PRODS":[2.5, 1.2,6.3,2.6]}, index=[1,1,2,2])

which yields the following dataframe:

    Name    PRODS
1    pp1    2.5
1    pp2    1.2
2    pp1    6.3
2    pp2    2.6

You can pivot with "Name" as variable and "PRODS" and values:

prod_df = df.pivot(columns="Name", values="PRODS")

restults:

Name    pp1 pp2
1       2.5 1.2
2       6.3 2.6

(you can reindex as you wish, or just add index="DB_FULL_DATE" to pivot to make sure it uses DB_FULL_DATE as index for new df)

Roim
  • 2,986
  • 2
  • 10
  • 25