2

I have a dataset which I need to unpivot into multiple rows into columns.

Eg:

ID  Currency    Val1    Val2        Month
101 INR     57007037.32 1292025.24  2021-03
101 INR     49171143.9  1303785.98  2021-02
101 INR     54039073.81 1469727.23  2021-01
101 INR     67733998.9  1370086.78  2020-12
101 INR     48838409.39 1203648.32  2020-11
101 INR     43119693.71 0           2020-10

I need to pivot / unpivot them as below.

ID  Currency    Keys        2021-03     2021-02         2021-01         2020-12     2020-11     2020-10
101 INR         Val1    57007037.32     49171143.9      54039073.81     67733998.9  48838409.39     4311.71
101 INR         Val2    1292025.24      1303785.98      1469727.23      1370086.78  1203648.32      0

I have tried df.melt but that is not the giving exact output.

df = pd.read_csv('/path/abc.csv')
print(df.melt(id_vars=['id'], var_name=['month'], value_vars=['val1','val2']))

Please help on this.

Shubham Sharma
  • 68,127
  • 6
  • 24
  • 53
Rummy
  • 51
  • 8
  • 4
    Please don't post images. Edit your question and add your code/dataframes in the text format so that we could be able to copy these while trying to solve your question. – Shubham Sharma Apr 25 '21 at 07:35
  • @ShubhamSharma, Edited my question. Please on this request. – Rummy Apr 25 '21 at 07:41

4 Answers4

4

1. stack and unstack

(df.set_index(['ID', 'Currency', 'Month']).stack()
  .unstack(-2).reset_index().rename(columns={'level_2': 'keys'}))

2. Melt and pivot

df.melt(['ID', "Currency", 'Month'], var_name='keys')\
  .pivot(['ID', 'Currency', 'keys'], 'Month', 'value').reset_index()

Month   ID Currency  keys      2020-10      2020-11      2020-12      2021-01      2021-02      2021-03
0      101      INR  Val1  43119693.71  48838409.39  67733998.90  54039073.81  49171143.90  57007037.32
1      101      INR  Val2         0.00   1203648.32   1370086.78   1469727.23   1303785.98   1292025.24
Shubham Sharma
  • 68,127
  • 6
  • 24
  • 53
  • ....your solution looks elegant....any article/link you can suggest where i can have better understanding of stack/unstack and Melt/pivot apart from the documentation..Thanks for any lead... – Deepak Apr 25 '21 at 08:00
  • Thank you @Deepak i suggest you to check the [official pandas docs of `reshaping`](https://pandas.pydata.org/docs/user_guide/reshaping.htmla) the dataframe which is thoroughly written and will definitely help you understand reshaping the dataframe better. – Shubham Sharma Apr 25 '21 at 08:04
1

Another version using .pivot:

print(
    pd.pivot(
        df,
        index=["ID", "Currency"],
        columns="Month",
        values=["Val1", "Val2"],
    )
    .stack(level=0)
    .reset_index()
    .rename(columns={"level_2": "keys"})
)

Prints:

Month   ID Currency  keys      2020-10      2020-11      2020-12      2021-01      2021-02      2021-03
0      101      INR  Val1  43119693.71  48838409.39  67733998.90  54039073.81  49171143.90  57007037.32
1      101      INR  Val2         0.00   1203648.32   1370086.78   1469727.23   1303785.98   1292025.24

EDIT: Solution that works in pandas=0.22.0:

print(
    df.set_index(["ID", "Currency"])
    .pivot(columns="Month")
    .stack(level=0)
    .reset_index()
    .rename(columns={"level_2": "keys"})
)

Prints:

Month   ID Currency  keys      2020-10      2020-11      2020-12      2021-01      2021-02      2021-03
0      101      INR  Val1  43119693.71  48838409.39  67733998.90  54039073.81  49171143.90  57007037.32
1      101      INR  Val2         0.00   1203648.32   1370086.78   1469727.23   1303785.98   1292025.24

EDIT: To sort columns:

x = (
    df.set_index(["ID", "Currency"])
    .pivot(columns="Month")
    .stack(level=0)
    .reset_index()
    .rename(columns={"level_2": "keys"})
)
x.columns.name = ""
x = x[["ID", "Currency", "keys"] + sorted(x.columns, reverse=True)[3:]]
print(x)

Prints:

    ID Currency  keys      2021-03      2021-02      2021-01      2020-12      2020-11      2020-10
0  101      INR  Val1  57007037.32  49171143.90  54039073.81  67733998.90  48838409.39  43119693.71
1  101      INR  Val2   1292025.24   1303785.98   1469727.23   1370086.78   1203648.32         0.00
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
  • I tried yours but getting error - TypeError: pivot_simple() got multiple values for keyword argument 'index;. – Rummy Apr 25 '21 at 08:15
  • I'm using pandas 0.22.0 version. – Rummy Apr 25 '21 at 08:22
  • @Rummy That's really historical version. Can you upgrade it? – Andrej Kesely Apr 25 '21 at 08:23
  • My organization using this one so i can't able to upgrade. Please let me know some other ways. – Rummy Apr 25 '21 at 08:42
  • @Rummy I specifically installed `pandas=0.22.0`. See my edit for working solution. But I strongly advise to update the pandas version. – Andrej Kesely Apr 25 '21 at 08:53
  • Thank You So Much !! It's working but I need to display month in desc order i.e 2021-03,2021-02,2021-01,2020-12... And is there any possibility to hide first column i.2 Month – Rummy Apr 25 '21 at 09:11
  • Perfect, Thank You So Much !! How to write output data into csv or excel ?? – Rummy Apr 25 '21 at 09:32
  • @Rummy `x.to_csv("filename.csv")` – Andrej Kesely Apr 25 '21 at 10:03
  • @ Andrej Kesely, can you please help on this. https://stackoverflow.com/questions/66530497/how-to-read-data-from-multiple-csv-files-and-write-into-same-sheet-of-single-exc – Rummy Apr 25 '21 at 15:16
  • @Rummy It's better to open a new question than edit an old one (to prevent confusion). Revert the edit, create a new question and I'll try to look at it. – Andrej Kesely Apr 25 '21 at 15:18
  • @ Andrej Kesely, I tried to open but it is not allowed. Please help !! – Rummy Apr 25 '21 at 15:29
1

Another method:

>>> df.groupby(["ID", "Currency"]) \
      .apply(lambda x: x[["Month", "Val1", "Val2"]] \
                       .set_index("Month") \
                       .rename_axis(columns="Key") \
                       .transpose())

Month                  2021-03      2021-02      2021-01      2020-12      2020-11      2020-10
ID  Currency Key
101 INR      Val1  57007037.32  49171143.90  54039073.81  67733998.90  48838409.39  43119693.71
             Val2   1292025.24   1303785.98   1469727.23   1370086.78   1203648.32         0.00

For Pandas version prior to 0.24, change rename_axis(columns="Key") to rename_axis("Key", axis="columns")

Corralien
  • 109,409
  • 8
  • 28
  • 52
  • I tried your answer but getting error - TypeError: rename_axis() got an unexpected keyword argument 'columns'. – Rummy Apr 25 '21 at 08:13
  • I updated my answer to take into account your older version. As @AndrejKesely said, you have to update your version. – Corralien Apr 25 '21 at 10:48
  • ,can you please look into below one. https://stackoverflow.com/questions/66530497/how-to-read-data-from-multiple-csv-files-and-write-into-same-sheet-of-single-exc – Rummy Apr 25 '21 at 15:08
1

using set_index + unstack + stack:

(df.set_index(['ID', 'Month', 'Currency']).unstack('Month')
.stack(level=0).rename_axis(index={None: 'Keys'}).reset_index())

Month ID    Currency  Keys  2020-10     2020-11     2020-12     2021-01     2021-02     2021-03
0     101   INR       Val1  43119693.71 48838409.39 67733998.90 54039073.81 49171143.90 57007037.32
1     101   INR       Val2  0.00        1203648.32  1370086.78  1469727.23  1303785.98  1292025.24
Pygirl
  • 12,969
  • 5
  • 30
  • 43