-1

I have a Pandas DataFrame that looks like the following:

    ID           Date      Element  Temperature (C)
0   USW00094889 2014-11-12  TMAX    2.2
1   USC00208972 2009-04-29  TMIN    5.6
2   USC00200032 2008-05-26  TMAX    27.8
3   USC00205563 2005-11-11  TMAX    13.9
4   USC00200230 2014-02-27  TMAX    -10.6

I've been staring at the documentation for the stack and unstack functions and related functions like melt and pivot_table but I can't quite seem to figure out the combination of these that I need to do what I want. I'm trying to get rid of the "Element" column and instead create two new columns, "TMAX" and "TMIN". This would make each row a unique observation. Each Date/ID combo has a TMAX and TMIN. So theoretically this new dataframe would be half the height of the current one.

John Jacoby
  • 23
  • 1
  • 5
  • HI, welcome to StackOverflow. What have you tried so far? What was the result? Can you link to some other SO posts that address, but do not solve, this problem? – Evan Dec 03 '18 at 21:19
  • I would recommend including your desired output in your post as well. – rahlf23 Dec 03 '18 at 21:20

1 Answers1

0

Docs: https://pandas.pydata.org/pandas-docs/stable/reshaping.html

Create the DF and create the desired multi-index.

"""
ID,Date,Element,Temperature (C)
USW00094889,2014-11-12,TMAX,2.2
USC00208972,2009-04-29,TMIN,5.6
USC00200032,2008-05-26,TMAX,27.8
USC00205563,2005-11-11,TMAX,13.9
USC00200230,2014-02-27,TMAX,-10.6
"""
import pandas as pd
df = pd.read_clipboard(sep=",").set_index(["ID", "Date"])
print(df)

Output:

                       Element  Temperature (C)
ID          Date                               
USW00094889 2014-11-12    TMAX              2.2
USC00208972 2009-04-29    TMIN              5.6
USC00200032 2008-05-26    TMAX             27.8
USC00205563 2005-11-11    TMAX             13.9
USC00200230 2014-02-27    TMAX            -10.6

df.pivot(index=df.index, columns="Element")

Output:

                       Temperature (C)     
Element                           TMAX TMIN
ID          Date                           
USC00200032 2008-05-26            27.8  NaN
USC00200230 2014-02-27           -10.6  NaN
USC00205563 2005-11-11            13.9  NaN
USC00208972 2009-04-29             NaN  5.6
USW00094889 2014-11-12             2.2  NaN

Note that I dropped the numerical ID from your sample df.

Evan
  • 2,121
  • 14
  • 27