0

I have the following Dataframe:

ORD_DATE   Weekday   ALL Sales       ALL Pred          Other Sales   Other Pred  
1/1/2020   2        48386.27        20815269.83        15386.27       5643158.509

I would have approx. 1000 different Sale/Pred combos. I want it this way:

ORD_DATE  Weekday   Sales            Pred          Filter  
1/1/2020   2        48386.27        20815269.83    All  
1/1/2020   2        15386.27        5643158.509    Other

I have tried Transpose but did not get the results I wanted.

JNallen
  • 19
  • 2
  • Try to create a [minimum reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) – Karthik V Apr 29 '20 at 23:36
  • 2
    Have you look into this question: https://stackoverflow.com/questions/28654047/pandas-convert-some-columns-into-rows ? – Krerkkiat Chusap Apr 29 '20 at 23:47
  • Does this answer your question? [pandas convert some columns into rows](https://stackoverflow.com/questions/28654047/pandas-convert-some-columns-into-rows) – Tim Apr 30 '20 at 03:21

3 Answers3

3

To do this, first move all unaltered columns into the index, then split the remaining column names and create a MultiIndex out of their components, and finally stack one of the column levels.

df = pd.DataFrame.from_records(
    [
        {
            "ORD_DATE": "1/1/2020",
            "Weekday": 2,
            "ALL_Sales": 48386.27,
            "ALL_Pred": 20815269.83,
            "Other_Sales": 15386.27,
            "Other_Pred": 5643158.509,
        }
    ]
)

df = df.set_index(["ORD_DATE", "Weekday"])
split_cols = [c.split("_") for c in df.columns]
df.columns = pd.MultiIndex.from_tuples(split_cols, names=["Filter", None])
df = df.stack("Filter").reset_index()
print(df)

   ORD_DATE  Weekday Filter          Pred     Sales
0  1/1/2020        2    ALL  2.081527e+07  48386.27
1  1/1/2020        2  Other  5.643159e+06  15386.27
BallpointBen
  • 9,406
  • 1
  • 32
  • 62
0

I would use pandas.concat with slicing to do this:

import pandas as pd

df = pd.DataFrame({
    'ORD_DATE':['1/1/2020'],
    'Weekday':[2],
    'ALL Sales':[48386.27],
    'ALL Pred':[20815269.83],
    'Other Sales':[15386.27],
    'Other Pred':[5643158.509],
})

df2 = pd.concat([
    df[['ORD_DATE','Weekday','ALL Sales','ALL Pred']].rename(columns={
        'ALL Sales':'Sales',
        'ALL Pred':'Pred',
    }),
    df[['ORD_DATE','Weekday','Other Sales','Other Pred']].rename(columns={
        'Other Sales':'Sales',
        'Other Pred':'Pred',
    }),
])

print(df2)

Output:

   ORD_DATE  Weekday     Sales          Pred
0  1/1/2020        2  48386.27  2.081527e+07
0  1/1/2020        2  15386.27  5.643159e+06

Example in python tutor

Edit With the Filter Column added:

import pandas as pd

df = pd.DataFrame({
    'ORD_DATE':['1/1/2020'],
    'Weekday':[2],
    'ALL Sales':[48386.27],
    'ALL Pred':[20815269.83],
    'Other Sales':[15386.27],
    'Other Pred':[5643158.509],
})

df_all = df[['ORD_DATE','Weekday','ALL Sales','ALL Pred']].rename(columns={
        'ALL Sales':'Sales',
        'ALL Pred':'Pred',
})
df_all['Filter'] = 'All'
df_other = df[['ORD_DATE','Weekday','Other Sales','Other Pred']].rename(columns={
        'Other Sales':'Sales',
        'Other Pred':'Pred',
})
df_other['Filter'] = 'Other'

df2 = pd.concat([df_all,df_other,])

print(df2)

Output:

   ORD_DATE  Weekday     Sales          Pred Filter
0  1/1/2020        2  48386.27  2.081527e+07    All
0  1/1/2020        2  15386.27  5.643159e+06  Other

python tutor link

Phillyclause89
  • 674
  • 4
  • 12
0

Adjust the positions of ALL and Other in the column names, and then reshape the dataframe using pandas' wide to long method.

#reshape column position
#allows manipulation with pd wide to long
df.columns = [' '.join(reversed(ent.split()))
              if ('ALL' in ent) or ('Other' in ent) else ent 
              for ent in df.columns ]

df

    ORD_DATE    Weekday Sales ALL   Pred ALL    Sales Other Pred Other
0   1/1/2020    2   48386.27    20815269.83 15386.27    5643158.509

#reshape data using pandas wide to long
res = (pd.wide_to_long(df,
                       stubnames = ['Sales','Pred'],
                       i = ['ORD_DATE','Weekday'],
                       j = 'Filter',
                       sep = ' ',
                       suffix = '[a-zA-Z]+'
                      )
       .reset_index()
      )

res

    ORD_DATE    Weekday Filter   Sales        Pred
0   1/1/2020       2     ALL    48386.27    2.081527e+07
1   1/1/2020       2    Other   15386.27    5.643159e+06

Another route, inspired by @BallPointPen's solution :

#set first two columns as index
df = df.set_index(['ORD_DATE','Weekday'])

#split column on whitespace
#using the str partition creates a multiindex 
#drop the 1st level, since it contains the whitespace, and is not needed
#assign back to columns
df.columns = df.columns.str.partition(' ').droplevel(1)

#stack the dataframe
res = df.stack(0).reset_index().rename(columns={'level_2':'Filter'})
res

    ORD_DATE    Weekday Filter     Pred         Sales
0   1/1/2020    2       ALL     2.081527e+07    48386.27
1   1/1/2020    2       Other   5.643159e+06    15386.27
sammywemmy
  • 27,093
  • 4
  • 17
  • 31