0

I've attached two Excel screenshots illustrating my problem. I have data in a spreadsheet organized like this (sort of pivot-y):

enter image description here

...and I want to transform it (unpivot) into a vertical format like this:

enter image description here

Please note that a simple transpose will not solve this. If you look closely at the Excel file (specifically the 'How I Want It' tab), you'll see there's actually a total of 5 dimensions I'm dealing with.

Any guidance on approaches? Really hoping this can be done with Pandas.. not so sure it can be.

Link to Excel file here:

https://drive.google.com/open?id=1hZLQPukHcJwfwmkoPwOPPv5_JViwgL6v

idiocrash
  • 41
  • 6
  • 1
    It looks like you want a transpose? – cs95 Nov 08 '17 at 07:14
  • can you share your sample file? – jezrael Nov 08 '17 at 07:16
  • Essentially, yes but I need to transpose from multiple, nested dimensions here. The index is actually defined across multiple row/column attributes. Hence my problem. – idiocrash Nov 08 '17 at 07:18
  • @cᴏʟᴅsᴘᴇᴇᴅ - I reopen answer because solution is a bit complicated, not transpose only. – jezrael Nov 08 '17 at 07:22
  • @idiocrash - Maybe the main problem is convert excel to DataFrame with MultiIndex, if my solution does not work is possible share your sample excel file? gdocs, dropbox, wetransfer... – jezrael Nov 08 '17 at 07:24
  • @jezrael For future reference, if a question is closed incorrectly, please attempt to search for the correct one and edit the list. I'm 90% sure there is a duplicate for this question, but you would rather answer it than have it closed... – cs95 Nov 08 '17 at 07:29
  • @cᴏʟᴅsᴘᴇᴇᴅ - please check comment above... – jezrael Nov 08 '17 at 07:30
  • 1
    @jezrael You're free to do what you want, but I have a word of advice... if a question is not in a state that writing an answer is guaranteed to solve the OP's problem (as you yourself admit you're not sure), then leave it closed! Or, at least, vote to close as "too broad"/"no mcve". – cs95 Nov 08 '17 at 07:31
  • @cᴏʟᴅsᴘᴇᴇᴅ - you know how it think - if excel is ideal, then my solution working nice, so answer was added. But if necessary convert unnamed `MultiIndex` values it is not possible without file... – jezrael Nov 08 '17 at 07:34
  • Here's a link to the Excel: https://drive.google.com/open?id=1hZLQPukHcJwfwmkoPwOPPv5_JViwgL6v – idiocrash Nov 08 '17 at 07:35
  • Hi all, I've updated the question to be more clear. – idiocrash Nov 08 '17 at 07:43

1 Answers1

1

I think you need unstack + reset_index for reshape data:

#create `MultiIndex` DataFrame
df = pd.read_excel('HowItIs.xlsx', header=[0,1], index_col=[0])
#print (df)

#get names for new columns values
a, b = df.columns.names

d = {'level_2':'Category', 'Section Subheading':'Mesurement_Period_Week',
      'SECTION TITLE':'Mesurement_Period'}

fin_order = ['Section_Title','Section_Subheading','Category','Mesurement_Period',
            'Mesurement_Period_Week','Value']
df = (df.unstack()
        .reset_index(name='Value')
        .rename(columns=d)
        .sort_values(['Category','Mesurement_Period','Mesurement_Period_Week'])
        .assign(Section_Title = a, Section_Subheading=b)
        .reset_index(drop=True)
        [fin_order]
        )
print (df.head())
   Section_Title  Section_Subheading Category    Mesurement_Period  \
0  SECTION TITLE  Section Subheading   Apples   FY15 P03 Dec Count   
1  SECTION TITLE  Section Subheading   Apples   FY15 P03 Dec Count   
2  SECTION TITLE  Section Subheading   Apples   FY15 P03 Dec Count   
3  SECTION TITLE  Section Subheading   Apples   FY15 P03 Dec Count   
4  SECTION TITLE  Section Subheading   Apples  FY15 P03 Dec Weight   

  Mesurement_Period_Week  Value  
0                     W1     53  
1                     W2     53  
2                     W3     53  
3                     W4     53  
4                     W1   2120  

Last if need increment each value in Value column per groups:

df['Value'] += df.groupby(['Category','Mesurement_Period']).cumcount()
print (df.head())
   Section_Title  Section_Subheading Category    Mesurement_Period  \
0  SECTION TITLE  Section Subheading   Apples   FY15 P03 Dec Count   
1  SECTION TITLE  Section Subheading   Apples   FY15 P03 Dec Count   
2  SECTION TITLE  Section Subheading   Apples   FY15 P03 Dec Count   
3  SECTION TITLE  Section Subheading   Apples   FY15 P03 Dec Count   
4  SECTION TITLE  Section Subheading   Apples  FY15 P03 Dec Weight   

  Mesurement_Period_Week  Value  
0                     W1     53  
1                     W2     54  
2                     W3     55  
3                     W4     56  
4                     W1   2120  
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • I don't think this quite addresses the depth of the nesting in the Excel. There's actually 5 dimensions. Tricky. – idiocrash Nov 08 '17 at 07:56
  • I posted the same Excel doc as the screenshots show: https://drive.google.com/open?id=1hZLQPukHcJwfwmkoPwOPPv5_JViwgL6v – idiocrash Nov 08 '17 at 08:02
  • Yes, real data follows the same structure as the example data. – idiocrash Nov 08 '17 at 08:09
  • Super, please five me some time. – jezrael Nov 08 '17 at 08:09
  • Do you need increment 1 to each Value per group? – jezrael Nov 08 '17 at 08:22
  • ...I just realized that my 'How I Want It' tab has incremented values (left over from dragging as a "series" in Excel), but that was due to me not paying attention late at night so the increment step was not needed. In any event, you solved my main problem and one extra one that I mistakenly introduced! Thank you so much! – idiocrash Nov 08 '17 at 17:44