0

I have a dataset where I would like to pivot the entire dataframe, using certain columns as values.

Data

id  date    sun moon    stars   total   pcp base    final   status  space   galaxy
aa  Q1 21   5   1       2       8       0   200     41      5       1       1
aa  Q2 21   4   1       2       7       1   200     50      6       2       1

Desired

id  date    type    pcp base    final   final2  status  type2   final3
aa  Q1 21   sun     0   200     41      5       5       space   1
aa  Q1 21   moon    0   200     41      1       5       galaxy  1
aa  Q1 21   stars   0   200     41      2       5       space   1
aa  Q2 21   sun     1   200     50      4       6       space   2
aa  Q2 21   moon    1   200     50      1       6       galaxy  1
aa  Q2 21   stars   1   200     50      2       6       space   2

Doing

df.drop(columns='total').melt(['id','date','final','final2','base','ppp'],var_name='type',value_name='ppp')

This works well in pivoting the first set of values (sun, moon etc) however, not sure how to incorporate the second 'set' space and galaxy. Any suggestion is appreciated

sammywemmy
  • 27,093
  • 4
  • 17
  • 31
Lynn
  • 4,292
  • 5
  • 21
  • 44

1 Answers1

2

This is a partial answer:

cols = ['id', 'date', 'pcp', 'base', 'final', 'status']

df = df.drop(columns='total')

df1 = df.melt(id_vars=cols, value_vars=['sun', 'moon', 'stars'], var_name='type')
df2 = df.melt(id_vars=cols, value_vars=['galaxy', 'space'], var_name='type2')

out = pd.merge(df1, df2, on=cols)

At this point, your dataframe looks like:

>>> out
    id   date  pcp  base  final  status   type  value_x   type2  value_y
0   aa  Q1 21    0   200     41       5    sun        5  galaxy        1
1   aa  Q1 21    0   200     41       5    sun        5   space        1
2   aa  Q1 21    0   200     41       5   moon        1  galaxy        1
3   aa  Q1 21    0   200     41       5   moon        1   space        1
4   aa  Q1 21    0   200     41       5  stars        2  galaxy        1
5   aa  Q1 21    0   200     41       5  stars        2   space        1
6   aa  Q2 21    1   200     50       6    sun        4  galaxy        1
7   aa  Q2 21    1   200     50       6    sun        4   space        2
8   aa  Q2 21    1   200     50       6   moon        1  galaxy        1
9   aa  Q2 21    1   200     50       6   moon        1   space        2
10  aa  Q2 21    1   200     50       6  stars        2  galaxy        1
11  aa  Q2 21    1   200     50       6  stars        2   space        2

Now the question is how total3 is set to reduce the dataframe?

Corralien
  • 109,409
  • 8
  • 28
  • 52