0

Can you please help me with the following. I have the following pandas df:

             FB  AMZN  AAPL  NFLX  GOOG
   date                                  
 2004-01-01  10     4     1     7     0
 2004-02-01   4     0     0     0    23
 2004-03-01   6     0     0     0    34
 2004-04-01   0     0     0     0     0
 2004-05-01   0     0     0     0     0

Instead of the columns as in the above df, I want to have three columns: Companym, date and Score, Particularly, how can I make the pd DF in the following template:

Company    date   Score
FB      01.01.2004  10
FB      01.02.2004  4
FB      01.03.2004  6
FB      01.04.2004  0
FB      01.05.2004  0
AMZN    01.01.2004  4
AMZN    01.02.2004  0
AMZN    01.03.2004  0
AMZN    01.04.2004  0
AMZN    01.05.2004  0
AAPL    01.01.2004  1
AAPL    01.02.2004  0
AAPL    01.03.2004  0
AAPL    01.04.2004  0
AAPL    01.05.2004  0
NFLX    01.01.2004  7
NFLX    01.02.2004  0
NFLX    01.03.2004  0
NFLX    01.04.2004  0
NFLX    01.05.2004  0
GOOG    01.01.2004  0
GOOG    01.02.2004  23
GOOG    01.03.2004  34
GOOG    01.04.2004  0
GOOG    01.05.2004  0
          
Alberto Alvarez
  • 805
  • 3
  • 11
  • 20

3 Answers3

2

here is one way to do it

Assuming the date is the index

#melt the dataframe
df=df.melt(  var_name='Company', value_name='Score', ignore_index=False)

# reformat the date
df.index=pd.to_datetime(df.index).strftime('%d.%m.%Y')
df


            Company     Score
      date      
01.01.2004       FB     10
01.02.2004       FB     4
01.03.2004       FB     6
01.04.2004       FB     0
01.05.2004       FB     0
01.01.2004     AMZN     4
01.02.2004     AMZN     0
01.03.2004     AMZN     0
01.04.2004     AMZN     0
01.05.2004     AMZN     0
01.01.2004     AAPL     1
01.02.2004     AAPL     0
01.03.2004     AAPL     0
01.04.2004     AAPL     0
01.05.2004     AAPL     0
01.01.2004     NFLX     7
01.02.2004     NFLX     0
01.03.2004     NFLX     0
01.04.2004     NFLX     0
01.05.2004     NFLX     0
01.01.2004     GOOG     0
01.02.2004     GOOG     23
01.03.2004     GOOG     34
01.04.2004     GOOG     0
01.05.2004     GOOG     0

Naveed
  • 11,495
  • 2
  • 14
  • 21
1

Just unstack and rename the columns:

new = df.unstack().reset_index()
new.columns = ['Company', 'Date', 'Score']

   Company        Date  Score
0       FB  2004-01-01     10
1       FB  2004-02-01      4
2       FB  2004-03-01      6
3       FB  2004-04-01      0
4       FB  2004-05-01      0
5     AMZN  2004-01-01      4
6     AMZN  2004-02-01      0
7     AMZN  2004-03-01      0
8     AMZN  2004-04-01      0
9     AMZN  2004-05-01      0
10    AAPL  2004-01-01      1
11    AAPL  2004-02-01      0
12    AAPL  2004-03-01      0
13    AAPL  2004-04-01      0
14    AAPL  2004-05-01      0
15    NFLX  2004-01-01      7
16    NFLX  2004-02-01      0
17    NFLX  2004-03-01      0
18    NFLX  2004-04-01      0
19    NFLX  2004-05-01      0
20    GOOG  2004-01-01      0
21    GOOG  2004-02-01     23
22    GOOG  2004-03-01     34
23    GOOG  2004-04-01      0
24    GOOG  2004-05-01      0
It_is_Chris
  • 13,504
  • 2
  • 23
  • 41
0

you can use melt:

out = df.reset_index().melt(id_vars=['date'], var_name='company', value_name='score')

output:

>>
          date company  score
0   2004-01-01      FB     10
1   2004-02-01      FB      4
2   2004-03-01      FB      6
3   2004-04-01      FB      0
4   2004-05-01      FB      0
5   2004-01-01    AMZN      4
6   2004-02-01    AMZN      0
7   2004-03-01    AMZN      0
8   2004-04-01    AMZN      0
9   2004-05-01    AMZN      0
10  2004-01-01    AAPL      1
11  2004-02-01    AAPL      0
12  2004-03-01    AAPL      0
13  2004-04-01    AAPL      0
14  2004-05-01    AAPL      0
15  2004-01-01    NFLX      7
16  2004-02-01    NFLX      0
17  2004-03-01    NFLX      0
18  2004-04-01    NFLX      0
eshirvana
  • 23,227
  • 3
  • 22
  • 38