0

I have the below table taken online through pandas.read_html

      Column0    Column1      Column2      Column3
0     Entry_1    0.685        Record_1     0.69-S$ 0.685
1     Entry_2    0.036        Record_2     0.685
2     Entry_3    05/Jul/2016  Record_3     0.72-S$ 0.4
3     Entry_4    0.338        Record_4     178.8 mm
4     Entry_5    0.41         Record_5     0.06
5     Entry_6    122.48       Record_6     17.29%
6     Entry_7    0.5          Record_7     0.58 as of 05/Jul/2016

How do I pviot / transpose this data such that Column 0 becomes the headers, and Column 1 becomes the values. Similarly for Column 2 and Column 3?

jake wong
  • 4,909
  • 12
  • 42
  • 85
  • These are 2 dataframes? You want to merge them and then transpose the merged dataframe? – Joe T. Boka Jul 06 '16 at 07:10
  • Nope. This is in 1 dataframe. This was the code I got when I printed the output from `pandas.read_html`. I want to make `Columns 0` and `Columns 2` the headers for the values in `Columns 1` and `Columns 3` – jake wong Jul 06 '16 at 07:12

3 Answers3

2

This is probably the easiest way to solve this problem. The easiest way I could come up with anyway.

Column0     Column1     Column2     Column3
0   Entry_1     0.685   Record_1    0.69-S$ 0.685
1   Entry_2     0.036   Record_2    0.685
2   Entry_3     05/Jul/2016     Record_3    0.72-S$ 0.4
3   Entry_4     0.338   Record_4    178.8 mm
4   Entry_5     0.41    Record_5    0.06
5   Entry_6     122.48  Record_6    17.29%
6   Entry_7     0.5     Record_7    0.58 as of 05/Jul/2016

cols = df['Column0'].append(df['Column2'])
vals = df['Column1'].append(df['Column3'])

newdf = pd.DataFrame(vals).T
newdf.columns = cols
newdf



 Entry_1    Entry_2     Entry_3     Entry_4     Entry_5     Entry_6     Entry_7     Record_1    Record_2    Record_3    Record_4    Record_5    Record_6    Record_7
0   0.685   0.036   05/Jul/2016     0.338   0.41    122.48  0.5     0.69-S$ 0.685   0.685   0.72-S$ 0.4     178.8 mm    0.06    17.29%  0.58 as of 05/Jul/2016
Joe T. Boka
  • 6,554
  • 6
  • 29
  • 48
1

You can use lreshape for create new columns, then set_index from Col, transpose by T and last remove column name by rename_axis (new in pandas 0.18.0):

print (pd.lreshape(df, {'Col':['Column0', 'Column2'], 
                        0:['Column1', 'Column3']})
         .set_index('Col')
         .T
         .rename_axis(None, axis=1))

  Entry_1 Entry_2      Entry_3 Entry_4 Entry_5 Entry_6 Entry_7       Record_1  \
0   0.685   0.036  05/Jul/2016   0.338    0.41  122.48     0.5  0.69-S$ 0.685   

  Record_2     Record_3  Record_4 Record_5 Record_6                Record_7  
0    0.685  0.72-S$ 0.4  178.8 mm     0.06   17.29%  0.58 as of 05/Jul/2016  
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
0

I suggest you use the method DataFrame.pivot as the following example:

import pandas as pd
import numpy as np
import pandas.util.testing as tm; tm.N = 3
def unpivot(frame):
    N, K = frame.shape
    data = {'value' : frame.values.ravel('F'),
            'variable' : np.asarray(frame.columns).repeat(N),
            'date' : np.tile(np.asarray(frame.index), K)}
    return pd.DataFrame(data, columns=['date', 'variable', 'value'])
df = unpivot(tm.makeTimeDataFrame())
print (df)

print (df.pivot(index='date', columns='variable', values='value'))

print(df):

date variable value
0 2000-01-03 A 0.101495
1 2000-01-04 A -0.554863
2 2000-01-05 A -0.345271
3 2000-01-03 B -1.104909
4 2000-01-04 B -0.723819
5 2000-01-05 B 0.088401
6 2000-01-03 C 1.495768
7 2000-01-04 C -0.756166
8 2000-01-05 C -0.266072
9 2000-01-03 D 0.451050
10 2000-01-04 D -1.457763
11 2000-01-05 D 0.945552

print (df.pivot(index='date', columns='variable', values='value')):

variable A B C D
date
2000-01-03 2.932572 -1.959961 0.385705 -1.629831
2000-01-04 -0.317548 0.031041 2.129526 -1.717546
2000-01-05 0.108186 1.182527 0.997716 0.453127

Gal Dreiman
  • 3,969
  • 2
  • 21
  • 40