Python change column names using pd.read_table

Question

I have a text file named dat.txt, which contains

a    b     c    d     e
99   94   93   100  100
99   88   96    99   97
100   98   81    96  100
93   88   88    99   96
100   91   72    96   78
90   78   82    75   97
75   73   88    97   89
93   84   83    68   88
87   73   60    76   84
95   82   90    62   39
76   72   43    67   78
85   75   50    34   37

I want to change the column names (i.e., variable names) of this data. In R, I do

    dat = read.table("dat.txt",head=T)
    colnames(dat)=c("var1","col2","var3","col4","var5")

which shows

        var1 col2 var3 col4 var5
    1    99   94   93  100  100
    2    99   88   96   99   97
    3   100   98   81   96  100
    4    93   88   88   99   96
    5   100   91   72   96   78
    6    90   78   82   75   97
    7    75   73   88   97   89
    8    93   84   83   68   88
    9    87   73   60   76   84
    10   95   82   90   62   39
    11   76   72   43   67   78
    12   85   75   50   34   37

I want to read this data using Python in a similar way, so I try

    import pandas as pd
    dat1 = pd.read_table('dat.txt')
    print(dat1)

and it shows

         a     b    c    d    e 
    0    99   94   93   100  100
    1    99   88   96    99   97
    2   100   98   81    96  100
    3    93   88   88    99   96
    4   100   91   72    96   78
    5    90   78   82    75   97
    6    75   73   88    97   89
    7    93   84   83    68   88
    8    87   73   60    76   84
    9    95   82   90    62   39
    10   76   72   43    67   78
    11   85   75   50    34   37

Then I try

    cols = ['var1','col2','var3','col4','var5']
    dat2 = pd.read_table('dat.txt', skiprows=[0], header=None, names=cols)
    print(dat2)

and it shows

                            var1  col2  var3  col4  var5
    0    99   94   93   100  100   NaN   NaN   NaN   NaN
    1    99   88   96    99   97   NaN   NaN   NaN   NaN
    2   100   98   81    96  100   NaN   NaN   NaN   NaN
    3    93   88   88    99   96   NaN   NaN   NaN   NaN
    4   100   91   72    96   78   NaN   NaN   NaN   NaN
    5    90   78   82    75   97   NaN   NaN   NaN   NaN
    6    75   73   88    97   89   NaN   NaN   NaN   NaN
    7    93   84   83    68   88   NaN   NaN   NaN   NaN
    8    87   73   60    76   84   NaN   NaN   NaN   NaN
    9    95   82   90    62   39   NaN   NaN   NaN   NaN
    10   76   72   43    67   78   NaN   NaN   NaN   NaN
    11   85   75   50    34   37   NaN   NaN   NaN   NaN

I ran your code `dat1.columns = cols`, and it raised `ValueError: Length mismatch: Expected axis has 1 elements, new values have 5 elements`. Then I tried `dat1.columns`, which gives **Out[11]: Index([' a b c d e '], dtype='object')**. It seems that, unlike R's `read.table`, `pd.read_table` in Python takes the whole first row as a single variable name. - John Stone, Sep 16 '17 at 10:27
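The single-column `Index` above is consistent with `pd.read_table` splitting on its default separator, a tab, while the file appears to be separated by runs of spaces. Here is a minimal sketch of one possible fix under that assumption: pass `sep=r'\s+'` so any run of whitespace is treated as a delimiter. The `raw` string and `io.StringIO` are stand-ins for `dat.txt` (only its first rows) so the example runs on its own; with the real file you would pass `'dat.txt'` instead.

```python
import io
import pandas as pd

# Stand-in for dat.txt (assumption: the file is space-separated, not tab-separated).
raw = """a    b     c    d     e
99   94   93   100  100
99   88   96    99   97
100   98   81    96  100
"""

cols = ['var1', 'col2', 'var3', 'col4', 'var5']

# With the default tab separator, each line is read as one single column.
one_col = pd.read_table(io.StringIO(raw))
print(one_col.shape)

# Option 1: split on any run of whitespace, then rename the columns afterwards.
dat1 = pd.read_table(io.StringIO(raw), sep=r'\s+')
dat1.columns = cols
print(dat1)

# Option 2: replace the header while reading.
# header=0 marks the first line as the header; names=cols overrides it.
dat2 = pd.read_table(io.StringIO(raw), sep=r'\s+', header=0, names=cols)
print(dat2)
```

Both options should give five integer columns named `var1` to `var5`. If the real file turns out to use some other delimiter, `sep` would need to be adjusted to match.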
