Converting a table of fixed width in text format into dataframe/excel/csv

Question

I have some data in txt format with 38 columns which looks like this:

screenshot

With the exception of the header row, most of the rows have missing values. I want to convert this table into an array, a DataFrame, or an Excel file, but the parsed result does not match the layout of the table.

I tried this in Python:

df = pandas.read_csv(filename, sep=r'\s+', names=colnames, header=None)

I am not sure which separator to use.

The parser should look for a value after each single space and fill in NaN if no value is present. How can I do that?
Thanks in advance!

Using `pd.read_csv(filename, sep='\s+')` works fine on my computer. Which version of pandas are you using? I'm using `0.24.2` — tdpr, Feb 07 '20 at 15:02
@tdpr I am using pandas `0.25.1`. The command runs without any error, but the values are shifted: R/F values end up in the AW column, RA values in R/F, and so on, because `sep='\s+'` splits on one or more spaces. — Manojk07, Feb 10 '20 at 05:15
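
The shift described in the comment above can be reproduced with a small sketch. The sample text below is made up for illustration (it is not the asker's file, and the `RF` column is a hypothetical stand-in); it only assumes that blank fields are padded with spaces, as in a fixed-width table.

```python
import io

import pandas as pd

# Hypothetical fixed-width sample: the last row has a blank MAX field.
SAMPLE = (
    "INDEX YEAR MN DT  MAX  MIN  RF\n"
    "14210 1972  9  1 32.0 22.0\n"
    "14210 1972  9  2 32.3 21.5 4.5\n"
    "14210 1972  9  3      22.4\n"
)

# Splitting on runs of whitespace cannot see the blank field, so every
# value to the right of it moves one column to the left.
shifted = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+")
print(shifted)
```

In the last row the MIN reading (22.4) lands under MAX, which is the same kind of misalignment reported for the R/F and AW columns.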

score 2 · Accepted Answer · answered Feb 07 '20 at 15:05

You can use pandas.read_fwf (fixed-width format):

>>> import pandas as pd
>>> df = pd.read_fwf('data.txt')
>>> df
    INDEX  YEAR  MN  DT   MAX   MIN  ...  T.2  G.2  DUR.2  T.3  G.3  DUR.3
0   14210  1972   9   1  32.0  22.0  ...  NaN  NaN    NaN  NaN  NaN    NaN
1   14210  1972   9   2  32.3  21.5  ...  NaN  NaN    NaN  NaN  NaN    NaN
2   14210  1972   9   3  32.8  22.4  ...  NaN  NaN    NaN  NaN  NaN    NaN
3   14210  1972   9   4  32.0  22.0  ...  NaN  NaN    NaN  NaN  NaN    NaN
4   14210  1972   9   5  33.2  23.6  ...  0.0  7.0  280.0  NaN  NaN    NaN
5   14210  1972   9   6  31.6  23.2  ...  5.0  8.0   45.0  0.0  8.0    NaN
6   14210  1972   9   7  31.5  21.0  ...  5.0  4.0   45.0  NaN  NaN    NaN
7   14210  1972   9   8  29.7  21.6  ...  NaN  NaN    NaN  NaN  NaN    NaN
8   14210  1972   9   9  29.7  21.1  ...  NaN  NaN    NaN  NaN  NaN    NaN
9   14210  1972   9  10  27.6  21.5  ...  NaN  NaN    NaN  NaN  NaN    NaN
10  14210  1972   9  11  30.3  21.3  ...  6.0  1.0   80.0  NaN  NaN    NaN
11  14210  1972   9  12  30.6  22.0  ...  5.0  5.0   30.0  NaN  NaN    NaN
12  14210  1972   9  13  30.2  21.4  ...  0.0  7.0  195.0  NaN  NaN    NaN
13  14210  1972   9  14  28.2  21.5  ...  NaN  NaN    NaN  NaN  NaN    NaN
14  14210  1972   9  15  30.3  21.9  ...  0.0  7.0  305.0  NaN  NaN    NaN
15  14210  1972   9  17  32.0  22.0  ...  6.0  7.0  135.0  NaN  NaN    NaN
16  14210  1972   9  18  32.0  20.5  ...  6.0  6.0   80.0  5.0  NaN    NaN

[17 rows x 38 columns]
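
If the inferred column boundaries ever come out wrong, `read_fwf` also accepts explicit `colspecs`, and the resulting frame can be written out as CSV. The following is only a sketch under stated assumptions: the sample text, the `RF` column, and the character offsets in `COLSPECS` are made up for illustration and would need to be adapted to the real file.

```python
import io

import pandas as pd

# Hypothetical fixed-width sample with blank fields (not the asker's data).
SAMPLE = (
    "INDEX YEAR MN DT  MAX  MIN  RF\n"
    "14210 1972  9  1 32.0 22.0\n"
    "14210 1972  9  2 32.3 21.5 4.5\n"
    "14210 1972  9  3      22.4\n"
)

# Half-open (start, end) character offsets of each column in SAMPLE.
COLSPECS = [(0, 5), (6, 10), (11, 13), (14, 16), (17, 21), (22, 26), (27, 30)]

# Fields are cut by position, so a blank field becomes NaN in place.
df = pd.read_fwf(io.StringIO(SAMPLE), colspecs=COLSPECS)
print(df)

# With no path argument, to_csv returns the CSV text as a string.
csv_text = df.to_csv(index=False)
print(csv_text)

# df.to_excel("out.xlsx", index=False)  # needs an Excel engine such as openpyxl
```

Because the fields are sliced by position rather than split on whitespace, the blank MAX value in the last row stays in its own column as NaN and the MIN value is not shifted.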
