Best pandas way to parse table-like data where separation between columns is varying number of whitespaces?

Question

Is there an efficient way to read this census-related data (https://www2.census.gov/topics/genealogy/1990surnames/dist.all.last) directly into a pandas DataFrame? So far the only way I can think of to parse the columns is to read each line individually, apply .split(), and then build a DataFrame from the resulting lists. This seems like something pandas would already handle, but I don't know how.

`pandas.read_table(url, sep='\s+')` – Paul H, May 22 '17 at 02:00

score 0 · Answer 1 · edited May 23 '17 at 12:10


For single space

import pandas as pd
pd.read_csv('https://www2.census.gov/topics/genealogy/1990surnames/dist.all.last', sep=' ', header=None)

For multiple spaces (as in this file)

pd.read_csv('https://www2.census.gov/topics/genealogy/1990surnames/dist.all.last', sep=r'\s+', header=None)
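To try the whitespace-regex separator without hitting the network, here is a minimal sketch that parses a small in-memory sample laid out like the census file. The sample rows and the column names (`name`, `freq_pct`, `cum_freq_pct`, `rank`) are assumptions for illustration only; the real file has no header row, so name the columns however suits you.

```python
import io

import pandas as pd

# Illustrative sample (an assumption, not fetched from the URL): fields are
# padded with a varying number of spaces, as in the census file.
sample = (
    "SMITH          1.006  1.006      1\n"
    "JOHNSON        0.810  1.816      2\n"
    "WILLIAMS       0.699  2.515      3\n"
)

# Hypothetical column names; the file itself carries no header row.
columns = ["name", "freq_pct", "cum_freq_pct", "rank"]

# sep=r"\s+" treats any run of whitespace as a single delimiter.
# The raw string avoids Python's invalid-escape warning for "\s".
df = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None, names=columns)

print(df)
print(df.dtypes)
```

Swapping `io.StringIO(sample)` for the URL should work the same way, since `read_csv` accepts both file-like objects and URLs.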

Also see Read Space-separated Data with Pandas


answered May 22 '17 at 06:30

Ajay Ohri

