I need to read in a lot of sounding data files whose rows have different numbers of columns (see the attached sounding file https://drive.google.com/open?id=1GVaKdp6J47nS_yDc1LEDnAAxeGF7lpZT): the first 3 rows have only 3 columns, while the remaining rows have 11 columns.

I tried reading them using pandas.read_csv:

data = pd.read_csv('DNR-910.txt', skiprows=5, header=None, delimiter=r'\s+')

and I got: ParserError: Error tokenizing data. C error: Expected 2 fields in line 9, saw 11

My problem is similar to this post: "read txt file with different number of columns", but that one is solved in R.

harmony
  • See https://stackoverflow.com/questions/55129640/read-csv-into-a-dataframe-with-varying-row-lengths-using-pandas Read each line into a single field, then have pandas `str.split` expand it into the correct DataFrame. There are other great alternatives in that post too. I suggest marking this as a duplicate of that one if something there solves your issue. – ALollz Sep 11 '19 at 02:46
  • Yeah, thanks, that helps! I figured if I let Pandas know in advance the number of columns, it will auto fill the missing values with NaNs – harmony Sep 11 '19 at 02:53
  • Yup, I think that's the best alternative here since you seem to already know the max number of columns :D – ALollz Sep 11 '19 at 02:54
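
The approach settled on in the comments above (telling pandas the number of columns in advance so that short rows are padded with NaN) can be sketched roughly as follows. This is only a minimal illustration, not the asker's actual code: the in-memory `sample` text is made-up stand-in data rather than the contents of DNR-910.txt, and `N_COLS = 11` is taken from the column count stated in the question. For the real file, the `io.StringIO(sample)` argument would be replaced by the file path, with whatever `skiprows` value the file needs.

```python
import io

import pandas as pd

# Hypothetical stand-in for the sounding file: three short rows with 3 fields,
# followed by full rows with 11 fields. The values are invented for illustration.
sample = (
    "1 2 3\n"
    "4 5 6\n"
    "7 8 9\n"
    "1 2 3 4 5 6 7 8 9 10 11\n"
    "11 10 9 8 7 6 5 4 3 2 1\n"
)

N_COLS = 11  # assumed maximum number of columns, as stated in the question

# Passing explicit column names tells the parser how many fields to expect,
# so rows with fewer fields are padded with NaN instead of raising ParserError.
df = pd.read_csv(
    io.StringIO(sample),
    header=None,
    names=list(range(N_COLS)),
    sep=r"\s+",
)

print(df.shape)
print(df.head())
```

Note that the padded columns become floating point, since NaN cannot be stored in an integer column.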

0 Answers