I have an info.txt file that looks like this:

B 19960331 00100000 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960430 00099100 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960531 00098500 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19971000 20 31

And when I use pandas to read it:

import pandas as pd
import numpy as np
df = pd.read_csv(r'C:\Users\Petter\Desktop\info.txt', sep=r"\s", header=None, dtype=str, engine="python")
df

the error is:

ParserError: Expected 10 fields in line 153, saw 14. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

Is there any way to automatically pad the rows that do not have the same number of columns? The output should look like:

0   1   2   3   4   5   6   7   8   9
0   B   19960331    00100000    00000000000000  00000000000000  00000000000000  00000000    00000000000000  00000000000000  00000000000000
1   B   19960430    00099100    00000000000000  00000000000000  00000000000000  00000000    00000000000000  00000000000000  00000000000000
2   B   19971000    20          31              None            None            None  None None None
 

I mean that every missing column should be filled with None.

William
  • I cannot reproduce: `pd.read_csv("file.txt", sep=r"\s", header=None, dtype=str, engine="python")` produces the expected output – Andrej Kesely Jun 22 '21 at 17:46
  • Thank you for the reply. The info.txt has been changed, please check. Thanks! – William Jun 22 '21 at 18:06
  • Hi sir, thank you so much for your help. Can you help with this question: https://stackoverflow.com/questions/68090116/pandas-expected-10-fields-in-line-153-saw-11-how-to-add-one-more-column – William Jun 22 '21 at 20:04

1 Answer

This works for the sample data and should(?) behave the same as reading the file from disk. Rows with fewer fields than the first row are padded with NaN:

import pandas as pd
import io

my_file = io.StringIO("""B 19960331 00100000 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960430 00099100 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960531 00098500 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19971000 20 31""")

df = pd.read_csv(my_file, sep=r"\s+", header=None)

output:

   0         1       2   3    4    5    6    7    8    9
0  B  19960331  100000   0  0.0  0.0  0.0  0.0  0.0  0.0
1  B  19960430   99100   0  0.0  0.0  0.0  0.0  0.0  0.0
2  B  19960531   98500   0  0.0  0.0  0.0  0.0  0.0  0.0
3  B  19971000      20  31  NaN  NaN  NaN  NaN  NaN  NaN
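
The output above loses the leading zeros because `dtype=str` was dropped, and the original error ("saw 14") suggests that some rows in the real file are wider than the first one. A minimal sketch that addresses both, assuming the file fits in memory: `read_ragged` is a hypothetical helper name, not part of pandas. It counts the fields of the widest row, passes that many column names to `read_csv`, and then swaps NaN for None.

```python
import io
import pandas as pd


def read_ragged(source):
    """Read whitespace-delimited text whose rows have varying field counts.

    `source` is any open text file or file-like object (hypothetical helper).
    """
    text = source.read()
    # The widest non-empty row decides how many columns are needed.
    n_cols = max(len(line.split()) for line in text.splitlines() if line.strip())
    df = pd.read_csv(
        io.StringIO(text),
        sep=r"\s+",
        header=None,
        names=list(range(n_cols)),  # explicit names let short rows be padded
        dtype=str,                  # keep leading zeros such as 00100000
    )
    # Missing fields come back as NaN; convert them to None as requested.
    return df.astype(object).where(df.notna(), None)


sample = io.StringIO(
    "B 19960331 00100000 00000000000000\n"
    "B 19971000 20 31\n"
    "B 19960430 00099100\n"
)
df = read_ragged(sample)
print(df)
```

For a file on disk, the same helper could be called as `read_ragged(open(path))`, with `path` pointing at info.txt.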
anon01
  • Hi sir, can you help with this question: https://stackoverflow.com/questions/68090116/pandas-expected-10-fields-in-line-153-saw-11-how-to-add-one-more-column – William Jun 22 '21 at 20:04
  • Hi @anon01, can you help me with this question: https://stackoverflow.com/questions/68309137/how-to-cross-checking-2-pandas-dataframes-file-and-use-1-dataframes-value-as-a – William Jul 08 '21 at 22:10