I have a tab-delimited text file whose contents are saved in a database field. When I try to parse that text/string content (read from the database field), I keep getting the error "new-line character seen in unquoted field".
A lot of SO posts (here and here) deal with reading a file directly and specify with open(path, 'rb') as f or with open(path, 'rU'). However, I can't use with open(...) since I am reading the text/string value from a database record/field.
The simple example below demonstrates my problem.
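import csv
# Tabs are written as \t so they stay visible; the line breaks after D and E are part of the last field.
s = """X\tY\tA\tB\tC\tD
E
F"""
list(csv.reader([s], delimiter='\t'))[0]  # raises csv.Error: new-line character seen in unquoted field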
Conceptually, the line is X\tY\tA\tB\tC\tD\rE\rF\r\n.
What I would expect is ['X', 'Y', 'A', 'B', 'C', 'D\rE\rF'].
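To make the expected behaviour concrete, here is a minimal sketch that produces that result with plain string splitting instead of the csv module. It rests on assumptions that go beyond the example above: only \r\n ends a record, a bare \r or \n is field data, and fields never contain tabs or quote characters. The helper name split_tab_records is hypothetical.

```python
def split_tab_records(text):
    # Assumption: only CR LF terminates a record; a lone CR or LF is field data.
    # Assumption: fields never contain tabs or quote characters.
    records = [line for line in text.split("\r\n") if line]
    return [record.split("\t") for record in records]


s = "X\tY\tA\tB\tC\tD\rE\rF\r\n"
print(split_tab_records(s)[0])
# ['X', 'Y', 'A', 'B', 'C', 'D\rE\rF']
```

This does no quote handling at all, so it only illustrates the desired output for unquoted data like the example; it is not a general replacement for csv.reader.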
If the field is quoted, then everything works. But I have no control upstream over how this text is generated (it is impossible to control and re-export). Example below.
# Same data, but the last field is quoted, so its line breaks stay inside the field.
s = """X\tY\tA\tB\tC\t"D
E
F"
"""
list(csv.reader([s], delimiter='\t', quotechar='"'))[0]  # works
Any ideas on how I can get this parsing to work?