I'm trying to extract a column from a CSV file that contains text like the following:

2999,29383,Here is some text,"None",2016-03-18 13:26:42,"Jackson: "Hai"

Jason: "Thx bby bai"

#Living"

I am trying to extract the final column, the one that begins with Jackson. The quotation marks that open before Jackson and close after #Living are supposed to delimit that column, but the text inside the column also contains quotation marks. As a result, csv.reader interprets the lines inside the column as new rows. This happens many times throughout the CSV file, so I need a fix that handles all of those cases as well.

Matt
  • The quotation marks need to be properly escaped. In CSVs, a double quote is escaped by writing two of them consecutively: `""`. For example, `1,Example,"This is a long ""string"" of data with ""additional"" quotation marks.",421` http://stackoverflow.com/questions/17808511/properly-escape-a-double-quote-in-csv – Brandon Anzaldi Apr 08 '16 at 23:50
  • Have you tried something like pandas? – CinchBlue Apr 08 '16 at 23:56
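
To illustrate the escaping rule from the comment above, here is a minimal sketch that writes the question's row with Python's `csv` module and reads it back. The in-memory buffer and the variable names are only for illustration; the point is that the writer doubles the inner quotes on its own, and the reader reverses that.

```python
import csv
import io

# The row from the question, with the last field holding raw inner quotes
# and line breaks.
row = ["2999", "29383", "Here is some text", "None", "2016-03-18 13:26:42",
       'Jackson: "Hai"\n\nJason: "Thx bby bai"\n\n#Living']

# csv.writer quotes the field and doubles every inner quote (doublequote=True
# is the default dialect setting).
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
encoded = buffer.getvalue()
print(encoded)

# Reading the encoded text back recovers the original six fields.
decoded = next(csv.reader(io.StringIO(encoded)))
print(decoded[5])
```

If the file is produced by code you control, writing it through `csv.writer` in this way avoids the problem at the source.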

1 Answer

With the escape sequences corrected (each inner quotation mark doubled), the row becomes:

2999,29383,Here is some text,"None",2016-03-18 13:26:42,"Jackson: ""Hai""

Jason: ""Thx bby bai""

#Living"

I used this code:

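import csv

# Python 3: open in text mode with newline='' so the csv module
# handles line breaks inside quoted fields itself.
with open('/tmp/test', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in spamreader:
        print(row[5])  # sixth column: the multi-line quoted text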

and my output is:

Jackson: "Hai"

Jason: "Thx bby bai"

#Living
Samuel LEMAITRE
  • Doesn't this require doubling the quotes in the source file (around Hai and Thx...)? I tried this approach without adding those double quotes manually, and it didn't seem to work. The trouble is that I need a way to fix not just this particular instance, but a large number of similar instances across the file. – Matt Apr 09 '16 at 00:59
  • I will try to find a way to reformat the file properly. – Samuel LEMAITRE Apr 09 '16 at 01:13
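
One possible way to reformat such a file in bulk is sketched below. It is not the answerer's method, only a heuristic that rests on assumptions taken from the single sample row: every record starts with two integer columns, the fifth column is a `YYYY-MM-DD HH:MM:SS` timestamp, the quoted text is always the last column, and none of the inner quotes are already doubled. The names `repair`, `RECORD_START`, and `LAST_FIELD` are hypothetical.

```python
import csv
import io
import re

# Assumption: a new record starts with two integer columns, e.g. "2999,29383,".
RECORD_START = re.compile(r"^\d+,\d+,")
# Assumption: the last column opens right after a timestamp column.
LAST_FIELD = re.compile(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},"')


def repair(text):
    """Double the unescaped quotes inside the last column of each record."""
    # Group physical lines into logical records.
    records, current = [], []
    for line in text.splitlines():
        if RECORD_START.match(line) and current:
            records.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        records.append("\n".join(current))

    fixed = []
    for record in records:
        stripped = record.rstrip()
        match = LAST_FIELD.search(stripped)
        # Leave records alone if they do not fit the assumed layout.
        if (match is None or not stripped.endswith('"')
                or len(stripped) <= match.end()):
            fixed.append(record)
            continue
        head = stripped[:match.end()]      # everything up to the opening quote
        body = stripped[match.end():-1]    # text between the outer quotes
        fixed.append(head + body.replace('"', '""') + '"')
    return "\n".join(fixed) + "\n"


sample = (
    '2999,29383,Here is some text,"None",2016-03-18 13:26:42,"Jackson: "Hai"\n'
    '\n'
    'Jason: "Thx bby bai"\n'
    '\n'
    '#Living"\n'
    '3000,29384,More text,"None",2016-03-19 08:00:00,"Plain"\n'
)

rows = list(csv.reader(io.StringIO(repair(sample))))
for row in rows:
    print(row[5])
```

The second row in `sample` is made up to show that more than one record is handled. A text line that happens to begin with two comma-separated integers would be mistaken for a new record, so it is worth checking that every repaired row has six columns before trusting the result.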