I have a CSV file that contains some data. I want to replace every newline inside double-quoted fields with some character, while leaving the newlines outside the quotes untouched. What is the best way to achieve this?

import sys, getopt

def main(argv):
    inputfile = ''
    outputfile = ''

    print(argv[0:])
    inputfile = argv[0:]

    file_object = open(argv[0:], "r")
    print(file_object)

    data = file.read(file_object)
    strings = data.split('"')[1::2]

    for string in strings:
        string.replace("\r", "")
        string.replace("\n", "")
        print(string)

    f = open("output.csv", "w")
    for string in strings:
        string = string.replace("\r", "")
        string = string.replace("\n", "")
        f.write(string)

    f.close()


if __name__ == "__main__":
    main(sys.argv[1])

This does not quite work: the quotes are lost, and so are the commas.

Expected input:

“dssdlkfjsdfj   \r\n ashdiowuqhduwqh \r\n”,
 "3"

Expected output:

"dssdlkfjsdfj    ashdiowuqhduwqh",
 "3"
Athylus
  • This line `data = file.read(file_object)` what is `file`? Did you mean `data = file_object.read()`? – TrebledJ Dec 03 '18 at 10:37
  • Please also [provide sample input, sample output along with expected output](https://stackoverflow.com/help/mcve). – TrebledJ Dec 03 '18 at 10:40
  • I am using python 3.6, not sure if that works the same. But it basically opens the file and reads it as a string. – Athylus Dec 03 '18 at 10:53
  • The unicode character for your first set of quotation marks is not the standard ASCII quotation mark character. `“ ”` vs `""` This may cause hindrances for your `data.split('"')`. – TrebledJ Dec 03 '18 at 10:57
  • Are the \r\n characters literal newlines (i.e. in Python, would be matched with `'\\r\\n'`) or are the newlines *actual newlines* (i.e. there's supposed to be a line break in the text file)? – TrebledJ Dec 03 '18 at 11:09
  • Of course, I meant " only. – Athylus Dec 03 '18 at 15:11
  • Of course, if you provide a "real" cut-n-paste sample of the data instead of `asdfasdfasdf` and mis-typed quotes, we could provide a real solution. The `csv` module can help. – Mark Tolonen Dec 03 '18 at 17:22

2 Answers

A real sample would help, but given in.csv:

"multi
line
data","more data"
"more multi
line data","other data"

The following will replace newlines in quotes:

import csv

with open('in.csv',newline='') as fin:
    with open('out.csv','w',newline='') as fout:
        r = csv.reader(fin)
        w = csv.writer(fout)
        for row in r:
            row = [col.replace('\r\n','**') for col in row]
            w.writerow(row)

out.csv:

multi**line**data,more data
more multi**line data,other data
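
Note that the output above no longer has quotes around the fields, while the question's expected output keeps them. One possible variant, sketched here under assumptions, passes `quoting=csv.QUOTE_ALL` to the writer so every field is quoted again. The helper name `replace_newlines_keep_quotes` is hypothetical, and it works on strings instead of files only to keep the example self-contained:

```python
import csv
import io


def replace_newlines_keep_quotes(csv_text, replacement="**"):
    """Hypothetical helper: replace line breaks inside fields, re-quote all fields."""
    # newline="" keeps "\r\n" inside quoted fields exactly as it appears in the input.
    fin = io.StringIO(csv_text, newline="")
    fout = io.StringIO(newline="")
    reader = csv.reader(fin)
    # QUOTE_ALL wraps every field in double quotes on output.
    writer = csv.writer(fout, quoting=csv.QUOTE_ALL)
    for row in reader:
        # Handle "\r\n" first, then any remaining bare "\n".
        writer.writerow(
            [col.replace("\r\n", replacement).replace("\n", replacement) for col in row]
        )
    return fout.getvalue()


print(replace_newlines_keep_quotes('"multi\r\nline\r\ndata","more data"\r\n'))
```

The same `quoting=csv.QUOTE_ALL` argument can be added to the `csv.writer(fout)` call in the file-based code above. The writer ends each row with `\r\n` by default.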
Mark Tolonen
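I solved the problem in a very simple way. Create an output file and read the input file character by character. Write each character to the output file, but toggle a "replace mode" flag with the ~ operator whenever a " appears. While in replace mode, replace every \r\n with '' (nothing). Note that ~ only toggles truthiness on an integer flag that starts at 0 (~0 is -1 and ~-1 is 0); on a bool, both ~True and ~False are truthy, so `not` is the safer choice there.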
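
The approach described above might look roughly like the following. This is a minimal sketch, not the exact code that was used: the function name `replace_newlines_in_quotes` and its `replacement` parameter are assumptions made for illustration, and it uses `not` to toggle a boolean flag.

```python
def replace_newlines_in_quotes(text, replacement=""):
    """Hypothetical helper: replace line breaks that occur between double quotes."""
    out = []
    in_quotes = False  # "replace mode": True while inside a quoted field
    prev = ""
    for ch in text:
        if ch == '"':
            # Every double quote flips the mode; an escaped "" flips it twice.
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes and ch in "\r\n":
            # Treat "\r\n" as a single line break: skip the "\n" that follows a "\r".
            if not (ch == "\n" and prev == "\r"):
                out.append(replacement)
        else:
            out.append(ch)
        prev = ch
    return "".join(out)


print(repr(replace_newlines_in_quotes('"a\r\nb",c\r\n"d"\r\n')))
```

To apply this to files, open both the input and the output with `newline=''` so that Python does not translate the line endings before the function sees them. Unlike the `csv` module, this sketch keeps the quotes and commas exactly as they were, but it assumes every double quote in the file either delimits a field or is part of a doubled `""` escape.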

Athylus