Join split lines and add double quotes in a broken CSV

Question

I tried to read a CSV file with Python, but I noticed that the file is broken: some rows don't have double quotes.

Example

ProcessId,nid,CreatedDate,name,uid,Forum,cid,pid,hostname,Change,status,Thread,level,CommentOrder,bundle,deleted,lang,delta,Length_comment_text,field_data_comment_body,topic,isAnswer,CreatedAt,Updatedat,commentText,sticky,CommentRating,CommentRateTimes,Helpfull
1,1031762,2018-01-01 03:42:53,vimo,96977,LoRa®,1031762,0,204.2.166.150,2018-01-01 03:42:53,0,1031762,0,1031762,,0,und,0,1514,comment,How can RN2483 reach -146dBm sensitivity...,0,2018-01-01 03:42:53,2018-01-01 03:42:53,[p]
... if the SX1276 chip inside states to perform -146dBm with a bandwidth of 10.4KHz&#44; while the minimum bandwidth value supported by RN2483 API is 125KHz?

Hi everybody (happy new year!).

According to RN2483 datasheet&#44; its best sensitivity performance in LoRa modulation is -146dBm. However&#44; in order to achieve such a sensitivity&#44; the SX1276 transceiver inside the RN2483 module requires a receive bandwidth of 10.4KHz or less.
Unfortunately&#44; the RN2483 API does only support bw configuration of 500&#44; 250 and 125 KHz. Using a bw value of 125KHz the SX1276 transceiver should be capable to perform a maximum sensitivity of -136dBm.

Am I missing something&#44; or is there actually some trouble with RN2483 datasheet (or command interface) ?

[/p],0,0,0,0

This is how I tried to read the CSV:

import csv

with open(pathCsv, encoding="utf-16") as f:
    csv_reader = csv.DictReader(f)
    for row in csv_reader:
        print(row.get('commentText'))
        break  # only look at the first row

When I iterate over the CSV, I get this output:

print(row.get('commentText'))
>> [p]

I would like to know how to join the text back together and wrap it in double quotes.

This problem arises because your data are split over multiple lines. — DarkKnight, Dec 21 '22 at 16:57
Is there some way to join the multiple lines from the CSV? @Fred — Van Jake, Dec 21 '22 at 17:00
You may need to do some preprocessing to clean this up. I found a pretty [decent suggestion here](https://stackoverflow.com/a/43778440/2221001) on how to pull that off. — JNevill, Dec 21 '22 at 17:01
Just to be clear, the file is broken outside of the context of Python, right? You can't, for example, open it in LibreOffice? — JonSG, Dec 21 '22 at 17:01
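
For context on why the missing quotes matter: Python's `csv` module reads a field that spans several lines as a single value, as long as the field is wrapped in double quotes. A minimal sketch, using a made-up two-column sample rather than the real file:

```python
import csv
import io

# Hypothetical sample: the quoted field spans three physical lines
# and contains a comma.
sample = 'id,commentText\n1,"[p]\nfirst line, with a comma\n[/p]"\n'

rows = list(csv.DictReader(io.StringIO(sample)))
print(len(rows))                     # 1
print(repr(rows[0]['commentText']))  # '[p]\nfirst line, with a comma\n[/p]'
```

Without the quotes, the reader treats each physical line as its own row, which matches the `[p]` output shown in the question.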

score 1 · Answer 1 · answered Dec 21 '22 at 17:07

This works for the data shown in the question, where the file holds a header and a single record. If the file contains more than one record, a different approach is needed, because this code joins everything after the header into one row.

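from csv import DictReader

# encoding="utf-16" matches the file in the question; adjust for other files
with open(pathCsv, encoding="utf-16") as data:
    # first line is the header; every other line belongs to the single record
    columns, *content = map(str.strip, data)
    # joining with '' drops the line breaks inside the comment text
    merged = [columns, ''.join(content)]
    for r in DictReader(merged):
        print(r.get('commentText'))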

Output:

[p]... if the SX1276 chip inside states to perform -146dBm with a bandwidth of 10.4KHz&#44; while the minimum bandwidth value supported by RN2483 API is 125KHz?Hi everybody (happy new year!).According to RN2483 datasheet&#44; its best sensitivity performance in LoRa modulation is -146dBm. However&#44; in order to achieve such a sensitivity&#44; the SX1276 transceiver inside the RN2483 module requires a receive bandwidth of 10.4KHz or less.Unfortunately&#44; the RN2483 API does only support bw configuration of 500&#44; 250 and 125 KHz. Using a bw value of 125KHz the SX1276 transceiver should be capable to perform a maximum sensitivity of -136dBm.Am I missing something&#44; or is there actually some trouble with RN2483 datasheet (or command interface) ?[/p]

What if I have two rows? The first is the same as in the example, but the second row is correct: `1,1031766,2018-01-01 04:23:27,happyflyer,121746,MPLAB X IDE,1031766,0,72.249.195.150,2018-01-01 04:23:27,0,1031766,0,1031766,,0,und,0,668,comment,Unexpected Token ICD 3 ,0,2018-01-01 04:23:27,2018-01-01 04:23:27,[p]Hi there,[/p][p] [/p][p]Changing over from ASM to "C " thought I would be clever, programming a 12F675, the code works OK a lot written about[/p][p]but I am still not clear what to do , thanks for any ideas.[/p][p] [/p],0,0,0,0` — Van Jake, Dec 21 '22 at 17:29
@VanJake You'll need to implement a more complex parser for that scenario — DarkKnight, Dec 21 '22 at 17:46
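
One possible shape for such a parser, as a rough sketch rather than a tested solution for the real file. It rests on assumptions that go beyond the question: every record starts with two integers followed by a timestamp, and no column other than `commentText` contains a comma (in the sample, commas elsewhere appear as `&#44;`). The names `RECORD_START`, `repair_rows` and `split_record` are made up for illustration, and the sample data is shortened to five columns.

```python
import csv
import io
import re

# Assumption: every record starts with "<digits>,<digits>,<YYYY-MM-DD HH:MM:SS>,"
RECORD_START = re.compile(r'^\d+,\d+,\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},')


def repair_rows(lines, text_column='commentText'):
    """Yield the header, then one list of fields per logical record."""
    lines = iter(lines)
    header = next(lines).strip().split(',')
    yield header
    before = header.index(text_column)   # number of fields left of the text
    after = len(header) - before - 1     # number of fields right of the text

    def split_record(buffer):
        record = '\n'.join(buffer).strip()
        head = record.split(',', before)      # last item: text plus trailing fields
        tail = head.pop().rsplit(',', after)  # first item: the text itself
        return head + tail

    buffer = []
    for line in lines:
        line = line.rstrip('\r\n')
        if not line and not buffer:
            continue  # skip blank lines before the first record
        if buffer and RECORD_START.match(line):
            yield split_record(buffer)  # a new record begins: flush the old one
            buffer = []
        buffer.append(line)
    if buffer:
        yield split_record(buffer)


# Hypothetical, shortened sample in the same shape as the question's data.
broken = (
    'ProcessId,nid,CreatedDate,commentText,sticky\n'
    '1,10,2018-01-01 03:42:53,[p]\n'
    'line one\n'
    '\n'
    '[/p],0\n'
    '2,11,2018-01-01 04:23:27,[p]Hi there, "C " ok[/p],0\n'
)

# csv.writer adds the double quotes wherever a field needs them.
fixed = io.StringIO()
csv.writer(fixed).writerows(repair_rows(io.StringIO(broken)))

rows = list(csv.DictReader(io.StringIO(fixed.getvalue())))
print(len(rows))                     # 2
print(repr(rows[0]['commentText']))  # '[p]\nline one\n\n[/p]'
```

For the real file, the source would presumably be opened with `encoding="utf-16"` as in the question, and the destination with `newline=''` so that `csv.writer` controls the line endings. If a continuation line ever happens to look like a record start, the regular expression would need tightening.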
