
I have a JSON file that contains the text \u0001\u0000\u0000\u0000FILE. When I read this file and write it to a CSV file, the same text is written as SOHNULNULNULFILE. When I then try to read that CSV file, I get this error:

_csv.Error: line contains NULL byte

I think this is related to some encoding issue.

How can I write and read the CSV file so that the text stays the same as in the source JSON, without any error?

I am using Python 3.6.4.

mastisa
Vanaja Jayaraman
  • Could you post a code snippet for this? – girish946 Jun 20 '18 at 07:44
  • You may find this helpful. This demonstrates how to encode the Unicode to utf-8 while reading csv. https://stackoverflow.com/a/17246997/2580412 – girish946 Jun 20 '18 at 07:47
  • @girish946 I am using Python 3.6.4, in which there is no function named encode() as given in the link. Sorry, I also cannot provide a code snippet. – Vanaja Jayaraman Jun 20 '18 at 08:35
  • What do you want to happen with the null bytes? The csv module can't handle them. Would it be acceptable to filter them out after reading the JSON? – L3viathan Jun 20 '18 at 08:37
  • When I try to read the created CSV file, it shows the error mentioned above. I want to read that CSV without error and create a JSON file from it. I just found that the string "\u0001\u0000\u0000\u0000FILE" is converted to '\x01\x00\x00\x00FILE' by json.loads. Maybe this is why I am getting the NULL byte error mentioned above. – Vanaja Jayaraman Jun 20 '18 at 09:38
  • It is a real problem, but you should have posted a [mcve] to help others to reproduce. – Serge Ballesta Jun 20 '18 at 10:07

1 Answer


That is not an encoding problem. It is just a limitation/bug in the csv module: it is able to write fields containing null bytes, but cannot read them.

An issue is currently open for this on the Python bug tracker. Unfortunately, it has not been updated for 2 years and has only normal priority.

That means that the only answer is: don't. You cannot use the Python csv module to handle fields containing null bytes.
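
If the data still has to go through a CSV file, one possible workaround is to keep the null characters away from the csv module by escaping them before writing and restoring them after reading. The following is only a minimal sketch of that idea: the helper names `escape_nul` and `unescape_nul` and the `\0` escape convention are assumptions made for illustration, not part of the csv module, and an in-memory buffer stands in for the real file.

```python
import csv
import io
import json
import re


def escape_nul(text):
    # Hypothetical helper: double existing backslashes first so the
    # escaping stays reversible, then write each NUL as backslash + "0".
    return text.replace("\\", "\\\\").replace("\x00", "\\0")


def unescape_nul(text):
    # Hypothetical helper: reverse escape_nul(). A backslash followed by
    # "0" becomes NUL; a doubled backslash becomes a single backslash.
    return re.sub(
        r"\\(.)",
        lambda m: "\x00" if m.group(1) == "0" else m.group(1),
        text,
    )


# The string from the question, as json.loads returns it.
original = json.loads('"\\u0001\\u0000\\u0000\\u0000FILE"')

# Write the row with escaped fields; the buffer stands in for the CSV file.
buffer = io.StringIO()
csv.writer(buffer).writerow([escape_nul(v) for v in ["id-1", original]])

# Read the row back and restore the NUL characters.
buffer.seek(0)
row = [unescape_nul(v) for v in next(csv.reader(buffer))]
restored = row[1]

# json.dumps writes the control characters back as \u escapes.
print(json.dumps(restored))
```

The CSV text itself never contains a null character, so the reader has nothing to reject, and the restored value compares equal to the original string. Any other program that reads the same CSV file would need to know about the escaping convention.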

Serge Ballesta