
I am building a web app where I can upload a JSON file, update it, and then download it. The output JSON is not valid because some characters change during the process. I can't tell where the problem is, because even when I only upload and then download without making any changes, the JSON is still not valid.

This is how I read the uploaded JSON:

readFile: function () {
  var reader = new FileReader();
  reader.onload = function(event) {
    this.json = JSON.parse(event.target.result);
  }.bind(this);
  reader.readAsText(this.file);
}

I can then edit the json object (or leave it unchanged) and download it using JSON.stringify(json).
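The download step is not shown above, so the following is only a minimal sketch of one way to make the output encoding explicit, assuming the download can be built from raw bytes. The helper name jsonToUtf8Bytes is hypothetical and not part of the original code.

```javascript
// Hypothetical helper (not from the original code): serialize a value to JSON
// text, then encode that text as UTF-8 bytes.
function jsonToUtf8Bytes(value) {
  const text = JSON.stringify(value);
  // TextEncoder always produces UTF-8; it has no other encoding option.
  return new TextEncoder().encode(text);
}

// Example value containing an en dash (U+2013), one of the affected characters.
const bytes = jsonToUtf8Bytes({ a: '\u2013' });

// In a browser, these bytes could then be wrapped for download, for example:
// new Blob([bytes], { type: 'application/json;charset=utf-8' })
```

Building the download from bytes rather than from a string leaves no room for an implicit encoding choice on the way out.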

When I try to read or validate the output JSON, I get errors about invalid characters, for example:

  • "Invalid characters in string. Control characters must be escaped." for some lines in my editor.
  • "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xac in position X: invalid start byte" when I try to load it in Python with: with open('output.json') as json_file: data = json.load(json_file)

Does using JSON.parse followed by JSON.stringify modify the encoding or structure of the JSON? How can I avoid this?
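A quick way to test this in isolation is a round trip on a small string, with no file involved. This is a sketch with made-up sample data, not the real file.

```javascript
// Sample JSON text using \uXXXX escapes (made-up data for illustration).
const source = '{"text":"\\u2013 \\u2014 \\u201d \\u00e7"}';

const parsed = JSON.parse(source);     // escapes become real characters
const output = JSON.stringify(parsed); // characters are written out literally
const reparsed = JSON.parse(output);   // parse the output again

// True if the string content survived parse -> stringify -> parse unchanged.
const roundTripOk = reparsed.text === parsed.text;
console.log(roundTripOk);
```

JSON.parse and JSON.stringify work on JavaScript strings, not bytes, so they have no encoding of their own. If this check passes, the corruption is more likely to come from how the bytes are decoded on upload or encoded on download.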

UPDATE:

The original file can contain characters such as \u2013, \u2014, \u201d and \u00e7, but these are turned into things like � or invisible characters in the output JSON, which I guess is what makes it invalid.

Boussadjra Brahim
erup

1 Answer


Try adding 'UTF-8' as the second argument to readAsText, as follows:

   reader.readAsText(this.file,'UTF-8');
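If it is unclear whether the uploaded file is valid UTF-8 in the first place, one option is to read it as raw bytes and decode strictly, so that invalid sequences raise an error instead of being silently replaced with �. This is a rough sketch; decodeUtf8Strict is a hypothetical helper name, and the browser wiring in the comments is an assumption based on the readFile function in the question.

```javascript
// Hypothetical helper: decode raw bytes as UTF-8 and throw a TypeError on
// invalid sequences, instead of substituting U+FFFD as the default mode does.
function decodeUtf8Strict(bytes) {
  const decoder = new TextDecoder('utf-8', { fatal: true });
  return decoder.decode(bytes);
}

// Browser-side usage sketch (assumed wiring, mirroring readFile above):
// reader.onload = function (event) {
//   var text = decodeUtf8Strict(new Uint8Array(event.target.result));
//   this.json = JSON.parse(text);
// }.bind(this);
// reader.readAsArrayBuffer(this.file);
```

If the strict decode throws, the source file is probably saved in another encoding, and that encoding's name would be the one to pass to readAsText.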
Boussadjra Brahim
  • Unfortunately that didn't work, I guess because UTF-8 is already the default encoding of `readAsText()` – erup May 09 '19 at 18:25