I am trying to read a JSON file with Python. This file is described by the authors as not strict JSON. In order to convert it to strict JSON, they suggest this approach:

import gzip
import json

def parse(path):
    g = gzip.open(path, 'r')
    for l in g:
        yield json.dumps(eval(l))

However, not being familiar with Python, I can run the script but cannot produce an output file containing the cleaned JSON. How should I modify the script so that it writes a new JSON file? I have tried this:

import json

class Amazon():

    def parse(self, inpath, outpath):
        g = open(inpath, 'r')
        out = open(outpath, 'w')
        for l in g:
            yield json.dumps(eval(l), out)

amazon = Amazon()
amazon.parse("original.json", "cleaned.json")

but the output is an empty file. Any help is more than welcome.

user299791
  • Assuming that `json.dumps(eval(l))` returns a string, change `yield json.dumps(eval(l), out)` to `out.write(json.dumps(eval(l)))`. Make sure you really really really trust your input. With `eval` you are running arbitrary data as code. See [Is Using eval In Python A Bad Practice?](http://stackoverflow.com/questions/1832940/is-using-eval-in-python-a-bad-practice) – Steven Rumbalski Apr 29 '15 at 14:52
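
If every line of the file is a plain Python literal (dicts, lists, strings, numbers), `ast.literal_eval` from the standard library may be a safer substitute for `eval`, because it rejects anything that is not a literal instead of executing it. A minimal sketch under that assumption; `to_strict_json` is a hypothetical helper name, not something from the original script:

```python
import ast
import json


def to_strict_json(line):
    """Convert one line holding a Python literal (e.g. a single-quoted dict) to strict JSON."""
    # literal_eval accepts only literals; it raises ValueError or SyntaxError
    # on anything else rather than running it as code.
    record = ast.literal_eval(line.strip())
    return json.dumps(record)


if __name__ == "__main__":
    print(to_strict_json("{'asin': 'B000', 'price': 9.99}"))
```

Whether this works for the whole file depends on the data actually containing nothing but literals, which is worth checking on a sample first.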

2 Answers

import json

class Amazon():

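    def parse(self, inpath, outpath):
        # eval() runs each line as Python code; only use it on trusted input.
        with open(inpath, 'r') as g, open(outpath, 'w') as fout:
            for l in g:
                # Write one JSON object per output line.
                fout.write(json.dumps(eval(l)) + '\n')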

amazon = Amazon()
amazon.parse("original.json", "cleaned.json")
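
A likely reason the attempt in the question writes nothing: a function that contains `yield` is a generator function, so calling it only creates a generator object, and none of its body runs until something iterates over it. The sketch below keeps the generator and drives it from a separate writer. `parse_lines` and `write_clean` are hypothetical names, and `ast.literal_eval` is swapped in for `eval` on the assumption that each line is a plain Python literal:

```python
import ast
import json


def parse_lines(inpath):
    """Yield one strict-JSON string per input line (lazy: runs only when iterated)."""
    with open(inpath, "r") as src:
        for line in src:
            if line.strip():  # skip blank lines
                yield json.dumps(ast.literal_eval(line.strip()))


def write_clean(inpath, outpath):
    """Iterate over the generator and write one JSON object per line."""
    with open(outpath, "w") as out:
        for json_line in parse_lines(inpath):
            out.write(json_line + "\n")
```

Calling `write_clean("original.json", "cleaned.json")` would then produce the output file. For the gzipped original, `gzip.open(path, "rt")` could take the place of `open(inpath, "r")`.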
Ionut Hulub

Another, shorter way of doing this:

import json

class Amazon():
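    def parse(self, readpath, writepath):
        # eval() runs each line as Python code; only use it on trusted input.
        with open(readpath) as g, open(writepath, 'w') as fout:
            for l in g:
                json.dump(eval(l), fout)
                fout.write('\n')  # one JSON object per line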

amazon = Amazon()
amazon.parse("original.json", "cleaned.json")

When handling JSON data, it is better to use the json module's own functions: json.dump(obj, output_file) to write JSON to a file and json.load(input_file) to read it back. Both take an open file object, not a path. That way the data stays valid JSON both when saving and when reading.
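
One caveat worth illustrating: `json.load` expects a file holding a single JSON document, so a file with one object per line is usually read back line by line with `json.loads`. A minimal round-trip sketch, where `dump_records` and `load_records` are hypothetical helper names:

```python
import json


def dump_records(records, path):
    """Write each record as one JSON document per line."""
    with open(path, "w") as fout:
        for record in records:
            json.dump(record, fout)
            fout.write("\n")


def load_records(path):
    """Read the file back, parsing each non-empty line as its own JSON document."""
    with open(path) as fin:
        return [json.loads(line) for line in fin if line.strip()]
```

If the cleaned file should instead be a single JSON array, collecting the records in a list and calling `json.dump` once on that list would let `json.load` read it back in one call.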

For very large amounts of data (say 1k+ records), consider using the pandas module.

Amit Tripathi
  • thanks, yes a shorter way but that class is going to grow so I prefer that approach... you should change writefile to writepath – user299791 Apr 29 '15 at 15:13
  • Well, I was talking about the content of the function. You can insert this function in your class. I would suggest using it because it helps you get JSON data back when you read the file again. – Amit Tripathi Apr 29 '15 at 15:17
  • thanks for the clarification, I am so newbie to Python that I don't really understand your point... on a minor note, I think you have to change open(writefile, 'w') to open(writepath, 'w') – user299791 Apr 29 '15 at 15:25