I have the code below to iterate through the values in my CSV file. Input data (Sample.csv):

name,city
jack,nj
matt,ny

The output should be JSON. Required output:

[
{"name": "jack","city": "PA"},
{"name": "matt","city": "CA"}
]

Output from code:

[{"name,city": "jack,PA"};{"name,city": "matt,CA"};]

Code sample:

#!/usr/bin/python

import json
import csv
csvfile = open('sample.csv', 'r')
jsonfile = open('sample.csv'.replace('.csv','.json'), 'w')

jsonfile.write('{\n[\n')
fieldnames = csvfile.readline().replace('\n','').split(';')
reader = csv.DictReader(csvfile, fieldnames, delimiter=';')

from collections import OrderedDict
for row in reader:
    json.dump(OrderedDict([(f, row[f]) for f in fieldnames]), jsonfile, indent=4)
    jsonfile.write(';\n')
    jsonfile.write(']\n}')

The final output does not split each row into separate key-value pairs.

jb04
  • FYI you don't need to deal with `fieldnames` directly if the first line of the CSV file *is* the field names. Also it's not clear why you are manually mangling the JSON in the output file. Also, given that the delimiter is clearly `,`, why do you keep using `;`?! – jonrsharpe Jan 26 '17 at 21:15
  • I am new to Python. I tried other examples, but they append each row to a list, which takes a lot of time when converting files over 1 GB. I would like to append to the JSON output file rather than keep everything in memory. This is the code that got me closest to what I needed: http://stackoverflow.com/a/32158933/884808 – jb04 Jan 26 '17 at 21:20
  • But you're closing the array after every item in it, and inexplicably using semicolons within it. If you're going to manually write JSON, I'd recommend being familiar with the valid syntax. – jonrsharpe Jan 26 '17 at 21:22
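
To illustrate the points raised in the comments above, here is a minimal sketch, assuming the file is comma-delimited and its first line holds the field names. When `fieldnames` is omitted, `csv.DictReader` takes the keys from the first row, and its default delimiter is `,`. The function name `rows_as_dicts` and the inline sample text are illustrative only and are not part of the original post.

```python
import csv
import io
import json


def rows_as_dicts(csv_text):
    """Parse comma-delimited text whose first line holds the field names."""
    # With no fieldnames argument, DictReader uses the first row as the keys;
    # the default delimiter is ',' so it does not need to be passed.
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Inline stand-in for the sample file from the question (assumption).
sample = "name,city\njack,nj\nmatt,ny\n"
records = rows_as_dicts(sample)
print(json.dumps(records))
```

This version builds a list only to keep the illustration short; for very large files the rows would be consumed one at a time instead.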

1 Answer


I was able to achieve what I needed. This may not be the best possible solution, but it does what I need for now.

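import sys, getopt

ifile = ''
ofile = ''
format = ''

# Parse the command-line options. A letter followed by ':' takes a value,
# so "f" needs a trailing colon for -f to receive the format name.
try:
    myopts, args = getopt.getopt(sys.argv[1:], "i:o:f:")
except getopt.GetoptError:
    # getopt raises on unknown options, so the usage message belongs here
    print("Usage: %s -i input -o output -f format" % sys.argv[0])
    sys.exit(2)

for o, a in myopts:
    if o == '-i':
        ifile = a
    elif o == '-o':
        ofile = a
    elif o == '-f':
        format = a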

#Reset the output file for each run
reset = open(ofile,"w+")
reset.close()

#Read the CSV in column order and write each row as a JSON object

from collections import OrderedDict
import csv
import json
import os

# Open the output once instead of reopening it for every row
with open(ifile, 'r') as f, open(ofile, 'a') as m:
    reader = csv.reader(f, delimiter=',', quotechar='"')
    headerlist = next(reader)
    for row in reader:
        # Pair each value with its column header, preserving column order
        d = OrderedDict(zip(headerlist, row))
        if format == "pretty":
            m.write(json.dumps(d, sort_keys=False, indent=4, separators=(',', ': '), ensure_ascii=False))
        else:
            m.write(json.dumps(d))
        # Every object is followed by a comma; the last one is trimmed afterwards
        m.write(',\n')


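#Remove the trailing delimiter

# Binary mode, so that positions are plain byte offsets
file = open(ofile, "rb+")
file.seek(0, os.SEEK_END)
pos = file.tell() - 1
# Walk backwards from the end of the file until the last comma is found
while pos > 0:
    file.seek(pos, os.SEEK_SET)
    if file.read(1) == b",":
        break
    pos -= 1

if pos > 0:
    # Cut the file at the last comma, then restore the final newline
    file.seek(pos, os.SEEK_SET)
    file.truncate()
file.write(b'\n')
file.close()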
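
As an alternative to trimming the trailing comma afterwards, the separator can be written before every object except the first. The sketch below is one possible variant, not the code from this answer: it streams one row at a time and also wraps the objects in `[` and `]` so the result parses as a single JSON array. The function name `csv_to_json_array` and the in-memory `io.StringIO` objects are assumptions made to keep the example self-contained.

```python
import csv
import io
import json


def csv_to_json_array(src, dst):
    """Stream rows from the CSV file object `src` to `dst` as one JSON array."""
    reader = csv.DictReader(src)  # keys come from the header row
    dst.write('[\n')
    first = True
    for row in reader:
        # Write the separator before every object except the first,
        # so nothing has to be removed once the loop is finished.
        if not first:
            dst.write(',\n')
        dst.write(json.dumps(row))
        first = False
    dst.write('\n]\n')


# In-memory stand-ins for the input and output files (assumption).
src = io.StringIO("name,city\njack,nj\nmatt,ny\n")
dst = io.StringIO()
csv_to_json_array(src, dst)
result = dst.getvalue()
print(result)
```

With real files, the same function could be called with `open(ifile, newline='')` and `open(ofile, 'w')`; only one row is held in memory at a time.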
jb04