6

I am new to python, and I am having to convert a csv file to json in following format:

CSV File :

firstname, lastname, email, customerid, dateadded, customerstatus
john, doe, john.doe@do.com, 124,26/11/18,active
jane, doe, jane.doe@do.com, 125,26/11/18,active

JSON format:

{
    firstname: "John",
    lastname: "Doe",
    emailOrPhone: "john.doe@do.com",
    extraFields: [{
            name: "customerid",
            value: "124"
        },
        {
            name: "dateadded",
            value: "26/11/18"
        },
        {
            name: "dateadded",
            value: "26/11/18"
        }
    ]
}, {
    firstname: "Jane",
    lastname: "Doe",
    emailOrPhone: "Jane.doe@do.com",
    extraFields: [{
            name: "customerid",
            value: "125"
        },
        {
            name: "dateadded",
            value: "26/11/18"
        },
        {
            name: "dateadded",
            value: "26/11/18"
        }
    ]
}


current code I am using:
import requests
import json
import time
import csv
import json
import glob
import os
import logging


for filename in glob.glob('D:\\api\\Extract.csv'):
    csvfile = os.path.splitext(filename)[0]
    jsonfile = csvfile + '.json'

    with open(csvfile+'.csv') as f:
        reader = csv.DictReader(f)
        rows = list(reader)

    with open(jsonfile, 'w') as f:
        json.dump(rows, f)

url = 'api_url'

with open("D:\\api\\Extract.json", "r") as read_file:
    data = json.load(read_file)

    for item in data:


        headers = {"Authorization" : "key", "Content-Type" : "application/json"}

        r = requests.post(url, data= json.dumps(item), headers= headers)



        logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(message)s',
                    handlers=[logging.FileHandler("D:\\api\\log_file.log"),
                              logging.StreamHandler()])

I can produce parent values in json, but I am not sure how do I get sub-nodes and parse column name as values and iterate through entire file like that. Above code converts csv to simple json objects, I want to achieve nested objects. I am thinking maybe appending would be the solution, but not sure how to pass column as value and corresponding data as value.

RoadRunner
  • 25,803
  • 6
  • 42
  • 75
jdag
  • 149
  • 3
  • 11
  • Can you share your code so that we can update it from there – Krishna Nov 26 '18 at 02:55
  • Hi I have added my code in the question above. That might not help as I am converting csv to json straight away. – jdag Nov 26 '18 at 03:19
  • Possible duplicate of [How to convert CSV to JSON?](https://stackoverflow.com/questions/46091769/how-to-convert-csv-to-json) – SuperShoot Nov 26 '18 at 03:20

3 Answers3

6

You can use csv.DictReader which gives you access to the column name as you're iterating each row. Then you can build each item as follows:

import json
import csv

primary_fields = ['firstname', 'lastname', 'email']
result = []
with open('mydata.csv') as csv_file:
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    for row in reader:
        d = {k: v for k, v in row.items() if k in primary_fields}
        d['extraFields'] = [{'name': k, 'value': v} for k, v in row.items() if k not in primary_fields]
        result.append(d)

print(json.dumps(result, indent=2))

Output

[
  {
    "firstname": "john",
    "lastname": "doe",
    "email": "john.doe@do.com",
    "extraFields": [
      {
        "name": "customerid",
        "value": "124"
      },
      {
        "name": "dateadded",
        "value": "26/11/18"
      },
      {
        "name": "customerstatus",
        "value": "active"
      }
    ]
  },
  {
    "firstname": "jane",
    "lastname": "doe",
    "email": "jane.doe@do.com",
    "extraFields": [
      {
        "name": "customerid",
        "value": "125"
      },
      {
        "name": "dateadded",
        "value": "26/11/18"
      },
      {
        "name": "customerstatus",
        "value": "active"
      }
    ]
  }
]

If you want to set custom field names in your final json (e.g. emailOrPhone for email), you can always manually set field names for d and set the appropriate value

slider
  • 12,810
  • 1
  • 26
  • 42
3

At little more complicated than needs to be, but you can try building your JSON array as you read in your values from the csv file, then output your result to a .json file with json.dump at the end:

from csv import reader
from json import dump

top_fields = ["firstname", "lastname", "email"]
extra_fields = ["customerid", "dateadded", "customerstatus"]

data = []
with open("customers.csv") as csv_in:
    csv_reader = reader(csv_in)

    # Get headers
    headers = list(map(str.strip, next(csv_reader)))

    for row in csv_reader:
        json_object = {}

        # Build dictionary for each row
        row_map = dict(zip(headers, map(str.strip, row)))

        # Add in top fields first
        for top in top_fields:
            json_object[top] = row_map[top]

        # Then add in extra fields
        for extra in extra_fields:
            json_object.setdefault("extraFields", []).append(
                {"name": extra, "value": row_map[extra]}
            )

        data.append(json_object)

with open("customers.json", "w") as fp:
    dump(data, fp, indent=4, sort_keys=True)

Which gives the following customers.json:

[
    {
        "email": "john.doe@do.com",
        "extraFields": [
            {
                "name": "customerid",
                "value": "124"
            },
            {
                "name": "dateadded",
                "value": "26/11/18"
            },
            {
                "name": "customerstatus",
                "value": "active"
            }
        ],
        "firstname": "john",
        "lastname": "doe"
    },
    {
        "email": "jane.doe@do.com",
        "extraFields": [
            {
                "name": "customerid",
                "value": "125"
            },
            {
                "name": "dateadded",
                "value": "26/11/18"
            },
            {
                "name": "customerstatus",
                "value": "active"
            }
        ],
        "firstname": "jane",
        "lastname": "doe"
    }
]
RoadRunner
  • 25,803
  • 6
  • 42
  • 75
0
import csv
import sys
import json

#EDIT THIS LIST WITH YOUR REQUIRED JSON KEY NAMES
fieldnames=["firstname","secondname","age"]

def convert(filename):
  csv_filename = filename[0]
  print "Opening CSV file: ",csv_filename 
  f=open(csv_filename, 'r')
  csv_reader = csv.DictReader(f,fieldnames)
  json_filename = csv_filename.split(".")[0]+".json"
  print "Saving JSON to file: ",json_filename
  jsonf = open(json_filename,'w') 
  data = json.dumps([r for r in csv_reader])
  jsonf.write(data) 
  f.close()
  jsonf.close()

if __name__=="__main__":
  convert(sys.argv[1:])

USAGE:

python csv2json.py myCSVfile.txt

where myCSVfile.txt it's your CSV file (name it as you prefer).

It will create a JSON array in a file named myCSVfile.json

That's all.

Snivio
  • 1,741
  • 1
  • 17
  • 25