0

Json Format:

[{"studios": [{"studioId": 539}, {"studioId": 540}], 
"id": 843, 
"title": "In the Mood for Love",
"crew": [{"personId": 12453, "department": "Directing", "job": "Director"}, {"personId": 12454, "department": "Sound", "job": "Music"}, {"personId": 12455, "department": "Sound", "job": "Original Music Composer"}, {"personId": 1357, "department": "Camera", "job": "Director of Photography"}, {"personId": 12453, "department": "Writing", "job": "Screenplay"}, {"personId": 12453, "department": "Production", "job": "Producer"}, {"personId": 21909, "department": "Production", "job": "Executive Producer"}, {"personId": 45818, "department": "Editing", "job": "Editor"}, {"personId": 232804, "department": "Camera", "job": "Director of Photography"}, {"personId": 12667, "department": "Camera", "job": "Director of Photography"}],
"releases": [{"releasedate": "2000-11-08", "country": "FR"}, {"releasedate": "2000-05-22", "country": "US"}]
"genres": ["Drama", "Romance"], 
"releasedate": "2000-05-22", 
"cast": [{"personId": 1337, "character": "Chow Mo-Wan", "order": 0}, {"personId": 1338, "character": "Su Li-Zhen", "order": 1}, {"personId": 12674, "character": "Ah Ping", "order": 2}, {"personId": 12462, "character": "Mrs. Suen", "order": 4}, {"personId": 12463, "character": "Mr. Ho", "order": 5}, {"personId": 12464, "character": "", "order": 6}, {"personId": 12465, "character": "", "order": 7}, {"personId": 12466, "character": "Mr. Chan", "order": 8}, {"personId": 12467, "character": "The Amah", "order": 9}, {"personId": 12468, "character": "", "order": 10}, {"personId": 12469, "character": "", "order": 11}, {"personId": 12470, "character": "Mrs. Chow", "order": 12}], 
"userrating": 7.6}]

I am trying to convert it to .csv file. But I am getting errors. I want the csv file to be normalized in 1NF so that I can directly transfer to some database

MyCode:

import json
import csv
with open("data3.json") as json_file, open("data3.csv", "w",encoding='utf-8') as csv_file:
    csv_file = csv.writer(csv_file)
    a = json.load(json_file)
    csv_file.writerow(["StudiosId", "Id", "Title","personId","Department","Job","ReleaseDate","PosterLink","Genres","Cast","Runtime"])
    for item in a:
        csv_file.writerow([item["studios"], item["id"], item["title"],item["crew"][0],item["crew"][1],item["crew"][2],item["poster"],item["genres"],item["releasedate"],item["cast"],item["runtime"]])

Error:

Traceback (most recent call last):
File "<stdin>", line 7, in <module>
IndexError: list index out of range
tzaman
  • 46,925
  • 11
  • 90
  • 115
Blank
  • 81
  • 11
  • The problem is that you're using Python 2 which doesn't support the `encoding` parameter (it's new in Python 3). Either install Python 3 or take a look at http://stackoverflow.com/questions/10971033/backporting-python-3-openencoding-utf-8-to-python-2. – maxymoo Oct 26 '15 at 02:14
  • I tried with Python 3.5.0 too. It gives the following error: Traceback (most recent call last): File "", line 7, in TypeError: list indices must be integers or slices, not str – Blank Oct 26 '15 at 02:25

1 Answers1

0

You're JSON structure is incorrect (missing comma), the code looks ok. Try this instead:

[{"studios": [{"studioId": 539}, {"studioId": 540}], 
"id": 843, 
"title": "In the Mood for Love",
"crew": [{"personId": 12453, "department": "Directing", "job": "Director"}, {"personId": 12454, "department": "Sound", "job": "Music"}, {"personId": 12455, "department": "Sound", "job": "Original Music Composer"}, {"personId": 1357, "department": "Camera", "job": "Director of Photography"}, {"personId": 12453, "department": "Writing", "job": "Screenplay"}, {"personId": 12453, "department": "Production", "job": "Producer"}, {"personId": 21909, "department": "Production", "job": "Executive Producer"}, {"personId": 45818, "department": "Editing", "job": "Editor"}, {"personId": 232804, "department": "Camera", "job": "Director of Photography"}, {"personId": 12667, "department": "Camera", "job": "Director of Photography"}],
"releases": [{"releasedate": "2000-11-08", "country": "FR"}, {"releasedate": "2000-05-22", "country": "US"}],
"genres": ["Drama", "Romance"], 
"releasedate": "2000-05-22", 
"cast": [{"personId": 1337, "character": "Chow Mo-Wan", "order": 0}, {"personId": 1338, "character": "Su Li-Zhen", "order": 1}, {"personId": 12674, "character": "Ah Ping", "order": 2}, {"personId": 12462, "character": "Mrs. Suen", "order": 4}, {"personId": 12463, "character": "Mr. Ho", "order": 5}, {"personId": 12464, "character": "", "order": 6}, {"personId": 12465, "character": "", "order": 7}, {"personId": 12466, "character": "Mr. Chan", "order": 8}, {"personId": 12467, "character": "The Amah", "order": 9}, {"personId": 12468, "character": "", "order": 10}, {"personId": 12469, "character": "", "order": 11}, {"personId": 12470, "character": "Mrs. Chow", "order": 12}], 
"userrating": 7.6}]

There is also no poster or runtime keys, in the sample you provided.

EDIT update.

import csv
import json

with open('data.json', 'r') as f, open("data3.csv", "w") as csv_file:
    csv_file = csv.writer(csv_file)
    a = json.load(f)
    csv_file.writerow(["StudiosId", "Id", "Title","personId","Department","Job","ReleaseDate","PosterLink","Genres","Cast","Runtime"])
    for item in a:
        csv_file.writerow([item["studios"], item["id"], item["title"],item["crew"][0]['personId'],item["crew"][0]['department'],item["crew"][0]['job'],item["genres"],item["releasedate"],item["cast"]])
ajsp
  • 2,512
  • 22
  • 34
  • It is not able to separate "crew" and "cast" as they have mutiple values. – Blank Oct 26 '15 at 04:01
  • @Blank updated, this works fine on my end, but if not post a more complete example. – ajsp Oct 26 '15 at 04:22
  • I want the crew data in 3 different table, First being the personId,second being the Job, and lastly the Department. Now it is just separating the director, music ,writer etc. – Blank Oct 26 '15 at 05:10
  • @Blank Then you will need to rethink the method you are using, and that has nothing to do with the question asked. Why don't you go straight to database? It would save you an awful lot of trouble. – ajsp Oct 26 '15 at 05:20
  • I was using this approach as I felt importing csv file into any database will be lot simpler. – Blank Oct 26 '15 at 05:22
  • @Blank ok I'll have another look at it. – ajsp Oct 26 '15 at 05:27
  • @Blank ok there you go, you'll still have to update the missing keys. – ajsp Oct 26 '15 at 05:47