-1

I have around 800 json files in a folder , i want to convert all those files to its csv formats.And these must be done without specifying the names of the file.My json files are very nested so need an idea to convert it to its csv formats.

Sample of json file

[
{
'Extract_date': '2019-06-04'
},
{
'lat,lon': '0,0'
},
{
'current': [
{
    'source': 'sg',
    'value': 99.04
  },
  {
    'source': 'meto',
    'value': 99.04
  }
],
 'swell': [
  {
    'source': 'sg',
    'value': 192.28
  },
  {
    'source': 'noaa',
    'value': 201.69
  },

],
'Height': [
  {
    'source': 'sg',
    'value': 0.4
  },
  {
    'source': 'noaa',
    'value': 0.36
  },

],
'time': '2019-06-04T00:00:00+00:00',
'wind': [
  {
    'source': 'sg',
    'value': 153.43
  },
  {
    'source': 'noaa',
    'value': 156.89
  },
  {
    'source': 'icon',
    'value': 153.43
  }
],
'Speed': [
  {
    'source': 'sg',
    'value': 4.92
  },
  {
    'source': 'noaa',
    'value': 5.46
  },
  {
    'source': 'icon',
    'value': 4.92
 }
]}]

The code should retrieve the json files from the folder and then generate the csv files for the json files that is i need the flattened structure.

the code which i tried is just for one file but it displays the data as just one row.

import os
import pandas as pd
import json
import numpy as np
data = []
os.chdir('/home/Documents/04.06.2019')
with open('new.json', encoding="utf8") as data_file: 
for line in data_file:
data.append(json.loads(line))
dataframe = pd.DataFrame(data)        
dataframe.to_csv("filename.csv", encoding='utf-8',index= False)
redhima
  • 19
  • 1
  • 9
  • what did you try? Show your code. Show example of flattened structure. – furas Jun 06 '19 at 09:26
  • Please describe what you've done and where you're getting stuck. As written, it comes of like you're asking for someone to write a utility for you, which is completely off topic. – outis Jun 06 '19 at 09:27
  • What's desired output? Check jq and its @csv formatter – lojza Jun 06 '19 at 09:30
  • the code which i have tried is for just one file , but it gives me the data as just one row. – redhima Jun 06 '19 at 09:33
  • The desired output is json files must be converted to its csv format @lojza – redhima Jun 06 '19 at 09:39

1 Answers1

1

you can get access to your .json/.csv files like shown in this post: How can I convert JSON to CSV?

To proceed all files, you can iterate over every file in your folder like

import os

INPUT_PATH ="...\path\to\file"

for file in os.listdir(INPUT_PATH):
    if file.endswith('.json') or file.endswith('.JSON'):
        INPUT_FILE = os.path.join(INPUT_PATH, file)

        #File IO

_______EDIT_________

There are problems with your json format:

-Strings have to be surrounded by double quotes "str" instead of single quotes 'str'. To escape this problem you could replace the quotes of your loaded json string with

dataframe = json.loads(JSON_STRING.replace("'", '"'))

-At some lines (line 29, 40, of your json, there are too many commas, e.g.:

{
    'source': 'noaa',
    'value': 201.69
  },                   #<---!!!

],
'Height': [
  {
    'source': 'sg',
    'value': 0.4
  },

Are these json files written by a monitoring system? --> do they all have the same format?

In this case you could (at least) modify the formatting problems of your 600 files codewise..

Please check out the json module docs (https://docs.python.org/3/library/json.html) to get information about how to access your nested informations.

Mig B
  • 637
  • 1
  • 11
  • 19