0

I'm trying to convert a JSON file to CSV format (in memory), so that I can pass it to another Transformer in Mulesoft. Here is a snippet of the JSON:

[
{
"observationid": 1,
"fkey_observation": 1,
"value": 1,
"participantid": null,
"uom": "ppb",
"finishtime": 1008585047000,
"starttime": 1008581447000,
"observedproperty": "NO2",
"measuretime": 1008581567000,
"measurementid": 1,
"longitude": 3.1415,
"identifier": "Test-1",
"latitude": 10
},
{
"observationid": 1,
"fkey_observation": 1,
"value": 12,
"participantid": null,
"uom": "ppb",
"finishtime": 1008585047000,
"starttime": 1008581447000,
"observedproperty": "SO2",
"measuretime": 1008582047000,
"measurementid": 2,
"longitude": 5,
"identifier": "Test-1",
"latitude": 11
}
]

Essentially, this should create a CSV (in memory) with 2 rows, that looks like this:

1,1,1,N,ppb,1008585047000,1008581447000,NO2,1008581567000,1,3.1415,Test-1,10
1,1,12,N,ppb,1008585047000,1008581447000,SO2,1008582047000,2,5,Test-1,11

Currently, the output comes out like this, which is wrong:

[1  1   1    None    u'ppb'  1008585047000L  1008581447000L  u'NO2'  1008581567000L 1   3.1415   u'Test-1'   10]
[1  1   12   None    u'ppb'  1008585047000L  1008581447000L  u'SO2'  1008582047000L 2   5    u'Test-1'   11]

I believe the 'u' bit refers to Unicode, but I don't know how to change the encoding. Any help would be greatly appreciated!

Here is the Python code I have so far:

import json
import cStringIO

f = open('test.json')
data = json.load(f) 
f.close()

output = cStringIO.StringIO()
for item in data:
    output.write(str([item['observationid'], item['fkey_observation'],     item['value'], item['participantid'], item['uom'], item['finishtime'], item['starttime'], item['observedproperty'], item['measuretime'],item['measurementid'], item['longitude'], item['identifier'], item['latitude']]) + '\n')

contents = output.getvalue()
print contents`

EDIT

Hi guys, slight change of plan. Essentially, I have a String object, but it actually is structured like a JSON file:

"[{observationid=1, fkey_observation=1, value=1, participantid=null,   uom=ppb, finishtime=2001-12-17 10:30:47.0, starttime=2001-12-17 09:30:47.0, observedproperty=NO2, measuretime=2001-12-17 09:32:47.0, measurementid=1, longitude=3.1415, identifier=CITISENSE-Test-00000001, latitude=10}, {observationid=1, fkey_observation=1, value=12, participantid=null, uom=ppb, finishtime=2001-12-17 10:30:47.0, starttime=2001-12-17 09:30:47.0, observedproperty=SO2, measuretime=2001-12-17 09:40:47.0, measurementid=2, longitude=5, identifier=CITISENSE-Test-00000001, latitude=11}, {observationid=1, fkey_observation=1, value=7000, participantid=null, uom=ppb, finishtime=2001-12-17 10:30:47.0, starttime=2001-12-17 09:30:47.0, observedproperty=NO2, measuretime=2001-12-17 09:52:47.0, measurementid=3, longitude=6, identifier=CITISENSE-Test-00000001, latitude=9}, {observationid=2, fkey_observation=2, value=5, participantid=null, uom=ppb, finishtime=2001-12-18 10:30:47.0, starttime=2001-12-18 09:30:47.0, observedproperty=SO2, measuretime=2001-12-18 09:32:47.0, measurementid=4, longitude=7, identifier=CITISENSE-Test-00000001, latitude=8}, {observationid=2, fkey_observation=2, value=6, participantid=null, uom=ppb, finishtime=2001-12-18 10:30:47.0, starttime=2001-12-18 09:30:47.0, observedproperty=PM10, measuretime=2001-12-18 09:34:47.0, measurementid=5, longitude=8, identifier=CITISENSE-Test-00000001, latitude=10}, {observationid=3, fkey_observation=3, value=10000, participantid=null, uom=ppb, finishtime=2001-12-19 10:30:47.0, starttime=2001-12-19 09:30:47.0, observedproperty=SO2, measuretime=2001-12-19 09:38:47.0, measurementid=6, longitude=9,  identifier=CITISENSE-Test-00000001, latitude=11.2}]"

How do I go about converting this to CSV? I can't use the json module as it is not a JSON file.

Jono_B
  • 11
  • 4

3 Answers3

1

Here is my approach: use csv.DictWriter to handle converting from a dictionary to a row of CSV data:

import csv
import json
from cStringIO import StringIO

with open('test.json') as f:
    my_data = json.load(f)
    headers = [
        'observationid', 'fkey_observation', 'value',
        'participantid', 'uom', 'finishtime', 'starttime',
        'observedproperty', 'measuretime', 'measurementid',
        'longitude', 'identifier', 'latitude']

    buffer = StringIO()
    writer = csv.DictWriter(buffer, headers)

    for row in my_data:
        writer.writerow(row)
    print buffer.getvalue()
Ereli
  • 965
  • 20
  • 34
Hai Vu
  • 37,849
  • 11
  • 66
  • 93
0

You should probably consider using something like csvwriter. it will handle the escaping and delimiter setting for you.

See example for python3:

import csv 
with open('output.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',')
    for line in data:
        writer.writerow(line)

it can also be used with cStringIO.

Ereli
  • 965
  • 20
  • 34
  • 1
    This is working for me, thanks so much @Ereli, and everyone else! – Jono_B Aug 11 '15 at 23:08
  • @Jono_b If you found one of the answers satisfactory, consider [accepting](https://stackoverflow.com/help/accepted-answer) it. – Ereli Jul 15 '17 at 13:24
0

Here's a little snippet I wrote up, I think it should handle your scenario and give you a list of lists. Ereli is onto something with that module though, it might make your life easier. But in the meantime maybe this will help.

import json
myFile =  open('myJson.json','r+')
myData = json.load(myFile)
myFile.close()

myList = []
for x in range(0,len(myData)):
    myList.append([])
    for key in myData[x].keys():
        value = myData[x][key]
        if isinstance(value,(str,unicode)):
            value = value.encode('ascii','ignore')
        myList[x].append(value)

print myList
Tony Tyrrell
  • 131
  • 1
  • 7