Convert Json to CSV using Python

Question

Below, is the json structure I am pulling from my online weather station. I am also including a json_to_csv python script that is supposed to convert json data to csv output, but only returns a "Key" error. I want to pull data from "current_observation": only.

{
  "response": {
  "features": {
  "conditions": 1
  }
    }
  , "current_observation": {
        "display_location": {
        "latitude":"40.466442",
        "longitude":"-85.362709",
        "elevation":"280.4"
        },
        "observation_time_rfc822":"Fri, 26 Jan 2018 09:40:16 -0500",
        "local_time_rfc822":"Sun, 28 Jan 2018 11:22:47 -0500",
        "local_epoch":"1517156567",
        "local_tz_short":"EST",
        "weather":"Clear",
        "temperature_string":"44.6 F (7.0 C)",
    }
}



import csv, json, sys
inputFile = open("pywu.cache.json", 'r') #open json file
outputFile = open("CurrentObs.csv", 'w') #load csv file
data = json.load(inputFile) #load json content 
inputFile.close() #close the input file
output = csv.writer(outputFile) #create a csv.write
output.writerow(data[0].keys())
for row in data:
    output = csv.writer(outputFile) #create a csv.write 
    output.writerow(data[0].keys())
for row in data:
    output.writerow(row.values()) #values row

What's the best method to retrieve the temperature string and convert to .csv format? Thank you!

Your code is not properly indented; please fix it so we can run it. — Adam Barnes, Feb 01 '18 at 17:38
So the output you want is a CSV file with rows of just a single column, each showing only temperature_string? — NFB, Feb 01 '18 at 17:40
I would like to have local_time, weather and temperature_string, if possible. — user5834454, Feb 01 '18 at 17:52

habet · Accepted Answer · 2018-02-06T17:15:45.730

2

import pandas as pd
df = pd.read_json("pywu.cache.json")
df = df.loc[["local_time_rfc822", "weather", "temperature_string"],"current_observation"].T
df.to_csv("pywu.cache.csv")

maybe pandas can be of help for you. the .read_json() function creates a nice dataframe, from which you can easily choose the desired rows and columns. and it can save as csv as well.

to add latitude and longitude to the csv-line, you can do this:

df = pd.read_json("pywu.cache.csv")
df = df.loc[["local_time_rfc822", "weather", "temperature_string", "display_location"],"current_observation"].T
df = df.append(pd.Series([df["display_location"]["latitude"], df["display_location"]["longitude"]], index=["latitude", "longitude"]))
df = df.drop("display_location")
df.to_csv("pywu.cache.csv")

to print the location in numeric values, you can do this:

df = pd.to_numeric(df, errors="ignore")
print(df['latitude'], df['longitude'])

edited Feb 06 '18 at 17:15

answered Feb 03 '18 at 10:08

habet

131
10

How would you go about parsing latitude/ longitude using pandas and writing a single row output in CSV format? – user5834454 Feb 05 '18 at 15:17
not sure, what you mean with parsing exactly. adding latitutude and longitude can be done like this for example (i will add it to the answer) – habet Feb 06 '18 at 16:28
Thanks for amending the code. By parsing, I mean extracting both latitude and longitude from the string "display_location". Not sure if it's possible to just display the numerical value for each? – user5834454 Feb 06 '18 at 16:51
sure, you can convert the resulting pd.Series() object with pd.to_numeric to a numeric values. with errors="ignore" you skip the non-numeric values. Then you can index the object just like a dict. (see code in answer). and sorry for all the editing. – habet Feb 06 '18 at 17:20
NP with all the edits - it helps better understand were code is placed, within pandas. With your most recent additions, do I add pd.to_numeric directly after the pd.Series and before df.drop? – user5834454 Feb 06 '18 at 18:02
yes, but it should not matter if you do the pd.to_numeric before or after the drop – habet Feb 06 '18 at 18:55
Thanks! The only reason I ask, is I get a "KeyError" for latitude when running the script. – user5834454 Feb 06 '18 at 20:16
i made a typo in "index" keyword and wrote "latitute" instead of "latitude". i have corrected that, but did you maybe copy the typo? – habet Feb 07 '18 at 15:05
Perfect - didn't catch that the first time. Lastly, how would one go about writing data to a row instead of two columns in a csv? Thank you again for all the help - this has been very useful! – user5834454 Feb 07 '18 at 17:46
you are welcome ! if you want all values in one row and without any description you can change the line with the .to_csv() to: pd.DataFrame(df).T.to_csv("pywu.cache.csv", index=False, header=False, sep=",") – habet Feb 07 '18 at 22:16
How would I go about retrieving additional data? I tried adding some fields to the above script and got a syntax error. Thank you! – user5834454 Feb 16 '18 at 19:06
can you be more precise about what you were trying to achieve and what syntax you wrote for that? adding additional data from the "current_observation" except elevation (for example local_tz_short) should be possible by just expanding the list in the second line (df = df.loc[[THIS_LIST], "current_observation"].T) – habet Feb 19 '18 at 12:47
It was my fault. I added a bunch of fields to the list and forgot a pair of quotes. Thx again! – user5834454 Feb 19 '18 at 18:00
Hi @habet! Is it possible to add additional JSON files to this script, returning the same fields? Instead of a single row, each row would reflect the output of a different JSON (Row 1 - JSON 1, Row 2 - JSON 2, etc.). I will gladly add this as a new question, but didn't know how to reach-out to you, specifically for help. Thank you! – user5834454 Feb 21 '18 at 18:23
sure, just iterate over the files and do all the lines above and including the one with the df.append once for each JSON. – habet Feb 27 '18 at 11:42
Thanks for the above, it was helpful! However, I got an error when not providing an index:
`ValueError: If using all scalar values, you must pass an index`.
Once I added one this worked.
import pandas as pd df = pd.read_json("exampleJSON.json",orient='index') df = df.loc[["field1", "field2", "field3"]].T df.to_csv("exampleCSV.csv") – yrg May 26 '22 at 15:58

score 1 · Answer 2 · answered Feb 01 '18 at 18:06

This will find all keys (e.g. "temperature_string") specified inside of the json blob and then write them to a csv file. You can modify this code to get multiple keys.

import csv, json, sys

def find_deep_value(d, key):
# Find a the value of keys hidden within a dict[dict[...]]
# Modified from https://stackoverflow.com/questions/9807634/find-all-occurrences-of-a-key-in-nested-python-dictionaries-and-lists
# @param d dictionary to search through
# @param key to find

    if key in d:
        yield d[key]
    for k in d.keys():
        if isinstance(d[k], dict):
            for j in find_deep_value(d[k], key):
                yield j

inputFile = open("pywu.cache.json", 'r')  # open json file
outputFile = open("mypws.csv", 'w')  # load csv file
data = json.load(inputFile)  # load json content
inputFile.close()  # close the input file
output = csv.writer(outputFile)  # create a csv.write

# Gives you a list of temperature_strings from within the json
temps = list(find_deep_value(data, "temperature_string"))
output.writerow(temps)
outputFile.close()

Convert Json to CSV using Python

2 Answers2

Linked