
I'm building my first web scraper, and the following code returns the information I need:

import requests
r = requests.get('https://greatbritishpublictoiletmap.rca.ac.uk/loos/54c234f02ec4abe957b84f37?format=json')
r.json()["properties"]

The output looks like this:

{'orig': {'amenity': 'toilets', 'source': 'survey', 'id': 'node/975749026', 'location': 'Victoria Street', 'postcode': 'DE1 1EQ', 'open': '07.00 - 19.00', 'closed': '19.00 - 07.00', 'male': '1', 'female': '1', 'unisex': '0', 'disabled': '1', 'radar': 'Yes', 'baby change': '0', 'cost': '£0.20', 'date': '15/04/2014', 'data collected from': 'FOI', 'geocoded': True, 'geocoding_method': 'postcode'}, 'geocoding_method': 'postcode', 'geocoded': True, 'fee': '£0.20', 'babyChange': 'false', 'radar': 'true', 'type': 'female and male', 'opening': '07:00-19:00', 'postcode': 'DE1 1EQ', 'name': 'Victoria Street', 'streetAddress': 'Victoria Street', 'accessibleType': '', 'notes': '', 'area': [{'type': 'Unitary Authority', 'name': 'Derby City Council', '_id': '57f268ed87986b0010177619'}], 'access': 'public', 'active': True}

I simply want to dump this information into a CSV but I'm struggling to adjust my code. How do I do this?

roganjosh
Dan
  • You can't adjust code that doesn't exist. How are you trying to write to a CSV currently? Can you please edit your question to show what you're currently using? – roganjosh May 16 '17 at 19:14
  • Possible duplicate of [How can I convert JSON to CSV?](http://stackoverflow.com/questions/1871524/how-can-i-convert-json-to-csv) – david.barkhuizen May 16 '17 at 19:23
  • The truth is that I'm learning as I go. I tried writing to CSV but couldn't get it to work, so I didn't include the half-baked code; I was delighted to get this far on my first attempt. I'm just after a pointer or two, nothing major. – Dan May 16 '17 at 20:04

1 Answer


You need to decide which data you want and how it should be formatted, because the JSON is nested. The following approach, for example, simply writes every top-level entry whose value is a string or a boolean, so the nested `orig` dictionary and `area` list are skipped:

import requests
import csv

# Fetch the record and keep only its "properties" dictionary
r = requests.get('https://greatbritishpublictoiletmap.rca.ac.uk/loos/54c234f02ec4abe957b84f37?format=json')
properties = r.json()["properties"]

# newline='' lets the csv module control line endings itself
with open('output.csv', 'w', newline='', encoding='utf-8') as f_output:
    csv_output = csv.writer(f_output)
    header = []

    # Keep only keys whose values are plain strings or booleans
    for k, v in properties.items():
        if isinstance(v, (str, bool)):
            header.append(k)

    # One header row, then one data row in the same column order
    csv_output.writerow(header)
    csv_output.writerow([properties[k] for k in header])

This would give you a CSV file as follows (the column order is not guaranteed on Python versions before 3.7, so yours may differ):

opening,name,radar,active,accessibleType,type,postcode,geocoding_method,babyChange,notes,fee,geocoded,access,streetAddress
07:00-19:00,Victoria Street,true,True,,female and male,DE1 1EQ,postcode,false,,£0.20,True,public,Victoria Street

Tested using Python 3.5.2
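
If you also want the nested values, one option is to flatten the structure into dotted column names first. The following is only a rough sketch of that idea, not part of the tested code above: the `flatten` helper, the `sample` dictionary (a cut-down copy of the output in the question) and the `flattened.csv` filename are all made up for illustration, and you would pass `r.json()["properties"]` in place of `sample`.

```python
import csv


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into one dict with dotted keys.

    Hypothetical helper: list items are keyed by their index, so
    {'area': [{'name': 'x'}]} becomes {'area.0.name': 'x'}.
    """
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            name = prefix + "." + str(key) if prefix else str(key)
            flat.update(flatten(child, name))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            name = prefix + "." + str(index) if prefix else str(index)
            flat.update(flatten(child, name))
    else:
        # Plain value (string, bool, number, None): store it as-is
        flat[prefix] = value
    return flat


# Cut-down stand-in for r.json()["properties"] (assumed sample data)
sample = {
    "name": "Victoria Street",
    "active": True,
    "orig": {"postcode": "DE1 1EQ", "male": "1"},
    "area": [{"type": "Unitary Authority", "name": "Derby City Council"}],
}

row = flatten(sample)

with open("flattened.csv", "w", newline="", encoding="utf-8") as f_output:
    # DictWriter maps each flattened key to its own column
    writer = csv.DictWriter(f_output, fieldnames=list(row))
    writer.writeheader()
    writer.writerow(row)
```

Empty dictionaries and lists produce no columns with this approach, so decide whether that suits your data before relying on it.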

Martin Evans