
I have a JSON file that I loaded into Python as a list like this:

import pandas as pd
import json

with open('data.json') as json_file:      
    json_file = json_file.readlines()
    json_file = list(map(json.loads, json_file))
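For comparison, here is a minimal sketch of the same loading step that keeps the file handle and the parsed records under separate names. The helper name `load_json_lines` is hypothetical, and it assumes the file holds one JSON object per line, as the `readlines` call above implies.

```python
import json

def load_json_lines(path):
    """Parse a file holding one JSON object per line into a list of dicts.

    Hypothetical helper: assumes every non-blank line is a complete JSON object.
    """
    records = []
    with open(path) as handle:
        for raw_line in handle:
            raw_line = raw_line.strip()
            if raw_line:  # skip blank lines, which json.loads would reject
                records.append(json.loads(raw_line))
    return records
```

It would be called as `json_file = load_json_lines('data.json')`, giving the same list of dicts as the snippet above.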

I want to return all rows whose address doesn't contain a number (returning the entire row). The code I wrote is this, but I keep getting an error. (I used try and except because some lines don't have addresses and I didn't want the code to skip them.)

for i in range (0, len(json_file)):
    try: 
        for line in json_file:
            add = json_file[i]['payload']['address']
            addresses = add.split(" ")
        try:
            address = int(addresses[0])
            if type(address) =! int: 
                print(line)

        except: 
            continue

    except:
        continue
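One possible way to get the intended behaviour, as a rough sketch rather than a definitive fix: loop over the records once and keep those whose first address token is not made up of digits. The function name `rows_without_leading_number` and the `sample` data are made up for illustration. The sketch assumes that "contains a number" means "starts with a numeric token", and that records with a missing or empty address should be left out of the result.

```python
def rows_without_leading_number(records):
    """Return the records whose address does not start with a numeric token."""
    matches = []
    for record in records:
        # .get avoids a KeyError when 'payload' or 'address' is missing
        address = record.get('payload', {}).get('address')
        if not address:
            continue  # assumption: rows without an address are left out
        tokens = address.split()
        if not tokens:
            continue  # address was only whitespace
        if not tokens[0].isdigit():
            matches.append(record)
    return matches

# Illustrative records, not taken from the real file
sample = [
    {'payload': {'address': 'Winston Churchill Av'}},
    {'payload': {'address': '12 High Street'}},
    {'payload': {'name': 'No address here'}},
]

for row in rows_without_leading_number(sample):
    print(row)
```

Note that `str.isdigit` treats a token such as `12a` as non-numeric, so an address like `12a High Street` would be returned by this sketch; whether that is wanted depends on the data.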

For reference, this is what one JSON record looks like in the file I'm working with:

{
"payload": {
    "existence_full": 1,
    "geo_virtual": [
        "50.794876|-1.090893|20|within_50m|4"
    ],
    "latitude": "50.794876",
    "locality": "Portsmouth",
    "_records_touched": {
        "crawl": 16,
        "lssi": 0,
        "polygon_centroid": 0,
        "geocoder": 0,
        "user_submission": 0,
        "tdc": 0,
        "gov": 0
    },
    "email": "info.centre@port.ac.uk",
    "existence_ml": 0.9794948816203205,
    "address": "Winston Churchill Av",
    "longitude": "-1.090893",
    "domain_aggregate": "",
    "name": "University of Portsmouth",
    "search_tags": [
        "The University of Portsmouth",
        "The University of Portsmouth Students Union",
        "University House"
    ],
    "admin_region": "England",
    "existence": 1,
    "post_town": "Portsmouth",
    "category_labels": [
        [
            "Community and Government",
            "Education",
            "Colleges and Universities"
        ]
    ],
    "region": "Hampshire",
    "review_count": "1",
    "geocode_level": "within_50m",
    "tel": "023 9284 8484",
    "placerank": 42,
    "placerank_ml": 69.2774043602657,
    "address_extended": "Unit 4",
    "category_ids_text_search": "",
    "fax": "023 9284 3122",
    "website": "http: //www.port.ac.uk",
    "status": "1",
    "neighborhood": [
        "The Waterfront"
    ],
    "geocode_confidence": "20",
    "postcode": "PO1 2UP",
    "category_ids": [
        29
    ],
    "country": "gb",
    "_geocode_quality": "4"
},
"uuid": "297fa2bf-7915-4252-9a55-96a0d44e358e"
}
rjv
  • `if type(address) =! int: ` is a syntax error – Jean-François Fabre Jun 10 '18 at 18:07
  • You're casting to an int with `address = int(addresses[0])`, then checking if it is an int. This is either going to validly cast to an int or throw an error, right? So the block under `if type(address) =! int:` never executes, considering you just swallow the error and `continue` if the typecast fails. Also, as above, `=!` is a syntax error. – Lane Terry Jun 10 '18 at 18:07
  • never wrap with bare `try/except` statements. – Jean-François Fabre Jun 10 '18 at 18:08
  • Do you have any suggestions for how to check that it's not numerical and then print that line? I thought using != would check that. – gaucho_1789 Jun 10 '18 at 18:10
  • See the post linked by Jean after marking yours as duplicate. That would check, but like I mentioned you cast everything to an int and wrap that in a try/except block before checking with `!=`. Which means everything evaluated in that conditional is already guaranteed to be an int. – Lane Terry Jun 10 '18 at 18:11
  • My issue is that only the first 2 numbers of the address would contain a number, not the whole line, so I need to check if the first element is numerical. I wrote new code: – gaucho_1789 Jun 10 '18 at 18:14
  • for i in range (0, len(json_file)): try: for line in json_file: add = json_file[i]['payload']['address'] addresses = add.split(" ") address = addresses[0] try: for address in addresses: if type(address) == int: print(line) except: continue except: continue – gaucho_1789 Jun 10 '18 at 18:15
  • A. That is unreadable. B. If you just try to typecast a string to int it will throw a `ValueError`. If you aren't thrown a value error, that first char is an int. – Lane Terry Jun 10 '18 at 18:19
  • My issue is that all of the elements of the 'address' component are strings, even if they're numeric; that's why I used the split option to check the first char. – gaucho_1789 Jun 10 '18 at 22:14
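The point made in the comments can be sketched as follows: a token produced by `split` is always a `str`, so a `type(...) == int` check never matches, while `int(...)` raises `ValueError` for non-numeric text. The helper name `starts_with_integer` is hypothetical, and the example addresses are illustrative.

```python
def starts_with_integer(text):
    """Return True if the first whitespace-separated token parses as an int."""
    tokens = text.split()
    if not tokens:
        return False
    try:
        int(tokens[0])
    except ValueError:  # catch only the expected failure, not a bare except
        return False
    return True

first_token = '12 High Street'.split()[0]
print(type(first_token) == int)                     # False: the token is still a str
print(starts_with_integer('12 High Street'))        # True
print(starts_with_integer('Winston Churchill Av'))  # False
```

Catching `ValueError` specifically, instead of using a bare `except`, means that unrelated mistakes such as a misspelled key still surface as errors.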

0 Answers