
I am looping over a JSON file that contains customer information. The purpose of the loop is to reformat the information into a certain structure and create another JSON file containing the newly formatted information.

Depending on who is using the script, sometimes the json file which is looped over will be missing fields. I want the program to ignore the KeyError and instead fill it with an empty string and continue appending the other fields which are present.

This is how I have written the loop:

import json

# Read the raw customer records; the file is closed when the block exits.
with open('raw_data.json', 'r') as f:
    data = json.load(f)
customer_model = []

for row in data:
    try:
        customer_model.append({"RequestID": row['request_id'],
                               "Timestamp": n1 + "Z",
                               "ExternalID": row['external_id'],
                               "Fields": {
                                   "forename": row['forename'],
                                   "middle_name_1": row['middle_name_1'],
                                   "middle_name_2": row['middle_name_2'],
                                   "surname": row['surname'],
                                   "email": row['email'],
                                   "date_of_birth": row['date_of_birth'],
                                   "home_phone_number": row['home_phone_number'],
                                   "mobile_phone_number": row['mobile_phone_number'],
                                   "passport_number": str(row['passport_number']),
                                   "driving_licence": str(row['driving_licence']),
                               },
                               "Match": row['match']})
    except KeyError:
        ""
        continue

with open("customers.json", "w") as f:
    json.dump(customer_model, f, indent=4)

However, this produces an empty JSON file. When I remove the try/except block, I get a KeyError whenever a field is missing in raw_data.json.
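A stripped-down sketch of the same pattern reproduces the behaviour. The two rows and the `result` name are made up for illustration and are not taken from raw_data.json:

```python
# Hypothetical rows; the second one has no "surname" key.
rows = [
    {"forename": "Vennie", "surname": "Castro"},
    {"forename": "Takisha"},
]

result = []
for row in rows:
    try:
        # Same shape as the real loop: build the dict, then append it.
        result.append({"forename": row["forename"], "surname": row["surname"]})
    except KeyError:
        ""        # a bare string expression, evaluated and discarded
        continue

print(result)
```

Only the complete row ends up in `result`; the row with the missing key is dropped entirely, so if every row lacks at least one field the list stays empty.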

Is there something I am missing?

Edit: I want customers.json to look like this if raw_data.json is missing fields:

    [
        {
            "RequestID": "",
            "Timestamp": "2022-08-27T07:59:30.34Z",
            "ExternalID": "18e452a7-29e5-4ad3-baeb-f439e48f4d0c",
            "Fields": {
                "forename": "Vennie",
                "middle_name_1": "Takisha",
                "middle_name_2": "Ebonie",
                "surname": "Castro",
                "email": "bemar1973@yandex.com",
                "date_of_birth": "",
                "home_phone_number": "016977 0528",
                "mobile_phone_number": "056 5567 8799",
                "passport_number": "",
                "driving_licence": ""
            },
            "Match": false
        }
    ]
nocnoc
  • Please give an example of what you want `customer_model` to look like if you are handling two rows of `data` and neither has all the keys present which should result in empty strings. – quamrana Aug 27 '22 at 11:13
  • Use row.get(). And set default value if key is missing. See example here: https://stackoverflow.com/a/11041421/4720957 – user47 Aug 27 '22 at 11:14
  • @quamrana I have updated the original post to include what I would like the model to look like – nocnoc Aug 27 '22 at 11:22
  • Sounds like you need to use `row.get(...)` for each of the accesses to `row`. – quamrana Aug 27 '22 at 11:23

1 Answer

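As some of the comments pointed out, the solution is to use row.get() instead of a try/except block. The try/except wrapped the entire append call, so a single missing key raised KeyError before anything was appended and the whole row was skipped; when every row lacked at least one field, the output was empty. With row.get() you can supply a default value that is returned when a key is missing: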

for row in data:
    customer_model.append({"RequestID": row.get("request_id", ""),
                           "Timestamp": n1 + "Z",
                           "ExternalID": row.get('external_id', ""),
                           "Fields": {
                               "forename": row.get('forename', ""),
                               "middle_name_1": row.get('middle_name_1', ""),
                               "middle_name_2": row.get('middle_name_2', ""),
                               "surname": row.get('surname', ""),
                               "email": row.get('email', ""),
                               "date_of_birth": row.get('date_of_birth', ""),
                               "home_phone_number": row.get('home_phone_number', ""),
                               "mobile_phone_number": row.get('mobile_phone_number', ""),
                               "passport_number": str(row.get('passport_number', "")),
                               "driving_licence": str(row.get('driving_licence', "")),
                           },
                           "Match": row.get('match', False)})

This is the working version of the code.
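For anyone who wants to try the approach without the original files, here is a self-contained sketch. The `build_customer` helper, the two field tuples, the sample rows and the fixed timestamp string are assumptions made up for illustration; only the key names and defaults come from the code above.

```python
import json

# Key names as used in the loop above; grouping them into tuples is an
# illustrative choice, not part of the original code.
PLAIN_FIELDS = ("forename", "middle_name_1", "middle_name_2", "surname",
                "email", "date_of_birth", "home_phone_number",
                "mobile_phone_number")
STRINGIFIED_FIELDS = ("passport_number", "driving_licence")


def build_customer(row, timestamp):
    """Reformat one raw row, using "" (or False for match) for missing keys."""
    fields = {name: row.get(name, "") for name in PLAIN_FIELDS}
    fields.update({name: str(row.get(name, "")) for name in STRINGIFIED_FIELDS})
    return {
        "RequestID": row.get("request_id", ""),
        "Timestamp": timestamp + "Z",
        "ExternalID": row.get("external_id", ""),
        "Fields": fields,
        "Match": row.get("match", False),
    }


# Hypothetical input: one partially filled row and one completely empty row.
sample_rows = [
    {"external_id": "18e452a7-29e5-4ad3-baeb-f439e48f4d0c",
     "forename": "Vennie", "surname": "Castro",
     "passport_number": 12345, "match": False},
    {},
]

# The timestamp is a stand-in for n1, which is defined elsewhere in the script.
customer_model = [build_customer(row, "2022-08-27T07:59:30.34") for row in sample_rows]
print(json.dumps(customer_model, indent=4))
```

Both rows appear in the output, including the empty one, because a missing key now yields a default instead of raising. Note that `dict.get` only falls back to the default when the key is absent; a key that is present with a null value would still come through as `None` (or the string "None" after `str()`).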

nocnoc