I have a pandas DataFrame with three columns (INSTNR, Enhedsadresser, API_response), where the third column (API_response) contains JSON objects. I would like to flatten each JSON object and store the extracted values in separate columns of the same DataFrame. I am particularly interested in kategori, resultater -> adresse -> id, and resultater -> adresse -> adgangsadresseid.
I have tried:
data = json_normalize(data=df['API_response'], record_path='resultater',
                      meta=['kategori'], errors='ignore')
but it raises TypeError: string indices must be integers
Whereas the plain call
data = json_normalize(data=df['API_response'])
just gave me a column with a list of indices...
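One possible cause of this TypeError is that API_response holds JSON text (str) rather than parsed dicts, so json_normalize ends up indexing a string with 'resultater'. The following is a minimal sketch under that assumption, not a confirmed diagnosis of the data; the one-row frame and the shortened id value "x" are made up for illustration.

```python
import json

import pandas as pd

# Assumption: each cell in API_response is a JSON *string*, not a dict.
raw = json.dumps({"kategori": "A", "resultater": [{"adresse": {"id": "x"}}]})
df = pd.DataFrame({"API_response": [raw]})

try:
    pd.json_normalize(
        data=df["API_response"],
        record_path="resultater",
        meta=["kategori"],
        errors="ignore",
    )
    error = None
except TypeError as exc:
    # Indexing a str with a str key raises TypeError inside json_normalize.
    error = exc

print(type(error).__name__, error)
```

If this matches the real data, type(df['API_response'].iloc[0]) would be str rather than dict.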
How can I extract the needed information?
Example of a JSON object:
{
  "kategori": "A",
  "resultater": [
    {
      "adresse": {
        "id": "0a3f50bc-f815-32b8-e044-0003ba298018",
        "vejnavn": "Staldgaardsgade",
        "adresseringsvejnavn": "Staldgaardsgade",
        "husnr": "39A",
        "supplerendebynavn": null,
        "postnr": "7100",
        "postnrnavn": "Vejle",
        "status": 1,
        "virkningstart": "2009-11-24T02:15:25.000Z",
        "virkningslut": null,
        "adgangsadresseid": "0a3f5090-edef-32b8-e044-0003ba298018",
        "etage": "st",
        "dør": "th",
        "href": "https://api.dataforsyningen.dk/adresser/0a3f50bc-f815-32b8-e044-0003ba298018"
      },
      "aktueladresse": {
        "id": "0a3f50bc-f815-32b8-e044-0003ba298018",
        "vejnavn": "Staldgaardsgade",
        "adresseringsvejnavn": "Staldgaardsgade",
        "husnr": "39A",
        "supplerendebynavn": null,
        "postnr": "7100",
        "postnrnavn": "Vejle",
        "status": 1,
        "virkningstart": "2009-11-24T02:15:25.000Z",
        "virkningslut": null,
        "adgangsadresseid": "0a3f5090-edef-32b8-e044-0003ba298018",
        "etage": "st",
        "dør": "th",
        "href": "https://api.dataforsyningen.dk/adresser/0a3f50bc-f815-32b8-e044-0003ba298018"
      },
      "vaskeresultat": {
        "variant": {
          "vejnavn": "Staldgaardsgade",
          "husnr": "39A",
          "etage": "st",
          "dør": "th",
          "supplerendebynavn": null,
          "postnr": "7100",
          "postnrnavn": "Vejle"
        },
        "afstand": 0,
        "forskelle": {
          "vejnavn": 0,
          "husnr": 0,
          "postnr": 0,
          "postnrnavn": 0,
          "etage": 0,
          "dør": 0
        },
        "parsetadresse": {
          "vejnavn": "Staldgaardsgade",
          "husnr": "39A",
          "etage": "st",
          "dør": "th",
          "postnr": "7100",
          "postnrnavn": "Vejle"
        },
        "ukendtetokens": [],
        "anvendtstormodtagerpostnummer": null
      }
    }
  ]
}
Link to API response containing this JSON object: https://api.dataforsyningen.dk/datavask/adresser?betegnelse=Staldgaardsgade%2039A%20st%20th,%207100%20Vejle
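To make the target concrete, here is a minimal sketch of the kind of result being asked for, built from a trimmed copy of the object above. It rests on several assumptions: API_response stores JSON text that has to be parsed with json.loads first, every row has exactly one entry in resultater (otherwise the flattened rows would no longer line up with the original index), and the INSTNR value is a placeholder rather than real data.

```python
import json

import pandas as pd

# Trimmed copy of the example object; only the fields of interest are kept.
sample = {
    "kategori": "A",
    "resultater": [
        {
            "adresse": {
                "id": "0a3f50bc-f815-32b8-e044-0003ba298018",
                "adgangsadresseid": "0a3f5090-edef-32b8-e044-0003ba298018",
            }
        }
    ],
}

# Assumption: the column holds JSON strings. INSTNR is a placeholder value.
df = pd.DataFrame(
    {
        "INSTNR": [1],
        "Enhedsadresser": ["Staldgaardsgade 39A st th, 7100 Vejle"],
        "API_response": [json.dumps(sample)],
    }
)

# Parse each JSON string into a dict before normalizing.
parsed = df["API_response"].apply(json.loads).tolist()

# One output row per entry in "resultater"; nested keys become dotted
# column names such as "adresse.id".
flat = pd.json_normalize(parsed, record_path="resultater", meta=["kategori"])

wanted = ["kategori", "adresse.id", "adresse.adgangsadresseid"]

# Joining on the index is only valid with exactly one result per row.
result = df.join(flat[wanted])
print(result[["INSTNR"] + wanted])
```

With more than one entry in resultater per row, the row position would need to be carried along explicitly before joining back.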
EDIT 1
I created a GitHub repo with the data and the Python script: https://github.com/mantasbacys/TREFOR