I'm trying to iterate through a panda series of dictionaries (obtained through an API) and I want to split them into desperate series in a dataframe

Question

I'm banging my head against the wall. I'm working with the GeoDeepDive API, and trying to "tidy" the data. My code:

import requests
import pandas as pd

response = requests.get("https://geodeepdive.org/api/articles?pubname_like=Geochronology")

data = response.json()
df = pd.json_normalize(data)
df = df['success.data']
dic = df[0]

df1 = pd.DataFrame.from_dict(dic)
df1.to_csv("output_file.csv")

The result is almost perfect, but nested json's are causing an issue where I get data like this:

CSV output with top two rows edited manually

CSV top 2 rows are the output I want to have, and the bottom two are what I get. I need to somehow "dig" into the dictionaries.

I'm trying to iterate through those specific dataframe columns and either pick just the url, or turn the names, into a list of names. I keep getting errors with whatever I try though, and I'm thinking there may even be a better way to do this. Any thoughts?

Could you please provide an MRE ([How to create a Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example), [minimal reproducible-(pandas) examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples)) and be a bit more specific regarding your expected output. — Timus, Dec 14 '20 at 09:37

score 0 · Accepted Answer · answered Dec 14 '20 at 21:37

0

Could you try if this

...
df1 = pd.DataFrame.from_dict(dic)
df1.link = df1.link.apply(lambda l: l[0]['url'])
df1.author = df1.author.apply(lambda l: ';'.join(d['name'] for d in l))
df1.to_csv("output_file.csv")

fits your needs?

answered Dec 14 '20 at 21:37

Timus

10,974
5
14
28

1

Absolutely! Thanks very much! The code... irritatingly makes sense too. I get lost in python indexing. – Drew Heasman Dec 14 '20 at 22:01

I'm trying to iterate through a panda series of dictionaries (obtained through an API) and I want to split them into desperate series in a dataframe

1 Answers1