I read JSON data into a DataFrame, and the first column holds data in the following format:

0     {'name': 'Mark Vande Hei', 'craft': 'ISS'}      10  success

1     {'name': 'Oleg Novitskiy', 'craft': 'ISS'}      10  success

How can I create a new DataFrame with two columns, Name and craft, from the data above?

import pandas as pd

url_crew = 'http://api.open-notify.org/astros.json'
crew = pd.read_json(url_crew)
print(crew)
kennyvh
Hary2
  • Your input is JSON, and you want to convert it to a dataframe (properly, not just as one big string). There's no such thing as a 'name' element of a dataframe. You're just referring to the structure of the JSON inside your dataframe's column. – smci Jul 06 '21 at 00:13

2 Answers

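>>> import pandas as pd
>>> url_crew = 'http://api.open-notify.org/astros.json'
>>> crew = pd.read_json(url_crew)
>>> # Expand each dict in 'people' into its own columns alongside the remaining columns
>>> df = pd.concat([crew.drop(['people'], axis=1), crew['people'].apply(pd.Series)], axis=1)
>>> # Keep only the two columns of interest
>>> df = df[['name', 'craft']]
>>> df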
              name     craft
0   Mark Vande Hei       ISS
1   Oleg Novitskiy       ISS
2     Pyotr Dubrov       ISS
3   Thomas Pesquet       ISS
4   Megan McArthur       ISS
5  Shane Kimbrough       ISS
6  Akihiko Hoshide       ISS
7     Nie Haisheng  Tiangong
8       Liu Boming  Tiangong
9      Tang Hongbo  Tiangong
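
To try this approach without calling the API, the same expansion can be sketched on a small hand-built frame. The sample rows are copied from the output above; the frame layout (a people column of dicts next to number and message columns) is an assumption about what pd.read_json returns for this URL, and the column names number and message are hypothetical.

```python
import pandas as pd

# Hand-built stand-in for the frame returned by pd.read_json(url_crew);
# the 'number' and 'message' column names are assumptions for illustration.
crew = pd.DataFrame({
    "people": [
        {"name": "Mark Vande Hei", "craft": "ISS"},
        {"name": "Oleg Novitskiy", "craft": "ISS"},
    ],
    "number": [10, 10],
    "message": ["success", "success"],
})

# Turn each dict in 'people' into a row of columns, then keep name and craft
expanded = crew["people"].apply(pd.Series)
df = expanded[["name", "craft"]]
print(df)
```

If only name and craft are needed, the concat with the remaining columns can be skipped, as in this sketch.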
Yiannis

Pandas has a handy utility function for this: pd.json_normalize.

It accepts a list of dictionaries or a series of dictionaries.

import pandas as pd

url_crew = 'http://api.open-notify.org/astros.json'
crew = pd.read_json(url_crew)

# Flatten the dicts in the 'people' column into name and craft columns
df = pd.json_normalize(crew["people"])
print(df)

Output

              name     craft
0   Mark Vande Hei       ISS
1   Oleg Novitskiy       ISS
2     Pyotr Dubrov       ISS
3   Thomas Pesquet       ISS
4   Megan McArthur       ISS
5  Shane Kimbrough       ISS
6  Akihiko Hoshide       ISS
7     Nie Haisheng  Tiangong
8       Liu Boming  Tiangong
9      Tang Hongbo  Tiangong
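
As a quick offline check, pd.json_normalize can also be run on a plain list of dicts. This is only a sketch: the two records are copied from the output above, and the rename to Name is an optional extra to match the column name used in the question.

```python
import pandas as pd

# Two sample records copied from the output above
people = [
    {"name": "Mark Vande Hei", "craft": "ISS"},
    {"name": "Nie Haisheng", "craft": "Tiangong"},
]

# One row per dict, one column per key
flat = pd.json_normalize(people)

# Optional: capitalise the column name as written in the question
result = flat.rename(columns={"name": "Name"})
print(result)
```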
kennyvh