Python JSON to dataframe

Question

i have json in format

   {
   "projects":[
      {
         "author":{
            "id":163,
            "name":"MyApp",
            "easy_external_id":null
         },
         "sum_time_entries":0,
         "sum_estimated_hours":29,
         "currency":"EUR",
         "custom_fields":[
            {
               "id":42,
               "name":"System",
               "internal_name":null,
               "field_format":"string",
               "value":null
            },
            {
               "id":40,
               "name":"Short describe",
               "internal_name":null,
               "field_format":"string",
               "value":""
            }
         ]
      }
   ]"total_count":1772,
   "offset":0,
   "limit":1
}

And I don't know how to convert this Json "completely" to a dataframe. Respectively, I just want what's in projects. But when I do this:

df = pd.DataFrame(data['projects'])

Although I only get the dataframe from projects, in some columns (for example: author or custom_fields) the format will still remain undecomposed and I would like to decompose it in these columns as well. can anyone advise?

I expect:

author.id	author.name	author.easy_external_id	sum_time_entries	currency	custom_fields.id	custom_fields.name	etc..
163	MyApp	null	0	EUR	42	System	...

Can you make an example of how you expect the df to be? – roschach Nov 09 '21 at 09:35 — roschach, Nov 09 '21 at 09:35

score 0 · Answer 1 · answered Nov 09 '21 at 09:35

0

Try:

df = pd.json_normalize(data['projects'])

See documentation here.

answered Nov 09 '21 at 09:35

Bashton

339
1
11

When I do: df = pd.json_normalize(data['projects']) the author will decompose right(author.name, author.id etc..) for me, but custom_fields will remain undecomposed – Cesc Nov 09 '21 at 09:57
The answer is missing because there's an embedded field wich is custom_fields. I've added updates to the code and I believe this will work because you can have N field in the custon_fields. ` df = pd.json_normalize(a['projects']) df_1 = df['custom_fields'].apply(pd.Series) counter = 0 for col in df_1.columns: res = pd.json_normalize(df_1[col]) res.columns = ['custom_fields_{}'.format(col_name+str(counter)) for col_name in res.columns] counter= counter +1 df = pd.concat([df,res],axis=1) df = df.drop(["custom_fields"],axis=1) df` @Cesc – alphaBetaGamma Nov 09 '21 at 10:06
@alphaBetaGamma Would it please show more legible, or through some link? Thank you. – Cesc Nov 09 '21 at 10:14
1

`df = pd.json_normalize(a['projects']) #newligne df_1 = df['custom_fields'].apply(pd.Series) #newligne counter = 0 for col in df_1.columns: #newligne res = pd.json_normalize(df_1[col]) #newligne res.columns = ['custom_fields_{}'.format(col_name+str(counter)) for col_name in res.columns] #newligne counter = counter + 1 #newligne df = pd.concat([df,res],axis=1) #newligne df = df.drop(["custom_fields"],axis=1) /n df ` – alphaBetaGamma Nov 09 '21 at 10:24
1

Apologies but the comments doesn't support newlines so I added comment to explain that. `df = pd.json_normalize(a['projects']) #newline df_1 = df['custom_fields'].apply(pd.Series) #newline counter = 0 #newline for col in df_1.columns: #newline /// inside loop ///// res = pd.json_normalize(df_1[col]) #newline res.columns = ['custom_fields_{}'.format(col_name+str(counter)) for col_name in res.columns] #newline counter = counter + 1 #newligne df = pd.concat([df,res],axis=1) #newline ///////endloopp//// df = df.drop(["custom_fields"],axis=1) #newline df ` @Cesc – alphaBetaGamma Nov 09 '21 at 10:30

score 0 · Answer 2 · answered Nov 09 '21 at 09:41

I tried here and it works... I think the problem is in your JSON file. Try doing:

data = {'projects': [{'author': {'id': 163,
    'name': 'MyApp',
    'easy_external_id': None},
   'sum_time_entries': 0,
   'sum_estimated_hours': 29,
   'currency': 'EUR',
   'custom_fields': [{'id': 42,
     'name': 'System',
     'internal_name': None,
     'field_format': 'string',
     'value': None},
    {'id': 40,
     'name': 'Short describe',
     'internal_name': None,
     'field_format': 'string',
     'value': ''}]}],
 'total_count': 1772,
 'offset': 0,
 'limit': 1}

Not with me. When I do: `df = pd.json_normalize(data['projects'])` the `author` will decompose right(author.name, author.id etc..) for me, but `custom_fields` will remain undecomposed. — Cesc, Nov 09 '21 at 09:56

Python JSON to dataframe

2 Answers2