Given a JSON string of records where the "schema" for each record is not consistent (i.e. not every record has the full set of "columns"):
s = """[{"a": 3, "b":[]}, {"a": 4, "b": [4]}, {"a": 5}]"""
A pandas DataFrame can be constructed from this string:
import pandas as pd
import json
json_df = pd.DataFrame.from_records(json.loads(s))
Which results in
   a    b
0  3   []
1  4  [4]
2  5  NaN
How can all NaN instances of a pandas Series column be filled with empty list values? The expected resulting DataFrame would be:
   a    b
0  3   []
1  4  [4]
2  5   []
I have tried the following, none of which worked:
json_df[json_df.b.isna()] = [[]]*json_df[json_df.b.isna()].shape[0]
from itertools import repeat
json_df[json_df.b.isna()] = repeat([], json_df[json_df.b.isna()].shape[0])
import numpy as np
json_df[json_df.b.isna()] = np.repeat([], json_df[json_df.b.isna()].shape[0])
Thank you in advance for your consideration and response.
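One possible approach, offered as a sketch rather than a definitive answer: boolean-mask indexing on the DataFrame itself (`json_df[mask] = ...`) targets whole rows rather than only column `b`, so it may be simpler to rebuild the column element by element. The example below assumes every valid entry in `b` is a Python list and treats anything else (here, the NaN) as missing.

```python
import json

import pandas as pd

s = """[{"a": 3, "b":[]}, {"a": 4, "b": [4]}, {"a": 5}]"""
json_df = pd.DataFrame.from_records(json.loads(s))

# Assumption: a valid value in "b" is always a list, so any non-list
# entry (such as the float NaN for the missing key) is replaced.
json_df["b"] = json_df["b"].apply(lambda v: v if isinstance(v, list) else [])

print(json_df)
```

Because the lambda builds a new `[]` on each call, every filled row gets its own list, so mutating one of them later does not affect the others.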