How to optimize runtime while running nested for loops to unwind a df with JSON objects to a new Pandas Dataframe?

Question

Currently I have a df with JSON objects inside, which looks something like this:

df: n rows, each containing JSON objects in a single column

json_object = { a1, a2, a3, a4, ... }
where a1 = { b1, b2, b3, b4, ... }

df looks like this:

index   col
 1.     [ { {}, {}, .. }, { {}, {}, .. }, { {}, {}, .. },...]
 2.     [ { {}, {}, .. }, { {}, {}, .. }, { {}, {}, .. },...]

The final output should look like this:

index   col
1.      a1->b1
1.      a1->b2
1.      a1->b3
1.      a1->b4
1.      a2->b1
1.      a2->b2
...
...

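My current approach:

# iterate over all n rows in df
for i in range(len(df)):
    objects = df["col"].iloc[i]
    # unpack each JSON object in the row
    for j in range(len(objects)):
        elements = objects[j]
        # unpack each element inside the JSON object
        for k in range(len(elements)):
            pass  # populate a new DataFrame with element k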

Since the approach above has a worst-case runtime of O(n^3), it is taking too long. Is there a better way to do the same thing?

Please add a small sample from the actual data; you basically need to use `json.loads` then explode/apply pd.Series. — ThePyGuy, Oct 20 '21 at 07:11
[JFYI](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) — Danila Ganchar, Oct 22 '21 at 16:00
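
The suggestion in the first comment (`json.loads`, then explode) might look roughly like the sketch below. The question does not include real data, so the sample here is hypothetical: it assumes each cell in `col` is a JSON string encoding a list of objects, and that each object maps an outer key (`a1`, `a2`, ...) to an inner object with keys `b1`, `b2`, ... The names `parsed`, `objects`, `labels`, and `result` are only for illustration.

```python
import json

import pandas as pd

# Hypothetical sample data; the real structure was not shown in the question.
# Assumption: each cell is a JSON string encoding a list of objects, and each
# object maps an outer key (a1, a2, ...) to an inner object (b1, b2, ...).
df = pd.DataFrame(
    {
        "col": [
            '[{"a1": {"b1": 1, "b2": 2}}, {"a2": {"b1": 3}}]',
            '[{"a1": {"b1": 4}}]',
        ]
    },
    index=[1, 2],
)

# Step 1: parse the JSON strings into Python lists of dicts.
parsed = df["col"].apply(json.loads)

# Step 2: one row per JSON object; explode repeats the original index.
objects = parsed.explode()

# Step 3: turn each object into a list of "outer->inner" labels,
# then explode again so every label gets its own row.
labels = objects.apply(
    lambda obj: [f"{a}->{b}" for a, inner in obj.items() for b in inner]
).explode()

result = labels.to_frame("col")
print(result)
```

This still visits every inner element once, so the total work stays proportional to the number of elements. Any speedup would come from building the result in one pass rather than growing a DataFrame row by row inside the loops. If the cells are already parsed Python objects rather than strings, the `json.loads` step can be skipped.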
