How can I fill a dataframe from recursive dictionary values?

Question

I have written a script that reads multiple PDF files and extracts information from them one by one. For each PDF, the script generates a dictionary of data. For example, the first iteration (first PDF file) gives:

d = {"GGT":["transl","mut"], "ATT":["alt3"], "ATC":["alt5"], "AUC":["alteration"]}

The second iteration (second PDF file) gives:

d = {"GGT":["transl","mut"], "AUC":["alteration"]}

... and so on, for up to 200 PDF files.

Initially, I have an empty dataframe whose columns are all the genes that the analysis can detect.

df = pd.DataFrame(data=None, columns=["GGT","AUC","ATC","ATT","UUU","UUT"], dtype=None, copy=False)

Desired output: What I would like to obtain is a dataframe where the information of the values is stored in a recursive way line by line. For example:

Is there an easy way to implement this, or are there functions that can help me?

What does 1º mean? Did you mean №1? Or is this some scientific notation? — Ivan Gorin, Jan 01 '21 at 23:47
How are the different dictionaries stored, e.g. your 1 degree and 2 degree dictionaries? Are they 200 separate variables, or one list of 200 dictionaries? Also, it sounds like you are simply trying to create a 200 x 6 dataframe from the keys and values of the 200 dictionaries? — David Erickson, Jan 01 '21 at 23:55
@David Erickson The dictionary is created in a for loop that reads each PDF one by one, so there are not 200 separate variables. That is why I would like to export the information to the dataframe, then read the next PDF and replace the dictionary variable with the new data. I don't know if that answers your question. — Enrique, Jan 02 '21 at 00:06
Does my answer solve this part of your problem: " For that reason I would like to export the information to dataframe"? — David Erickson, Jan 02 '21 at 00:12

score 2 · Accepted Answer · answered Jan 02 '21 at 00:05

IIUC, you are trying to loop through the dictionaries and add each one as a row in your dataframe? I'm not sure how recursion applies to "What I would like to obtain is a dataframe where the information of the values is stored in a recursive way line by line."

import pandas as pd

d1 = {"GGT":["transl","mut"], "ATT":["alt3"], "ATC":["alt5"], "AUC":["alteration"]}
d2 = {"GGT":["transl","mut"], "AUC":["alteration"]}
dicts = [d1, d2]  # imagine this list contains the 200 dictionaries
columns = ["GGT","AUC","ATC","ATT","UUU","UUT"]
# DataFrame.append was removed in pandas 2.0, so build all rows in one call;
# keys missing from a dictionary become NaN
df = pd.DataFrame(dicts, columns=columns)
df
Out[1]: 
             GGT           AUC     ATC     ATT  UUU  UUT
0  [transl, mut]  [alteration]  [alt5]  [alt3]  NaN  NaN
1  [transl, mut]  [alteration]     NaN     NaN  NaN  NaN
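Since the question's dictionary is overwritten on every pass of the loop, one way to end up with a list like `dicts` is to append each dictionary to a list inside the loop and build the dataframe once at the end. The following is only a sketch: `extract_alterations`, the file names `a.pdf` and `b.pdf`, and the use of file names as the index are assumptions made for illustration, standing in for the real PDF parsing step.

```python
import pandas as pd

GENES = ["GGT", "AUC", "ATC", "ATT", "UUU", "UUT"]

def extract_alterations(pdf_path):
    """Hypothetical stand-in for the PDF parsing step: returns one dict per file."""
    fake_results = {
        "a.pdf": {"GGT": ["transl", "mut"], "ATT": ["alt3"],
                  "ATC": ["alt5"], "AUC": ["alteration"]},
        "b.pdf": {"GGT": ["transl", "mut"], "AUC": ["alteration"]},
    }
    return fake_results[pdf_path]

pdf_paths = ["a.pdf", "b.pdf"]  # placeholder file names

rows = []
for path in pdf_paths:
    d = extract_alterations(path)  # d is replaced on every iteration
    rows.append(d)                 # keep this file's dictionary before it is replaced

# One constructor call: each dictionary becomes a row, and genes that a
# file did not report are filled with NaN.
df = pd.DataFrame(rows, columns=GENES, index=pdf_paths)
print(df)
```

Collecting the rows first and constructing the dataframe once avoids growing the dataframe inside the loop, which copies the data on every iteration.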
