OK, here is my question: I want to use multi-indexing so that I have a 3-D df. I can use

df = pd.concat([df1, df2], keys=('df1','df2'))

but how can I add a new df3 to that df? Essentially, I want to add each new df in a loop, in append mode. I have a few thousand dfs, and storing all of them before I concat them won't be efficient. Is there a way to do that?

More specifically, let's assume I have the following dfs:

df1 = pd.DataFrame(columns=['a', 'b', 'c'])
df2 = pd.DataFrame(columns=['a', 'b', 'c'])
df1.loc['index_1','b'] = 1
df1.loc['index_2','a'] = 2

df2.loc['index_7','a'] = 5
df3 = pd.DataFrame(columns=['a', 'b', 'c'])
df3.loc['index_9','c'] = 1

df = pd.concat([df1, df2], keys=('df1','df2'))


                 a    b    c
df1 index_1    NaN    1  NaN
    index_2      2  NaN  NaN
df2 index_7      5  NaN  NaN

How can I add df3 in a similar way?

saias
  • Appending them in a loop will be even less efficient, as it needlessly copies the entire DataFrame every iteration. Typically you append the DataFrames to a list, and concatenate once. See the last paragraph of [Unutbu's Solution](https://stackoverflow.com/a/31675177/4333359). – ALollz Dec 07 '18 at 14:58
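The list-then-concatenate pattern from the comment above might look like the following minimal sketch. `make_frame` is a hypothetical stand-in for however each DataFrame is actually produced, and the values in it are made up for illustration. Only references to the frames are kept in the list, and `pd.concat` copies the data once at the end.

```python
import pandas as pd

def make_frame(i):
    # Hypothetical helper: builds a one-row frame with illustrative values.
    return pd.DataFrame(
        {'a': [i], 'b': [i * 10], 'c': [i * 100]},
        index=[f'index_{i}'],
    )

frames = []  # the DataFrames, in the order they were produced
keys = []    # the outer index label for each DataFrame

for i in range(1, 4):
    frames.append(make_frame(i))
    keys.append(f'df{i}')

# A single concat builds the two-level index in one pass.
df = pd.concat(frames, keys=keys)
print(df)
```

With a few thousand frames, this avoids re-copying the growing result on every iteration, which is the cost the comment warns about.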

1 Answer

So after a bit of searching, I found that the best way is to tag each df with a column holding its name, concat them into the final df, reset its index, and then set the final MultiIndex. It should look something like this:

import pandas as pd

# create df's
df1 = pd.DataFrame(columns=['a', 'b', 'c'])
df2 = pd.DataFrame(columns=['a', 'b', 'c'])
df3 = pd.DataFrame(columns=['a', 'b', 'c'])

df1.loc['index_1','b'] = 1
df1.loc['index_2','a'] = 2
df2.loc['index_7','a'] = 5
df3.loc['index_9','c'] = 1

# add index in the form of a column
df1['df'] = 'df1' 
df2['df'] = 'df2'
df3['df'] = 'df3'

# reset index and set multiindex
df = pd.concat([df1, df2, df3], sort=True)
df.reset_index(inplace=True)
df.set_index(['df', 'index'], inplace=True)
df



               a    b    c
df  index
df1 index_1  NaN    1  NaN
    index_2    2  NaN  NaN
df2 index_7    5  NaN  NaN
df3 index_9  NaN  NaN    1
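For reference, if a keyed df already exists and one more frame has to be added, one possible sketch is to give the new frame its own outer key by wrapping it in a one-element `pd.concat`, then concatenating the two. This is an alternative to the column-tagging approach above, not part of it. The frames below are built directly with NaN placeholders instead of via `.loc`, purely for brevity.

```python
import pandas as pd

nan = float('nan')

# Frames mirroring the example data.
df1 = pd.DataFrame({'a': [nan, 2.0], 'b': [1.0, nan], 'c': [nan, nan]},
                   index=['index_1', 'index_2'])
df2 = pd.DataFrame({'a': [5.0], 'b': [nan], 'c': [nan]}, index=['index_7'])
df3 = pd.DataFrame({'a': [nan], 'b': [nan], 'c': [1.0]}, index=['index_9'])

# Existing two-level frame.
df = pd.concat([df1, df2], keys=['df1', 'df2'])

# Wrap df3 so it gets the same two-level index shape, then append it.
df3_keyed = pd.concat([df3], keys=['df3'])
df = pd.concat([df, df3_keyed])
print(df)
```

Each such append copies the whole existing frame, so inside a loop over thousands of frames it is likely to be slower than collecting them and concatenating once.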
saias