I have a dataframe with 2000 columns and would like to write fast code to split it into 10 groups of 200 columns.

df_name = ['df1','df2','df3','df4','df5','df6','df7','df8','df9','df10']

for name in df_name:
    for n in np.arange(0,2000,200):
        name = df[df.columns[n:n+200]]
Noel
  • You can use that piece of code if you use a dictionary with keys df1, df2... – Raf Apr 13 '19 at 14:51
  • This question asks about splitting the columns into different sets, while https://stackoverflow.com/questions/17315737/split-a-large-pandas-dataframe refers to splitting the lines into different sets. – Xavier Nodet Apr 13 '19 at 16:53

1 Answer

Because assigning name = ... inside the loop only rebinds the loop variable and does not create variables called df1, df2, and so on, consider building a dictionary of data frames instead. A dictionary comprehension with zip can iterate elementwise through df_name and the multiples of 200:

# n:n+200 selects exactly 200 columns (the slice end is exclusive)
df_dict = {k: df[df.columns[n:n+200]]
           for k, n in zip(df_name, range(0, 2000, 200))}
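
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The zero-filled frame and the column labels c0 to c1999 are placeholders I made up to stand in for the real data, and it uses positional slicing with iloc rather than indexing by column labels, which should give the same groups.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: 5 rows, 2000 columns named c0..c1999.
df = pd.DataFrame(np.zeros((5, 2000)), columns=[f"c{i}" for i in range(2000)])

df_name = [f"df{i}" for i in range(1, 11)]  # 'df1' ... 'df10'
chunk = 200                                 # columns per group

# Pair each name with the starting column position of its group,
# then slice the columns by position.
df_dict = {
    name: df.iloc[:, start:start + chunk]
    for name, start in zip(df_name, range(0, df.shape[1], chunk))
}
```

Each value in df_dict is an ordinary DataFrame with 200 columns, so df_dict['df2'] starts at column c200 in this made-up example.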

A data frame loses no functionality when stored in a container such as a tuple, list, or dictionary:

df_dict['df1'].describe()
df_dict['df2'].head()
df_dict['df3'].tail()
...
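
Keeping the pieces in a dictionary also makes it easy to loop over all of them or to put them back together. The sketch below uses a tiny made-up frame (6 columns split into 3 groups of 2) only to keep it short; the names shapes and rebuilt are illustrative.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame: 2 rows, 6 columns, split into 3 groups of 2 columns.
df = pd.DataFrame(np.arange(12).reshape(2, 6), columns=list("abcdef"))
df_dict = {f"df{i + 1}": df.iloc[:, n:n + 2] for i, n in enumerate(range(0, 6, 2))}

# Apply the same operation to every piece in one pass.
shapes = {name: part.shape for name, part in df_dict.items()}

# Reassemble the original column layout from the pieces
# (dictionaries preserve insertion order, so the groups stay in sequence).
rebuilt = pd.concat(list(df_dict.values()), axis=1)
```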
Parfait