How can I split a DataFrame into groups of columns using a for loop (splitting by columns only, not rows)?

Question

I have a DataFrame with 2000 columns and would like to write fast code that splits it into 10 groups of 200 columns each.

df_name = ['df1','df2','df3','df4','df5','df6','df7','df8','df9','df10']

for name in df_name:
    for n in np.arange(0,2000,200):
        name = df[df.columns[n:n+200]]

You can use that piece of code if you use a dictionary with keys df1, df2... — Raf, Apr 13 '19 at 14:51
This question asks about splitting the columns into different sets, while https://stackoverflow.com/questions/17315737/split-a-large-pandas-dataframe refers to splitting the lines into different sets. — Xavier Nodet, Apr 13 '19 at 16:53

score 0 · Answer 1 · answered Apr 13 '19 at 15:29

Assigning with name = ... inside the loop only rebinds the loop variable; it does not create variables called df1, df2, and so on. Consider building a dictionary of data frames instead, using a dictionary comprehension with zip to iterate elementwise through df_name and the multiples of 200:

# the slice end is exclusive, so n:n+200 selects exactly 200 columns
df_dict = {k: df[df.columns[n:n+200]]
           for k, n in zip(df_name, range(0, 2000, 200))}
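
Since the question asks specifically for a for loop, here is a minimal runnable sketch of the same idea written as a plain loop. The zero-filled frame and the column names c0 ... c1999 are made up for illustration, and iloc is used here as one possible way to slice columns by position; treat it as a sketch under those assumptions rather than the only way to do it.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the asker's frame: 5 rows, 2000 columns (assumed data).
df = pd.DataFrame(np.zeros((5, 2000)), columns=[f"c{i}" for i in range(2000)])

chunk = 200  # columns per group
df_name = [f"df{i}" for i in range(1, 11)]  # 'df1' ... 'df10'

# Plain for loop: store each column slice under its name in a dictionary.
df_dict = {}
for k, n in zip(df_name, range(0, df.shape[1], chunk)):
    df_dict[k] = df.iloc[:, n:n + chunk]  # positional slice, end is exclusive
```

Each value in df_dict is an ordinary data frame holding 200 consecutive columns, so df_dict['df2'] starts at the column in position 200.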

A data frame loses no functionality when it is stored in a container such as a tuple, list, or dictionary:

df_dict['df1'].describe()
df_dict['df2'].head()
df_dict['df3'].tail()
...
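
To illustrate that nothing is lost, the sketch below loops over the stored frames and then stitches them back together column-wise. The small 3 x 6 frame and its column names are hypothetical, chosen only to keep the example short.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame: 3 rows, 6 columns, split into 3 groups of 2.
df = pd.DataFrame(np.arange(18).reshape(3, 6), columns=list("abcdef"))
df_dict = {f"df{i + 1}": df.iloc[:, n:n + 2]
           for i, n in enumerate(range(0, 6, 2))}

# Loop over the stored frames like any other dictionary.
shapes = {name: part.shape for name, part in df_dict.items()}

# Concatenate the pieces column-wise to rebuild the original frame.
rebuilt = pd.concat(list(df_dict.values()), axis=1)
```

Here rebuilt.equals(df) can be used to check that the round trip preserved every column and value.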
