Is there a limit on the number of rows that NumPy's array_split method can handle?
I have a dataframe with more than 6 million rows and would like to split it into 20 or so chunks.
My attempt followed the approach described in: Split a large pandas dataframe. It uses NumPy's array_split function; however, because my dataframe is very large, the call just goes on forever.
My dataframe is df, which has 8 columns and 6.6 million rows.
df_split = np.array_split(df,20)
Any ideas on an alternative method to split this? Alternatively, tips to improve dataframe performance are also welcome.
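To make the question concrete, here is a minimal sketch of the kind of alternative I have in mind, assuming plain positional slicing with iloc would be acceptable. The helper name split_dataframe and the small demo frame are hypothetical stand-ins, not my real data, and I have not measured whether this is actually faster on 6.6 million rows.

```python
import numpy as np
import pandas as pd

def split_dataframe(df, n_chunks):
    """Split df row-wise into n_chunks contiguous pieces (hypothetical helper)."""
    n_rows = len(df)
    # Integer boundaries 0 = b0 <= b1 <= ... <= bn = n_rows, roughly evenly spaced
    bounds = [i * n_rows // n_chunks for i in range(n_chunks + 1)]
    # Each chunk is a positional slice of the original frame
    return [df.iloc[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]

# Small stand-in for the real df (10 rows, 8 columns)
demo = pd.DataFrame(np.arange(80).reshape(10, 8))
chunks = split_dataframe(demo, 3)
```

On the real data the call would be split_dataframe(df, 20). Chunk sizes may differ by one row from what array_split produces, since the remainder rows are distributed differently.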