This is my code:
import pickle
from math import floor
import pandas as pd

data = pd.read_csv(annotated_labelled_csv)
l = int(floor(len(data['#filename'])/num_process))
print(l)
for i in range(0,num_process,1):
    with open("./Pfiles/"+str(i)+".pkl", 'wb') as handle:
        print(len(data.iloc[:, l*i : (l*(i+1))-1]))
        pickle.dump(
            (data.iloc[:, l*i : (l*(i+1))-1].to_dict('dict')),
            handle,
            protocol=pickle.HIGHEST_PROTOCOL
        )
The DataFrame has 8933 rows, and I am dividing that by num_process = 19, which gives 470 after flooring. The DataFrame needs to be split into 19 smaller DataFrames. I have written the code above, but it does not split the DataFrame: the length it reports for each sub-DataFrame is still 8933.
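
A likely cause, as far as I can tell: `data.iloc[:, a:b]` keeps every row and slices the columns, so `len()` of the result is still the full row count. Two smaller issues may also matter: the `-1` on the end bound would drop one row per chunk (slice ends are already exclusive), and flooring 8933/19 leaves 3 rows unassigned. Below is a minimal sketch of a row-wise split, assuming that is what is wanted. The helper name `split_rows`, the synthetic stand-in for the CSV, and the `label` column are made up for illustration.

```python
import numpy as np
import pandas as pd


def split_rows(df, n_parts):
    """Split df into n_parts consecutive row blocks of near-equal size."""
    # np.array_split spreads the remainder over the first few blocks,
    # so no rows are lost when len(df) is not a multiple of n_parts.
    position_blocks = np.array_split(np.arange(len(df)), n_parts)
    return [df.iloc[positions] for positions in position_blocks]


# Synthetic stand-in for the CSV (hypothetical data, same row count).
data = pd.DataFrame({
    "#filename": [f"img_{k}.png" for k in range(8933)],
    "label": np.zeros(8933, dtype=int),
})

# Column slicing keeps all rows, which matches the symptom in the question.
column_slice_len = len(data.iloc[:, 0:469])

# Row-wise split into 19 parts.
chunks = split_rows(data, 19)
chunk_lengths = [len(c) for c in chunks]
print(column_slice_len, chunk_lengths)
```

Each element of `chunks` is an ordinary DataFrame, so `chunk.to_dict('dict')` can be passed to `pickle.dump` inside the existing loop in place of the `iloc[:, ...]` expression.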