import pandas as pd
import numpy as np
df = pd.DataFrame({
    'ser_no': [1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
    'co_nm': ['aa', 'aa', 'aa', 'bb', 'bb', 'bb', 'bb', 'cc', 'cc', 'cc',
              'aaa', 'aaa', 'aaa', 'bba', 'bba', 'bba', 'bba', 'cca', 'cca', 'cca'],
    'lat': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
})
df_splits = np.array_split(df, 4)
This is how I currently split the DataFrame. I want the split to keep each unique 'co_nm' value together, i.e. all rows for which 'co_nm' == 'aa' should end up in the same split, and the same goes for every other value of 'co_nm'.
Is this possible? The existing method (np.array_split) doesn't seem to be able to do it, since it cuts purely by row position.
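
To make the requirement concrete, here is a rough sketch of one way it might be done: split the distinct 'co_nm' values into chunks instead of the rows, then select the rows belonging to each chunk. The helper name `split_by_group` and the variable `df_splits_grouped` are made up for illustration, and the sketch assumes it is acceptable for the splits to be balanced by number of distinct 'co_nm' values rather than by row count.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ser_no': [1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
    'co_nm': ['aa', 'aa', 'aa', 'bb', 'bb', 'bb', 'bb', 'cc', 'cc', 'cc',
              'aaa', 'aaa', 'aaa', 'bba', 'bba', 'bba', 'bba', 'cca', 'cca', 'cca'],
    'lat': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
})


def split_by_group(frame, key, n_splits):
    """Hypothetical helper: split `frame` so each value of `key` stays in one piece."""
    # Distinct key values in order of first appearance, as a plain NumPy array.
    keys = frame[key].drop_duplicates().to_numpy()
    # Split the key values (not the rows) into n_splits chunks.
    key_chunks = np.array_split(keys, n_splits)
    # Each piece holds every row whose key falls in the corresponding chunk.
    return [frame[frame[key].isin(chunk)] for chunk in key_chunks]


df_splits_grouped = split_by_group(df, 'co_nm', 4)

for part in df_splits_grouped:
    print(part['co_nm'].unique(), len(part))
```

With this approach the pieces can differ in row count, because the chunks are formed over 'co_nm' values and each value may cover a different number of rows. If there are fewer distinct values than requested splits, some pieces will be empty.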