I have a CSV file that I read using pandas. I want to split the dataframe into chunks based on the values in a specified column:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
list_of_classes=[]
# Reading file
fileName = 'Training.csv'
df = pd.read_csv(fileName)
classID = df.iloc[:,-2]
len(classID)
df.iloc[0,-2]
for i in range(len(classID)):
    print(classID[i])
    if classID[i] not in list_of_classes:
        list_of_classes.append(classID[i])
for i in range(len(df)):
...............................
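For what it's worth, the loop that builds list_of_classes can probably be replaced by `Series.unique()`. A minimal sketch, assuming the class IDs are plain integers; the `class_id` series here is made-up sample data standing in for `df.iloc[:,-2]`:

```python
import pandas as pd

# Hypothetical stand-in for the class-ID column read from Training.csv
class_id = pd.Series([255, 155, 1005, 255, 1005, 155])

# unique() returns the distinct values in order of first appearance
list_of_classes = class_id.unique().tolist()
print(list_of_classes)
```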
UPDATE
Say the dataframe looks like this:
........................................
Feature0 Feature1 Feature2 Feature3 ......... classID lastColum
190 565 35474 0.336283 2.973684 255 0
311 984 113199 0.316057 3.163987 155 0
310 984 94197 0.315041 3.174194 1005 0
280 984 116359 0.284553 3.514286 255 18
249 984 107482 0.253049 3.951807 1005 0
283 984 132343 0.287602 3.477032 155 0
213 984 88244 0.216463 4.619718 255 0
839 984 203139 0.852642 1.172825 255 0
376 984 105133 0.382114 2.617021 1005 0
324 984 129209 0.329268 3.037037 1005 0
In this example, the result I'm aiming for is 3 dataframes, each containing only one classID: 155, 1005, or 255. My question is: is there a cleaner way to do this?
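One approach that might fit is `DataFrame.groupby` on the class column, collecting each group into a dict keyed by class ID. This is only a sketch: the small dataframe below is made-up sample data (not the real Training.csv), and `frames` and `class_col` are names chosen for illustration. It assumes the class column is the second-to-last one, as in the `df.iloc[:,-2]` call above.

```python
import pandas as pd

# Small made-up sample; only the class column matters for the split
df = pd.DataFrame({
    "Feature0": [1, 2, 3, 4, 5, 6],
    "classID": [255, 155, 1005, 255, 1005, 155],
    "lastColum": [0, 0, 0, 18, 0, 0],
})

# Name of the second-to-last column, matching df.iloc[:, -2] in the question
class_col = df.columns[-2]

# One sub-dataframe per distinct class ID, keyed by that ID
frames = {class_id: group for class_id, group in df.groupby(class_col)}

for class_id, group in frames.items():
    print(class_id, len(group))
```

Each value in `frames` is an ordinary dataframe holding the rows of a single class, so `frames[255]` would give the rows whose classID is 255.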