I have a 2.5 GB data set that contains tens of millions of rows.
I'm trying to load the data like this:
%%time
import pandas as pd
data=pd.read_csv('C:\\Users\\mahes_000\\Desktop\\yellow.csv',iterator=True,
chunksize=50000)
This gives me multiple chunks of the specified size, and I'm trying to do some operations on them like:
%%time
data.get_chunk().head(5)
data.get_chunk().shape
data.get_chunk().drop(['Rate_Code'],axis=1)
Each operation picks just one chunk and runs on it alone (each `get_chunk()` call actually returns the *next* chunk, not the same one). So what about the remaining parts? How can I run operations on the complete data set without a memory error?
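For context, a minimal sketch of what I understand looping over every chunk would look like. The tiny in-memory CSV here is a hypothetical stand-in for my real file; with the actual data you would pass the file path instead, and the `fare` column is made up for illustration:

```python
import io
import pandas as pd

# Hypothetical stand-in for the large CSV on disk; replace the buffer
# with the real file path in practice.
csv_data = io.StringIO(
    "Rate_Code,fare\n" + "\n".join(f"{i % 3},{i * 1.5}" for i in range(10))
)

pieces = []
# Passing chunksize makes read_csv yield DataFrames of up to that many
# rows, so only one chunk is held in memory at a time.
for chunk in pd.read_csv(csv_data, chunksize=4):
    # Apply the per-chunk operation, e.g. dropping a column.
    pieces.append(chunk.drop(['Rate_Code'], axis=1))

# Combine the reduced chunks (only safe if the result fits in RAM).
result = pd.concat(pieces, ignore_index=True)
print(result.shape)  # -> (10, 1)
```

Is this loop-and-concat pattern the right way to touch every chunk, or is there a better approach for operations that need the whole data set at once?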