I want to load a 12 GB CSV file into Python and then run some analysis on it. My first attempt was

file_input_to_system = pd.read_csv(usrinput)

but it failed, because pd.read_csv materializes the whole file at once and consumed all my RAM.
My goal now is to stream the file from disk piece by piece instead of holding it all in RAM. I googled and found this sample:
import csv
import pandas as pd

with open("file_path", "r", newline="") as f:
    for row in csv.reader(f):
        # each row is a list of strings; wrapping it in a list gives a one-row DataFrame
        df = pd.DataFrame([row])
        print(df)
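My rough idea for building on that is to batch the rows and build a small DataFrame per batch, so only one batch lives in memory at a time. This is just a sketch of what I have in mind (the batch size is arbitrary, and I am assuming the first line of the file is a header):

import csv
import itertools
import pandas as pd

batch_size = 100_000  # arbitrary; tune to whatever fits in RAM
with open("file_path", "r", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)  # assuming the first row holds the column names
    while True:
        # pull the next batch of rows without reading the rest of the file
        batch = list(itertools.islice(reader, batch_size))
        if not batch:
            break
        df = pd.DataFrame(batch, columns=header)
        # run the analysis on this batch, then drop it before reading the next one
        print(df.shape)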
But I am not sure whether that is the right way to parse the CSV into DataFrames. I also tried pandas' own chunked reader. It reads the file without consuming all my memory, but as soon as I combine the chunks back into a single DataFrame, my memory is exhausted again:
chunksize = 100
df = pd.read_csv("C:/Users/user/Documents/GitHub/MyfirstRep/export_lage.csv",
                 iterator=True, chunksize=chunksize)
df = pd.concat(df, ignore_index=True)  # this rebuilds the full 12 GB DataFrame in memory
print(df)
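I suspect the pd.concat call is the problem: it stitches every chunk back into one full-size DataFrame, so the chunking buys nothing. What I think I want is to reduce each chunk to a small result and never hold more than one chunk at a time, roughly like this (the per-chunk row count is only a placeholder for whatever analysis I actually need):

import pandas as pd

chunksize = 100_000  # rows per chunk; my original value of 100 was probably far too small
reader = pd.read_csv(
    "C:/Users/user/Documents/GitHub/MyfirstRep/export_lage.csv",
    iterator=True,
    chunksize=chunksize,
)

total_rows = 0
for chunk in reader:
    # reduce each chunk to something tiny instead of keeping the chunk itself
    total_rows += len(chunk)

print(total_rows)  # computed without ever holding the whole file in memory

Is this the right pattern for analyzing a file that does not fit in RAM, or should I be using a different approach entirely?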