I'm fairly new to Python and data science.
I have a 33 GB CSV dataset, and I want to load it into a DataFrame to do some processing on it.
I tried the usual way, with pandas.read_csv,
and it's taking ages to parse.
I searched on the internet and found this article.
It says that the most efficient way to read a large CSV file is to use csv.DictReader.
So I tried that:
import csv
import pandas as pd
# Read every row as a dict, then build the DataFrame from those dicts
with open("MyFilePath", newline="") as f:
    df = pd.DataFrame(csv.DictReader(f))
Even with this solution, it's taking ages to do the job.
Can you please tell me the most efficient way to parse a large dataset into pandas?
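For context, the kind of thing I'm hoping for is something like reading the file in pieces instead of all at once. Below is a minimal sketch of that idea, assuming the chunksize, usecols and dtype parameters of pandas.read_csv are the right tools for it. The tiny in-memory CSV, its column names, and the sum_by_category helper are all made up for illustration and stand in for my real file.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for the real 33 GB file (hypothetical columns).
sample_csv = io.StringIO(
    "id,category,value\n"
    "1,a,10\n"
    "2,b,20\n"
    "3,a,30\n"
    "4,b,40\n"
    "5,a,50\n"
)


def sum_by_category(source, chunk_rows=2):
    """Sum 'value' per 'category' without holding the whole file in memory."""
    totals = {}
    # With chunksize set, read_csv returns an iterator of DataFrames.
    reader = pd.read_csv(
        source,
        usecols=["category", "value"],  # skip columns that are not needed
        dtype={"value": "int64"},       # declare the type up front
        chunksize=chunk_rows,
    )
    for chunk in reader:
        # Aggregate each chunk, then fold the partial result into the totals.
        partial = chunk.groupby("category")["value"].sum()
        for key, val in partial.items():
            totals[key] = totals.get(key, 0) + int(val)
    return totals


totals = sum_by_category(sample_csv)
print(totals)
```

On the real file, the first argument would be the file path and chunk_rows would be much larger. This only helps if the work can be done chunk by chunk, and I don't know whether that is the most efficient option.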