I have a huge dataset for training a deep learning model. It's in .csv format and around 2 GB, and right now I'm just loading the entire thing into memory with pandas:
import pandas as pd
df = pd.read_csv('test.csv')
and then passing everything to the Keras model and training it like this:
model.fit(df, targets)
I want to know what other options I have when dealing with even larger datasets, say around 10 GB. I don't have the RAM to load everything into memory and pass it to the model.
One way I can think of is to somehow get a random sample/subset of the data from the .csv file and feed it to the model via a data generator, but the problem is that I couldn't find any way to read a subset/sample of a csv file without loading everything into memory. A rough sketch of the kind of generator I have in mind is below.
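To make the idea concrete, this is roughly the shape of generator I'm imagining (the file path and the 'target' column name are just placeholders for my real data; I've used pandas' chunksize here only to illustrate yielding batches, and I'm not sure whether this is actually a memory-safe or sensible approach, plus it reads the rows sequentially rather than as a random sample):

import pandas as pd

def csv_batch_generator(path, batch_size=128):
    # Loop forever so Keras can keep pulling batches across epochs
    while True:
        # chunksize makes pandas read the file lazily, one chunk of rows
        # at a time, instead of loading the whole CSV into memory
        for chunk in pd.read_csv(path, chunksize=batch_size):
            features = chunk.drop(columns=['target']).to_numpy()
            targets = chunk['target'].to_numpy()
            yield features, targets

# then something like (with n_rows being the total number of rows):
# model.fit(csv_batch_generator('train.csv'),
#           steps_per_epoch=n_rows // 128, epochs=10)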
How can I train the model without loading everything into memory? Solutions that still use some memory are fine too; just let me know.