I want to create one large pd.DataFrame out of seven 4GB .txt files, which I then want to work with and save to .csv.
What I did:
In a for loop, I opened each file and concatenated it onto the growing DataFrame on axis=0, so that my index (a timestamp) continues across files.
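Roughly, it looked like this (a minimal sketch of what I did, assuming tab-separated files and a `timestamp` column; the file names are placeholders):

```python
import pandas as pd

# Hypothetical file names; in reality these are seven ~4GB .txt files.
files = [f"data_{i}.txt" for i in range(7)]

big = pd.DataFrame()
for path in files:
    # parse the timestamp column so it becomes the index
    df = pd.read_csv(path, sep="\t", parse_dates=["timestamp"],
                     index_col="timestamp")
    # concatenate one file at a time along axis=0
    big = pd.concat([big, df], axis=0)

big.to_csv("combined.csv")
```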
However, I am running into memory problems, even though I am working on a server with 100GB of RAM. I have read that pandas can need roughly 5-10x the on-disk data size in memory.
What are my alternatives?
One idea is to create an empty .csv, then read each .txt file in chunks and append each chunk to the .csv as I go, as sketched below.
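If I understand it correctly, that would look roughly like this (a sketch, assuming tab-separated input and a chunk size of one million rows; both are placeholders):

```python
import pandas as pd

files = [f"data_{i}.txt" for i in range(7)]  # hypothetical names
out = "combined.csv"

first = True
for path in files:
    # stream each 4GB file in pieces instead of loading it whole
    for chunk in pd.read_csv(path, sep="\t", chunksize=1_000_000,
                             parse_dates=["timestamp"],
                             index_col="timestamp"):
        # write the header once, then append; only one chunk
        # is ever held in memory at a time
        chunk.to_csv(out, mode="w" if first else "a", header=first)
        first = False
```

This way the full DataFrame never exists in memory, though I lose the ability to work with the whole dataset at once.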
Other ideas?