I have created a DataFrame from a lot of Excel files; looping over them to import everything into pandas took about 2 hours. I now need to work with this data at different times. Is there an efficient way to save the DataFrame and load it again later?
Viewed 212 times
- What is the size of your DataFrame? – k33da_the_bug Jan 24 '21 at 05:12
- Does [this](https://stackoverflow.com/questions/17098654/how-to-store-a-dataframe-using-pandas) solve your problem? – k33da_the_bug Jan 24 '21 at 05:14
- You can save it as a parquet file. – D. Seah Jan 24 '21 at 05:16
- I have 200 million rows and 100 columns in my DataFrame. – raheel Jan 24 '21 at 08:10
1 Answer
You can save your pandas DataFrame to a file and load it again any time you want to work on it. For example:

import pandas as pd

df1 = pd.DataFrame([['a', 'b'], ['c', 'd']],
                   index=['row 1', 'row 2'],
                   columns=['col 1', 'col 2'])
df1.to_excel("output.xlsx")
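With 200 million rows, an Excel round trip will be very slow; pandas' binary pickle format is a much faster way to cache a frame between sessions. A minimal sketch (the file name `df_cache.pkl` is illustrative):

```python
import pandas as pd

# Small example frame standing in for the large one.
df1 = pd.DataFrame([['a', 'b'], ['c', 'd']],
                   index=['row 1', 'row 2'],
                   columns=['col 1', 'col 2'])

# Save once after the slow Excel import...
df1.to_pickle("df_cache.pkl")

# ...then reload it instantly in any later session.
df2 = pd.read_pickle("df_cache.pkl")
assert df1.equals(df2)
```

Pickle preserves the index and dtypes exactly, so the reloaded frame is identical to the one you saved.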

Ayesha Khan
- That doesn't matter; you can still use this code to save your DataFrame. – Ayesha Khan Jan 24 '21 at 10:02
- If you're working with massive Excel files, try not to. They're a very difficult format to process, especially when your data is large. Consider serialized formats such as parquet, csv, json, or pickle (Python's binary stream). – Ayesha Khan Jan 24 '21 at 11:11