I'm working in a Jupyter notebook. I have a large amount of data that I have to load initially and then work with, and I don't want to reload it every time I shut down and restart my laptop or the notebook. When I save and checkpoint the notebook, does that save the loaded data and all the work I've done, so that if I closed the notebook and re-opened it later I could start where I'd left off? Or do I need to use something like pickle? If so, could someone please provide an example of how I could use pickle (or something similar) to save my data and work, and reload it later?
In R I would just save an .RData file and load it later. I'm a little new to Python.
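To make the question concrete, this is the kind of save/reload round trip I'm hoping exists on the Python side (a minimal sketch with a toy DataFrame standing in for my real data; the file name df_business.pkl is just a placeholder):

import pandas as pd

# Toy stand-in for the real, expensive-to-load data
df_business = pd.DataFrame({'business_id': ['abc', 'xyz'], 'stars': [4.5, 3.0]})

# End of session: persist the DataFrame to disk
df_business.to_pickle('df_business.pkl')

# New session: reload it instead of repeating the initial load
df_business = pd.read_pickle('df_business.pkl')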
Update:
Code:
print(df_business[1:3])
Sample Data:
address attributes \
1 2824 Milton Rd {u'GoodForMeal': {u'dessert': False, u'latenig...
2 337 Danforth Avenue {u'BusinessParking': {u'garage': False, u'stre...
business_id categories \
1 mLwM-h2YhXl2NCgdS84_Bw [Food, Soul Food, Convenience Stores, Restaura...
2 v2WhjAB3PIBA8J8VxG3wEg [Food, Coffee & Tea]
city hours is_open \
1 Charlotte {u'Monday': u'10:00-22:00', u'Tuesday': u'10:0... 0
2 Toronto {u'Monday': u'10:00-19:00', u'Tuesday': u'10:0... 0
latitude longitude name neighborhood \
1 35.236870 -80.741976 South Florida Style Chicken & Ribs Eastland
2 43.677126 -79.353285 The Tea Emporium Riverdale
postal_code review_count stars state
1 28215 4 4.5 NC
2 M4K 1N7 7 4.5 ON
Update 2:
Code:
import pickle
your_data = df_business
# Store data (serialize)
with open('filename.pickle', 'wb') as handle:
pickle.dump(your_data, handle, protocol=pickle.HIGHEST_PROTOCOL)
# Load data (deserialize)
with open('filename.pickle', 'rb') as handle:
unserialized_data = pickle.load(handle)