Sorry in advance if the question seems very basic. I am from a non-technical background and have just started my data science journey. I have a MySQL database that is about 50 GB in size. Jupyter is installed on a server. I want to understand where the data accessed via the MySQL connector and "pd.read_sql" is stored when I work in Jupyter. Also, what server configuration would I need to work with a database this large?
1 Answer
If you are reading with the pandas API, then I believe the entire result set is loaded into memory, so working with data of that size could be difficult depending on the server's resources.
You may want to check out this answer: How to create a large pandas dataframe from an sql query without running out of memory?
It uses chunking to process the data a piece at a time, but with data this large I would recommend Spark if possible. Rough sketches of both approaches are below.
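A minimal sketch of the chunked approach, assuming a SQLAlchemy connection string and a hypothetical table name `my_table` (the host, credentials, database, and table are placeholders you would replace with your own):

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string; swap in your host, credentials, and database.
engine = create_engine("mysql+mysqlconnector://user:password@host:3306/mydb")

# Passing chunksize makes read_sql return an iterator of DataFrames,
# so the full 50 GB result never has to sit in memory at once.
total_rows = 0
for chunk in pd.read_sql("SELECT * FROM my_table", engine, chunksize=100_000):
    # Process each chunk here: aggregate, filter, write to disk, etc.
    total_rows += len(chunk)

print(total_rows)
```

And a rough PySpark equivalent using the JDBC reader (this assumes the MySQL JDBC driver is available on the Spark classpath; again, names are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mysql-read").getOrCreate()

# Spark reads lazily and distributes the work across executors, so the
# whole table does not need to fit in the driver's memory.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:mysql://host:3306/mydb")
    .option("dbtable", "my_table")
    .option("user", "user")
    .option("password", "password")
    .load()
)

df.groupBy("some_column").count().show()
```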

Smurphy0000