
I am handling SQL in Python, and I'm more familiar with pandas DataFrames than with SQL statements, so I want to import the whole table from a MySQL DB and handle it in Python. However, I'm a little worried about running out of memory.

Does pandas need enough memory to hold all the data, or is it enough if memory can just hold the results?

import pandas as pd
import sqlalchemy

# Connection URL format: mysql://user:password@host/database
engine = sqlalchemy.create_engine("mysql://root:password@localhost/mydb")

# Loads the entire table into one DataFrame in memory
df = pd.read_sql_table('table1', engine)

# some operations

print(df)
2 Answers


You need more memory than the results alone, because the operations you run on them need memory too. In addition, the process holds more than just your query results: pandas itself and whatever data already exists in memory take space as well. If memory is an issue, you can pull the data in small batches using the LIMIT and OFFSET clauses.

  • so memory M can be all_data_from_sql > M > result+rest ? – Snoopy Dec 05 '22 at 07:54
  • If in your comment all_data_from_sql means the data pulled from MySQL in one query, the answer is "No", because that result must fit in memory. You have to pull a chunk smaller than memory, run your operations on it, and then pull the next chunk. You can't hold all the data in pandas at once if it doesn't fit in memory. – Ali Enes İşbilen Dec 05 '22 at 08:02
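The LIMIT/OFFSET approach described above can be sketched as follows. This is a minimal illustration, not the asker's actual setup: it uses an in-memory SQLite database as a stand-in for the MySQL connection, and the table and column names are made up.

```python
import pandas as pd
import sqlalchemy

# In-memory SQLite stands in for the MySQL engine from the question;
# with MySQL you would keep the mysql://... URL instead.
engine = sqlalchemy.create_engine("sqlite://")
pd.DataFrame({"x": range(10)}).to_sql("table1", engine, index=False)

page_size = 4
offset = 0
total = 0
while True:
    # Pull one page at a time; only page_size rows are in memory at once.
    # In production, add an ORDER BY clause so pagination is stable.
    page = pd.read_sql(
        f"SELECT x FROM table1 LIMIT {page_size} OFFSET {offset}", engine
    )
    if page.empty:
        break
    total += page["x"].sum()  # aggregate per page, then discard it
    offset += page_size

print(total)  # → 45 (sum of 0..9)
```

Each iteration keeps only one page in memory, so the peak footprint is bounded by `page_size`, not by the table size.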

Pandas needs more memory than the size of your data.
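One way to check this in practice is `DataFrame.memory_usage(deep=True)`, which reports the actual bytes a DataFrame occupies, including the Python string objects that a shallow count misses. A small sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Shadow", "Snoopy"], "score": [33525, 38]})

# deep=True counts the real size of each Python string object,
# not just the 8-byte pointers stored in the object column.
deep_bytes = df.memory_usage(deep=True).sum()
shallow_bytes = df.memory_usage(deep=False).sum()
print(deep_bytes, shallow_bytes)
```

For object (string) columns, the deep figure is noticeably larger, which is one reason a DataFrame can take more memory than the on-disk size of the table suggests.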

Please refer to this thread if you want to query just a part of your database to avoid running out of memory.
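If the full table doesn't fit, pandas can also stream it: passing `chunksize` to `pd.read_sql` returns an iterator of DataFrames instead of one big frame. A minimal sketch, again using an in-memory SQLite database as a stand-in for the MySQL connection in the question:

```python
import pandas as pd
import sqlalchemy

# SQLite in memory stands in for the MySQL engine from the question.
engine = sqlalchemy.create_engine("sqlite://")
pd.DataFrame({"x": range(10)}).to_sql("table1", engine, index=False)

# With chunksize, read_sql yields DataFrames of at most 4 rows each,
# so only one chunk needs to be in memory at a time.
n_rows = 0
for chunk in pd.read_sql("SELECT * FROM table1", engine, chunksize=4):
    n_rows += len(chunk)  # operate on the chunk, then let it go

print(n_rows)  # → 10
```

This achieves the same effect as manual LIMIT/OFFSET pagination, but pandas drives the cursor for you.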

ffrosch