
As I am new to the Big Data platform, I would like to do some feature engineering work with my data. The database size is about 30-50 GB. Is it possible to load the full data (30-50 GB) into a data frame, like a pandas data frame?

The database used here is Oracle. I tried to load it, but I am getting an out-of-memory error. Furthermore, I would like to work in Python.
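One common way to avoid the out-of-memory error is to stream the table in chunks rather than loading all 30-50 GB at once. A minimal sketch with `pandas.read_sql` and `chunksize` is below; an in-memory SQLite database stands in for the Oracle connection (which would use `cx_Oracle`/`oracledb` in practice), and the table and column names are made up for illustration:

```python
import sqlite3
import pandas as pd

# SQLite stands in for the Oracle connection here; the table name
# "measurements" and its columns are hypothetical.
con = sqlite3.connect(":memory:")
con.executescript(
    "CREATE TABLE measurements (id INTEGER, value REAL);"
    "INSERT INTO measurements VALUES (1, 1.5), (2, 2.5), (3, 3.5);"
)

# chunksize makes read_sql return an iterator of small DataFrames,
# so only one chunk is in memory at a time.
row_count = 0
value_sum = 0.0
for chunk in pd.read_sql("SELECT * FROM measurements", con, chunksize=2):
    # aggregate incrementally instead of concatenating everything
    row_count += len(chunk)
    value_sum += chunk["value"].sum()

print(row_count, value_sum)  # 3 7.5
```

This works well when the feature engineering can be expressed as per-chunk or incremental aggregation; it does not help if an operation truly needs all rows in memory at once.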

APC
Taimur Islam
  • full data (30-50Gb) – Taimur Islam Jan 17 '19 at 09:32
  • There are lots of useful tips [in this answer here](https://stackoverflow.com/a/44207661/146325). Also read [this question's thread](https://stackoverflow.com/q/11622652/146325) – APC Jan 17 '19 at 09:32

1 Answer


pandas is not a good fit if you have GBs of data; it would be better to use a distributed architecture to improve speed and efficiency. There is a library called Dask that can load large datasets and process them with a distributed architecture.

codebr