14

A memory error occurs in Amazon SageMaker when preprocessing 2 GB of data stored in S3. Loading the data works fine. The data has 7 million rows and 64 columns. One-hot encoding is also not possible: attempting it results in a memory error. The notebook instance is ml.t2.medium. How can I solve this issue?
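For context, a minimal sketch of the kind of pipeline that hits this (the S3 path and column name are placeholders, and pandas `get_dummies` is assumed for the one-hot step):

```python
import pandas as pd

# Placeholder S3 path; the real data is ~7 million rows x 64 columns (~2 GB).
df = pd.read_csv("s3://my-bucket/my-data.csv")  # loading succeeds

# One-hot encoding a categorical column then raises MemoryError on ml.t2.medium:
encoded = pd.get_dummies(df, columns=["some_categorical_column"])
```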

James Z
VaRun Sabu
  • 3
    I ran into a similar problem. I opened a terminal (via Jupyter) on the same SageMaker machine. There is *plenty* of memory, both RAM and disk (using `free` and `df` to check). It looks like a bug. Everything works fine in the terminal, and I can allocate memory from there (e.g. by creating large objects in a Python REPL). – Tyler Aug 22 '18 at 22:16
  • 5
    Was a solution ever found for this? I'm running into it over a year later. – Valevalorin Aug 13 '19 at 21:37

2 Answers

6

I assume you're processing the data on the notebook instance, right? ml.t2.medium has only 4 GB of RAM, so it's quite possible you're simply running out of memory.

Have you tried a larger instance? The specs are here: https://aws.amazon.com/sagemaker/pricing/instance-types/
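A quick way to confirm that (a sketch, assuming pandas and psutil are available in the notebook kernel; the S3 path is a placeholder) is to compare the DataFrame's in-memory footprint with the instance's RAM:

```python
import pandas as pd
import psutil

# Placeholder path; substitute your own S3 location.
df = pd.read_csv("s3://my-bucket/my-data.csv")

# In-memory size of the loaded DataFrame, in GB (deep=True also counts object/string columns).
df_gb = df.memory_usage(deep=True).sum() / 1024**3

# Total and currently available RAM on the notebook instance, in GB.
mem = psutil.virtual_memory()
print(f"DataFrame: {df_gb:.2f} GB")
print(f"RAM total: {mem.total / 1024**3:.2f} GB, available: {mem.available / 1024**3:.2f} GB")
```

One-hot encoding can multiply that footprint several times over, so a dataset that loads comfortably can still blow past 4 GB once encoded.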

Julien Simon
0

Can you create a post with your question on the AWS forum at https://forums.aws.amazon.com/forum.jspa?forumID=285? That way, the SageMaker team will be able to help you out.