5

I haven't been able to find an answer to this online, so I'm asking the Stack Overflow community...

I'm wondering whether DataSpell can connect to a SageMaker instance and use the EC2 instance hardware (i.e. virtual CPUs, GPUs, RAM, etc.) to run data transformations and machine learning model training in Python files and Jupyter notebooks?

In other words, I want all the advantages of DataSpell on my local computer (git, debugging, auto-complete, refactoring, etc.) together with all the advantages of a SageMaker instance on AWS (scalable compute hardware, fast training, etc.) for running Python files and Jupyter notebooks.

Thank you.

albz

2 Answers

1

I'm in exactly the same boat.

At the moment I still use PyCharm Pro, configured for remote code execution on my dev EC2 instance. I can also run Jupyter notebooks remotely and SSH-tunnel the remote port to my local machine, so the notebooks open in my IDE but execute on the remote hardware.
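To make the tunneling step concrete, here is a minimal sketch that only builds the `ssh` port-forwarding command and does not run it. The host name, user name, and port 8888 are placeholders I made up for illustration; substitute whatever your own instance and Jupyter server use.

```python
import shlex


def jupyter_tunnel_command(host, user, local_port=8888, remote_port=8888):
    """Build an ssh command that forwards a local port to a remote Jupyter port.

    host, user, and the ports are hypothetical placeholders, not values
    taken from any real setup.
    """
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:localhost:{remote_port}",  # local port forwarding
        f"{user}@{host}",
    ]


# Example with made-up values: print the command rather than executing it.
cmd = jupyter_tunnel_command("dev-ec2.example.com", "ubuntu")
print(shlex.join(cmd))
# ssh -N -L 8888:localhost:8888 ubuntu@dev-ec2.example.com
```

With the tunnel open, the IDE can be pointed at `http://localhost:8888` as if the Jupyter server were running locally.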

At the time of writing, DataSpell doesn't support remote interpreter execution, so I still use PyCharm Pro.

There isn't a simple way to do this with SageMaker because of firewall constraints. However, I've seen someone set up SSH tunneling through a bastion host: he would SSH from the SageMaker instance to the bastion to create a reverse tunnel, then SSH into the bastion host and connect to the SageMaker instance through that tunnel. This is very cumbersome, though.
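Roughly, the two SSH commands involved in that bastion setup could look like the sketch below. I haven't verified this on SageMaker myself; it assumes an SSH server is reachable on port 22 of the instance, and the host names, user names, and port 2222 are all hypothetical placeholders. The functions only build the commands, they do not run them.

```python
import shlex


def reverse_tunnel_command(bastion, bastion_user, bastion_port=2222, instance_ssh_port=22):
    """Command to run ON the SageMaker instance (assumes sshd listens there).

    Opens a reverse tunnel so that bastion_port on the bastion's loopback
    interface forwards back to the instance's SSH port.
    """
    return [
        "ssh",
        "-N",  # only forward ports, no remote command
        "-R", f"{bastion_port}:localhost:{instance_ssh_port}",  # reverse forwarding
        f"{bastion_user}@{bastion}",
    ]


def connect_via_bastion_command(bastion, bastion_user, instance_user, bastion_port=2222):
    """Command to run on the local machine.

    Jumps through the bastion (-J), then connects to the forwarded port,
    which lives on the bastion's own localhost.
    """
    return [
        "ssh",
        "-J", f"{bastion_user}@{bastion}",  # ProxyJump through the bastion
        "-p", str(bastion_port),
        f"{instance_user}@localhost",
    ]


# Made-up example values; print the commands instead of executing them.
print(shlex.join(reverse_tunnel_command("bastion.example.com", "tunnel")))
print(shlex.join(connect_via_bastion_command("bastion.example.com", "tunnel", "ec2-user")))
```

The reverse tunnel has to be started from the SageMaker side first, because the instance can make outbound connections but cannot accept inbound ones directly.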

I'm really hoping there will be a better solution to this, though.

Leo Ufimtsev
-1

This cannot be done. You can't bring your own IDE to SageMaker. You can use SageMaker's native IDE, SageMaker Studio, which gives you an integrated experience with all of SageMaker's capabilities.

I work at AWS and my opinions are my own.

Kirit Thadaka