For example, an ml.p2.8xlarge training job in ap-northeast on SageMaker costs 16.408 USD/hour, but an on-demand p2.8xlarge in ap-northeast on EC2 costs 12.336 USD/hour. Is it cheaper to just train the DL models on EC2 rather than SageMaker if we only use it for training?
5 Answers
I've listed some pros and cons from experience, as opposed to marketing materials. If I were to guess, I'd say you have a much higher chance of experiencing all the drawbacks of SageMaker than any one of the benefits.
Drawbacks
- Cloud vendor lock-in: it becomes difficult to benefit from future free improvements in open source projects and from better prices at competing vendors. Why doesn't AWS invest developers in JupyterLab? They have done limited work in open source. Find some great points here, where people have seen companies use as few AWS services as possible to good effect.
- SageMaker instances are currently 40% more expensive than their EC2 equivalent.
- Slow startup: it will break your workflow if the machine takes ~5 minutes every time you start it. SageMaker Studio apparently speeds this up, but not without other issues. This is completely unacceptable when you are trying to code or run applications.
- SageMaker Studio is the first thing they show you when you enter SageMaker console. It should really be the last thing you consider.
- SageMaker Studio is more limited than SageMaker notebook instances. For example, you cannot mount an EFS drive. I spoke to an AWS solutions architect, and he confirmed this was impossible (after looking for the answer all over the internet). It is also very new, so there is almost no support for it, even from AWS developers.
- Worsens the disorganised notebooks problem. Notebooks in a file system can be much easier to organise than using JupyterLab. With SageMaker Studio, a new volume gets created and your notebooks live in there. What happens when you have more than 1...
- Awful/limited terminal experience, coupled with tedious configuration (via lifecycle configuration scripts, which require the notebook to be turned off just to edit them). Additionally, you cannot set any lifecycle configurations for Studio notebooks.
- SageMaker endpoints are limited compared to running your own server in an EC2 instance.
- It may seem like it allows you to skip certain challenges, but in fact it provides you with more obscure challenges that no one has solved. Good luck solving them. The rigidity of SageMaker and lack of documentation means lots of workarounds and pain. This is very expensive.
Benefits
These revolve around the SageMaker SDK (the SageMaker console and SageMaker SDK). Please comment or edit if you have found any more benefits.
- Built-in algorithms (which you can easily just import in your machine learning framework of choice): I would say this is worse than using open source alternatives.
- Training many models easily during hyperparameter search (see the YouTube video by AWS). This is a fast way to spend money.
- Easily create machine-learning-related AWS Mechanical Turk tasks. However, MTurk is very limited within SageMaker, so you're better off going to MTurk yourself.
My suggestion
If you're thinking about ML on the cloud, don't use SageMaker. Spin up a VM with a prebuilt image that has PyTorch/TensorFlow and JupyterLab and get the work done.
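As a rough illustration of the "spin up a VM with a prebuilt image" suggestion, here is a minimal sketch using boto3's `run_instances` call. The AMI ID is a hypothetical placeholder (look up a current deep learning image for your region yourself), the region `ap-northeast-1` is an assumption based on the question, and the helper names `build_launch_params` and `launch_training_vm` are made up for this example.

```python
def build_launch_params(ami_id, instance_type="p2.8xlarge", key_name=None):
    """Build keyword arguments for EC2 run_instances (one instance only)."""
    params = {
        "ImageId": ami_id,              # prebuilt image with PyTorch/TensorFlow
        "InstanceType": instance_type,  # instance type taken from the question
        "MinCount": 1,
        "MaxCount": 1,
    }
    if key_name is not None:
        params["KeyName"] = key_name    # name of an existing EC2 key pair for SSH
    return params


def launch_training_vm(ami_id, region="ap-northeast-1", **kwargs):
    """Launch the instance and return its ID. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so build_launch_params works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**build_launch_params(ami_id, **kwargs))
    return response["Instances"][0]["InstanceId"]
```

Calling `launch_training_vm("ami-...", key_name="my-key")` starts a billable instance, so remember to stop or terminate it when training finishes.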

- Other benefits include: AWS service integration (Spark & Step Functions SDKs, CloudWatch metrics, IoT Greengrass edge deploy, Fargate/ECS deploy), BYOA/BYOM (script mode for MXNet, TensorFlow, and PyTorch), serverless inference (batch transform & hosting services), fully managed infra (easily spin up multi-GPU/CPU orchestration, ready pre-built containers for MXNet/TF/Torch). The drawbacks you list can be subjective. Vendor lock-in is inevitable with any cloud provider and 3rd party platform. Curious what platform for data sci/eng you prefer/recommend? – thePurplePython Oct 17 '20 at 17:03
- Thanks for listing the other benefits. I currently don't do much work on notebooks, but I would be interested in trying out the AI Platform Notebooks by Google the next time I use notebooks. – Ben Butterworth Oct 19 '20 at 20:04
- `Vendor lock-in is inevitable with any cloud provider and 3rd party platform`. It is not inevitable: these providers often offer a service that works on top of open source alternatives, so it is easier to change providers. These services can have a higher price than a vendor lock-in alternative. Just [duck](https://duckduckgo.com/?q=cloud+vendor+lock+in+alternatives&atb=v247-1&ia=web&iai=r1-0&page=1&sexp=%7B%22prodexp%22%3A%22b%22%2C%22prdsdexp%22%3A%22c%22%2C%22biaexp%22%3A%22b%22%2C%22msvrtexp%22%3A%22b%22%2C%22bltexp%22%3A%22b%22%7D) it. – Ben Butterworth Dec 15 '20 at 16:27
- Okay @thePurplePython, I finally got round to actually reading your comment. Do you work for AWS? It seems like you're listing integrations that I would highly recommend people avoid, or do without AWS wrappers entirely. For example, prebuilt containers can run on EC2, so why does SageMaker need 5 minutes to start one up? I'm not sure about any of these "integrations". I don't know anything about PySpark/Step Functions; maybe nobody from AWS knows [either](https://stackoverflow.com/questions/65041847/sagemaker-processing-job-with-pyspark-and-step-functions). – Ben Butterworth Dec 21 '20 at 11:40
- As of Feb 2022, EC2 is only 15-20% cheaper for small to medium size instances. – Pab Feb 20 '22 at 06:24
- SageMaker doesn't take 5 minutes to start up. Only the SageMaker sub-service "SageMaker Notebook" takes 2-3 min to start. It makes no sense to compare EC2 and Notebook. SageMaker Processing, Training and Inference start as fast as EC2. – Hugo Jun 16 '22 at 07:53
- The whole answer is misleading: it compares EC2 (where you buy computing power) and SageMaker Notebook/Studio (where you buy a dev environment with computing power). It makes more sense to compare EC2 (serverful computing) with SageMaker Processing (serverless computing). – Hugo Jun 16 '22 at 07:56
- Hugo, this answer was written almost 2 years ago, and I'm not sure if launch time has been optimised in that time. Comparing "computing power" and a "dev environment" does not make it "misleading". In fact, it's drawn from my experience using the products. If you'd like to create a better answer or constructive comments, feel free. – Ben Butterworth Jun 16 '22 at 08:29
- Here's an interesting comparison of an Amazon AI product vs. open source: `We found that Amazon Forecast is 60% less accurate and 669 times more expensive than running an open-source alternative in a simple cloud server.` [source](https://www.reddit.com/r/MachineLearning/comments/zk6h8q/discussion_amazons_automl_vs_open_source/). – Ben Butterworth Jan 09 '23 at 08:09
You are correct about EC2 being cheaper than SageMaker. However, you have to understand their differences.
- EC2 provides you with computing power
- SageMaker (tries to) provide a fully configured environment and computing power with a seamless deployment model, so you can start training your model on day one
If you look at SageMaker's overview page, it comes with Jupyter notebooks, pre-installed machine learning algorithms, optimized performance, seamless rollout to production, etc.
Note that this is the same trade-off as self-hosting a MySQL server on EC2 versus using AWS-managed RDS MySQL. Managed services always appear more expensive, but if you factor in the time you have to spend maintaining the server, updating packages, etc., the extra 30% cost may be worth it.
So in conclusion, if you would rather save some money and have the time to set up your own server or environment, go for EC2. If you do not want to be bothered with this work and want to start training as soon as possible, use SageMaker.
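The trade-off above can be made concrete with a small back-of-the-envelope calculation. This is only a sketch: the 30% premium and the 12.336 USD/hour EC2 price come from this thread, while the maintenance hours and the engineer's hourly cost are hypothetical inputs you would replace with your own numbers.

```python
def managed_premium_cost(ec2_price_per_hour, training_hours, premium=0.30):
    """Extra USD paid for the managed service over the same hours on EC2."""
    return ec2_price_per_hour * training_hours * premium


def self_managed_overhead(maintenance_hours, engineer_hourly_cost):
    """USD value of the time spent setting up and maintaining your own server."""
    return maintenance_hours * engineer_hourly_cost


def break_even_training_hours(ec2_price_per_hour, maintenance_hours,
                              engineer_hourly_cost, premium=0.30):
    """Training hours at which the managed premium equals the upkeep cost."""
    overhead = self_managed_overhead(maintenance_hours, engineer_hourly_cost)
    return overhead / (ec2_price_per_hour * premium)


# Hypothetical month: 100 training hours, 5 hours of upkeep at 50 USD/hour.
extra = managed_premium_cost(12.336, 100)
upkeep = self_managed_overhead(5, 50)
print("EC2 is cheaper" if upkeep < extra else "SageMaker is cheaper")
```

Below the break-even number of training hours the managed premium costs less than your own upkeep time; above it, self-managing on EC2 wins on cost alone.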

- I need to process 7.5 MB/s of data in SageMaker. What machine size is right for me? Should I also include a GPU? – dotnetavalanche Dec 24 '18 at 09:55
- @dotnetavalanche SageMaker is not for stream processing. You need a framework like Flink, Beam, or Spark Streaming. – halil Apr 24 '19 at 11:21
UPDATE 2022-Apr SageMaker instances are 24% more expensive on average than equivalent EC2 instances - source: @amirathi
OUTDATED 2021-Oct The average premium cost has lowered from previous +30% to +20% meaning SageMaker is becoming cheaper over the years. Disclaimer: I'm only checking EU pricing.
OUTDATED 2020-Nov It is no longer the case that the SageMaker/EC2 (Training) cost ratio is +40%. As of 2020 it's closer to +30%, though it depends on the instance type:
SM Instance Type[1] | SM Cost[1] | EC2 Instance Type[2] | EC2 Cost[2] | Ratio |
---|---|---|---|---|
ml.p3.2xlarge | 4.779 | p3.2xlarge | 3.823 | 1.25 |
ml.p3.8xlarge | 18.35 | p3.8xlarge | 15.292 | 1.20 |
ml.p3.16xlarge | 35.172 | p3.16xlarge | 30.584 | 1.15 |
ml.g4dn.xlarge | 0.921 | g4dn.xlarge | 0.658 | 1.40 |
ml.g4dn.2xlarge | 1.175 | g4dn.2xlarge | 0.94 | 1.25 |
ml.g4dn.4xlarge | 1.881 | g4dn.4xlarge | 1.505 | 1.25 |
ml.g4dn.8xlarge | 3.4 | g4dn.8xlarge | 2.72 | 1.25 |
ml.g4dn.12xlarge | 6.112 | g4dn.12xlarge | 4.89 | 1.25 |
ml.g4dn.16xlarge | 6.8 | g4dn.16xlarge | 5.44 | 1.25 |
ml.g5.xlarge | 1.761 | g5.xlarge | 1.258 | 1.40 |
ml.g5.2xlarge | 1.895 | g5.2xlarge | 1.5156 | 1.25 |
ml.g5.8xlarge | 3.827 | g5.8xlarge | 3.06122 | 1.25 |
ml.g5.48xlarge | 25.46 | g5.48xlarge | 20.3681 | 1.25 |
ml.p4d.24xlarge | 47.086 | p4d.24xlarge | 40.94475 | 1.15 |
AVERAGE | | | | 1.25 |
[1] SageMaker prices are USD, eu-central-1 on-demand per hour, for Training (not hosting). Source: https://aws.amazon.com/sagemaker/pricing/
[2] All EC2 prices are USD for eu-central-1 on-demand per hour. Source: https://aws.amazon.com/ec2/pricing/on-demand/
Source: https://docs.google.com/spreadsheets/d/1g1uMPQm48pRlKE6Vv1fYIKzMIxOaG-6Sa43U1y0GU_I/
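For anyone who wants to redo the arithmetic, the ratio column and the average can be recomputed with a few lines of Python. This is just a sketch using the prices copied from the table above (USD per hour, eu-central-1); the names `PRICES`, `premium_ratio`, and `average_ratio` are made up for illustration.

```python
# instance type -> (SageMaker training price, EC2 on-demand price), USD/hour
PRICES = {
    "p3.2xlarge": (4.779, 3.823),
    "p3.8xlarge": (18.35, 15.292),
    "p3.16xlarge": (35.172, 30.584),
    "g4dn.xlarge": (0.921, 0.658),
    "g4dn.2xlarge": (1.175, 0.94),
    "g4dn.4xlarge": (1.881, 1.505),
    "g4dn.8xlarge": (3.4, 2.72),
    "g4dn.12xlarge": (6.112, 4.89),
    "g4dn.16xlarge": (6.8, 5.44),
    "g5.xlarge": (1.761, 1.258),
    "g5.2xlarge": (1.895, 1.5156),
    "g5.8xlarge": (3.827, 3.06122),
    "g5.48xlarge": (25.46, 20.3681),
    "p4d.24xlarge": (47.086, 40.94475),
}


def premium_ratio(sagemaker_price, ec2_price):
    """How many times more expensive SageMaker is than the EC2 equivalent."""
    return sagemaker_price / ec2_price


def average_ratio(prices):
    """Unweighted mean of the per-instance ratios."""
    ratios = [premium_ratio(sm, ec2) for sm, ec2 in prices.values()]
    return sum(ratios) / len(ratios)


for name, (sm, ec2) in PRICES.items():
    print(f"{name:14s} {premium_ratio(sm, ec2):.2f}")
print(f"average        {average_ratio(PRICES):.2f}")
```

The same `premium_ratio` function applied to the prices in the question (16.408 vs 12.336 USD/hour) gives the premium for that specific instance and region.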

- The pricing is region specific. Your prices seem almost correct for only us-east and us-west-oregon. – Ben Butterworth Dec 21 '20 at 11:56
- As of April 2022, SageMaker instances are 24% more expensive on average than equivalent EC2 instances. – amirathi Apr 10 '22 at 07:27
If the question is about cost, then EC2. SageMaker instances are charged at a 25% premium.
You need to decide on your use case. If you want to build a model and deploy it as an API, SageMaker might be a solution, as it covers this end to end.
Also, SageMaker now supports distributed training on multiple nodes, which is not easy to set up if you provision EC2 instances yourself, as you need to set up VPCs and networking, which is a nightmare on AWS.
If you just need a cloud instance for training, go for EC2. Vendor lock-in has also been brought up as a point.
I'm shamelessly talking about our product here. Apologies for that, guys. We have https://netbook.ai/. It's SageMaker for any cloud. You can connect your cloud credentials and use it through this.

EC2 is definitely low cost for small use cases, but for big use cases the maintenance and enhancement costs will grow over time. It will also take a lot of engineering work and cost when you try to implement features like these:
1. Auto-scaling: adding instances at run time according to load. Distributing load and creating and maintaining this type of infrastructure will be very costly.
2. Multi-model server: if you want to merge multiple endpoints so you can use your infrastructure to its full potential, it will not be easy.
3. Versioning and data management: if you want to version your models accurately and manage their source code together with the data, it will not be easy on EC2 instances.
4. Model training cycle: if you want to create an automatic model training cycle based on the data received, you need to create a complete workflow, which is very easy in SageMaker.
5. Incremental learning or transfer learning: if you want to do incremental learning or transfer learning, it will be hard to maintain on EC2 and it will be costly too.
6. Elastic Inference: this speeds up deep learning models and reduces latency. It is out-of-the-box functionality in SageMaker, whereas on EC2 it is costly both to develop and to run.
7. DevOps integration: SageMaker gives you an out-of-the-box CLI for DevOps integration, which you would need to develop yourself for EC2 instances.
I still feel that, for small applications, SageMaker is 2-3 times as costly because it charges per hour. However, you can use SageMaker for batch processing: bring the instance up once, run the predictions for all your permutations, store them in a database, and serve from there. For bigger applications, use it as a real-time predictor.
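The batch approach in the last paragraph (predict once for every permutation, store the results, serve from the database) might look roughly like the sketch below. It is not tied to SageMaker: `predict` is a hypothetical stand-in for your real model call, and SQLite is used only to keep the example self-contained.

```python
import itertools
import sqlite3


def predict(features):
    """Hypothetical stand-in for a real model call; returns a dummy score."""
    return float(sum(features))


def _key(combo):
    """Serialise a feature combination into a lookup key."""
    return ",".join(str(value) for value in combo)


def precompute_predictions(conn, feature_values):
    """Score every combination of feature values once and store the results."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions "
        "(features TEXT PRIMARY KEY, score REAL)"
    )
    rows = [
        (_key(combo), predict(combo))
        for combo in itertools.product(*feature_values)
    ]
    conn.executemany("INSERT OR REPLACE INTO predictions VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def lookup(conn, combo):
    """Serve a stored prediction; returns None if it was never precomputed."""
    row = conn.execute(
        "SELECT score FROM predictions WHERE features = ?", (_key(combo),)
    ).fetchone()
    return None if row is None else row[0]
```

This only works when the input space is small and discrete enough to enumerate; the number of rows grows as the product of the number of values per feature.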

- #3 is very easy on EC2, namely through open source tools like DVC and git. – tmthyjames Oct 29 '20 at 17:52