Highest Voted 'aws-emr-studio' Questions

4

votes

1 answer

Unable to read S3 files from within AWS EMR Studio notebooks or consoles

We have an EMR Studio with a default S3 bucket set, i.e. s3://OurBucketName/Subdirectory/work, in which we've created a Workspace attached to an EC2 cluster running emr-6.10.0 with the following apps installed: Hadoop…

asked Mar 16 '23 at 20:14

dragonscience

41
3

2

votes

1 answer

Orchestration of jobs using AWS Step Functions using EMR Serverless

Recently Amazon launched EMR Serverless and I want to repurpose my existing data pipeline orchestration that uses AWS Step Functions: There are steps that create an EMR cluster, run some Lambda functions, submit Spark jobs (mostly Scala jobs using…

amazon-emr aws-step-functions aws-emr-studio

asked Jun 10 '22 at 13:59

smishra

3,122
29
31

1

vote

1 answer

How to automate Jupyter notebook execution on AWS?

I have a task in which I need to automate Jupyter notebook execution on AWS. I'm totally new to the AWS environment, so I have no idea how to do it efficiently. Things I need to do are the following - Need REST API(s) to start and stop…

amazon-web-services aws-lambda jupyter-notebook amazon-sagemaker aws-emr-studio

asked Oct 30 '21 at 13:09

user22

112
1
9

1

vote

1 answer

When I save a PySpark DataFrame with saveAsTable in AWS EMR Studio, where does it get saved?

I can save a DataFrame using df.write.saveAsTable('tableName') and read the resulting table with spark.table('tableName'), but I'm not sure where the table is actually getting saved.

python amazon-web-services pyspark amazon-emr aws-emr-studio

asked Aug 24 '21 at 13:18

Tom

11
1

1

vote

1 answer

How to create a notebook in EMR Studio using boto3?

I am going through the boto3 documentation here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr.html#EMR.Client.create_studio but I cannot see any sort of create/delete notebook operation for EMR Studio. Only create/delete…

amazon-web-services boto3 amazon-emr aws-emr-studio

asked May 19 '21 at 12:16

Randomize

8,651
18
78
133

0

votes

0 answers

Simple UDF apply function from the doc is failing with Spark 3.3

This simple code from the latest doc does not work on the EMR Studio Spark cluster (current version: 3.3.1-amzn-0) df = spark.createDataFrame( [(1, 1.0), (1, 2.0), (2, 3.0), (2, 5.0), (2, 10.0)], ("id", "v")) def subtract_mean(pdf:…

pyspark jupyter-notebook user-defined-functions amazon-emr aws-emr-studio

asked Sep 01 '23 at 18:45

mountrix

1,126
15
32

0

votes

0 answers

Databricks format in PySpark to write in Redshift

I am migrating data from Postgres to Redshift using the JDBC format, but with the JDBC format some options, such as escape, are not available for Redshift. So I thought to use the format com.databricks.spark.redshift to write using PySpark.…

apache-spark pyspark aws-glue spark-redshift aws-emr-studio

asked Jun 10 '23 at 04:06

vish anand

111
1
4

0

votes

0 answers

Referencing other notebooks in AWS EMR

I am new to AWS EMR and am trying to configure it to run code that was developed on my local machine. I am basically referencing notebooks within a Masternotebook; this setup works locally but not on AWS EMR. I am trying to execute this line …

pyspark aws-serverless aws-emr-studio

asked May 29 '23 at 14:02

Shanawaz Khan

11
2

0

votes

1 answer

How to read Postgres DB tables through EMR JupyterLab notebook from Amazon workspace

I'm trying to read a table from Postgres, but I'm facing the error below. Note: I cannot refer to external files from my local machine since it is a private workspace. JDBC :…

postgresql apache-spark pyspark amazon-emr aws-emr-studio

asked Dec 20 '22 at 06:13

Sabarish Mahalingam

15
2

0

votes

0 answers

AWS EMR 6.9 with Spark 3.0 and JupyterEnterpriseGateway fails with bootstrapping errors

Struggling to bring up an EMR cluster with Spark 3.x. Using custom / advanced options since I also need JupyterEnterpriseGateway, however bootstrapping fails with unknown errors. Using one of the options available in the preselected packages works but…

amazon-emr aws-emr-studio

asked Nov 27 '22 at 06:30

Mayukh

117
1
4

0

votes

0 answers

Installing Packages onto EMR

I have been scouring the internet for documentation and solutions to this issue that I have been encountering on EMR, but so far no luck! I have been trying to download some packages onto my EMR workspaces, but it throws out the…

amazon-web-services amazon-emr aws-emr-studio

asked Nov 23 '22 at 22:29

thundercat

45
6
