Questions tagged [spark-ec2]

spark-ec2 is designed to manage multiple named clusters.

You can launch a new cluster (telling the script its size and giving it a name), shut down an existing cluster, or log into a cluster. Each cluster is identified by placing its machines into EC2 security groups whose names are derived from the name of the cluster.
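As a rough illustration of the workflow described above, the sketch below prints (rather than runs) a launch, login, and stop command for one named cluster, along with the security group names that would identify it. The key pair name, key file path, and cluster name are placeholders, and `cluster_groups` is a hypothetical helper. The `-master` / `-slaves` suffixes are an assumption based on the `spark_ec2.py` source, so check them against the version you actually use.

```shell
#!/bin/sh
# Hypothetical helper: print the security group names derived from a cluster name.
# Assumption: spark-ec2 uses "<name>-master" and "<name>-slaves".
cluster_groups() {
    printf '%s-master\n%s-slaves\n' "$1" "$1"
}

# Placeholder values, not taken from any real account.
KEY_PAIR=spark
KEY_FILE="$HOME/.ssh/spark.pem"
CLUSTER=my-spark-cluster

# Dry run: echo the commands instead of contacting AWS.
echo "./spark-ec2 -k $KEY_PAIR -i $KEY_FILE -s 1 launch $CLUSTER"
echo "./spark-ec2 -k $KEY_PAIR -i $KEY_FILE login $CLUSTER"
echo "./spark-ec2 -k $KEY_PAIR -i $KEY_FILE stop $CLUSTER"

# Show which security groups would identify this cluster.
cluster_groups "$CLUSTER"
```

Because the cluster name is the only handle the script uses, two clusters launched with different names stay independent even when they share a key pair and region.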

22 questions
9
votes
1 answer

Change hadoop version using spark-ec2

Is it possible to change the Hadoop version when the cluster is created by spark-ec2? I tried spark-ec2 -k spark -i ~/.ssh/spark.pem -s 1 launch my-spark-cluster, then I log in with spark-ec2 -k spark -i ~/.ssh/spark.pem login…
user3684014
5
votes
1 answer

Spark: How to increase drive size in slaves

How do I start a cluster with slaves that each have a 100GB drive? ./spark-ec2 -k xx -i xx.pem -s 1 --hadoop-major-version=yarn --region=us-east-1 \ --zone=us-east-1b --spark-version=1.6.1 \ --vpc-id=vpc-xx --subnet-id=subnet-xx --ami=ami-yyyyyy \ …
Mohamed Taher Alrefaie
3
votes
1 answer

How do I resolve "Failed to determine hostname of Instance" error using spark-ec2?

I am trying to launch a Spark cluster on EC2 and get the error "Failed to determine hostname of Instance" (sensitive values replaced with *): $ spark-ec2 --vpc-id=vpc-* --subnet-id=subnet-* --slaves=1 --key-pair=* --identity-file=/Users/matthew/.ssh/*…
Matthew Adams
2
votes
0 answers

EC2 spark-shell failed on connection exception: java.net.ConnectException: Connection ref

I have followed the instructions given on the Spark website (http://spark.apache.org/docs/latest/ec2-scripts.html) to set up a simple EC2 cluster, but when I start the spark-shell (./spark/bin/spark-shell) I get a connection refused error. I have added…
add-semi-colons
2
votes
2 answers

spark-ec2 not recognized when launching cluster on Windows 8.1

I'm a complete beginner with Spark. I'm trying to run Spark on Amazon EC2, but my system does not recognize "spark-ec2" or "./spark-ec2". It says "spark-ec2" is not recognized as an internal or external command. I followed the instructions here to…
Daolin
2
votes
0 answers

Configuring spark-ec2

I noticed that when I start my Spark EC2 cluster from my local machine with spark/ec2/spark-ec2 start mycluster the setup routine has a nasty habit of destroying everything I put in my cluster's spark/conf/. Short of having to run a…
Noah
1
vote
0 answers

How to run spark-ec2-branch-2 script for Ohio region?

I am trying to run the spark-ec2-branch-2 script to create a cluster in Ohio. I need to create the cluster in Ohio because Ohio is one of the regions that allows VPC peering. ./spark-ec2 --key-pair=ohio --identity-file=ohio.pem --region=us-east-2…
user3086871
1
vote
2 answers

Hadoop error when using spark-submit

I am trying to run spark-submit on Amazon EC2 with the following: spark-submit --packages org.apache.hadoop:hadoop-aws:2.7.1 --master spark://amazonaws.com SimpleApp.py and I end up with the following error. It seems that it is looking for…
Ray.R.Chua
1
vote
0 answers

Change amazon-linux to ubuntu when loading clusters using spark_ec2.py

When I launch from the provided scripts for EC2 (spark_ec2.py), the cluster gets spun up with amazon-linux nodes. I want it to be Ubuntu. (spark_ec2.py => I am currently using the brew version and hope that's not an issue.) After searching I found…
add-semi-colons
1
vote
2 answers

Changing JDK on cluster deployed with ./spark-ec2

I have deployed an Amazon EC2 cluster with Spark like so: ~/spark-ec2 -k spark -i ~/.ssh/spark.pem -s 2 --region=eu-west-1 --spark-version=1.3.1 launch spark-cluster I copy a file I need first to the master and then from master to HDFS…
Aki K
1
vote
1 answer

How to upgrade Apache Spark version

Currently, I have Spark 1.5.0 installed on AWS using the spark-ec2.sh script. Now I want to upgrade my Spark version to 1.5.1. How do I do this? Is there an upgrade procedure, or do I have to build it from scratch using the spark-ec2 script? In…
Kaushal
1
vote
0 answers

aws cli installed by spark-ec2 from spark-1.4 is out of date

I launched an on-demand Spark cluster using Spark 1.4 and spark-ec2. I then logged into the cluster and found that the aws client is ancient. aws --version aws-cli/0.8.2 Python/2.6.9 Linux/3.4.37-40.44.amzn1.x86_64 On my local machine, the aws client is…
bruce szalwinski
1
vote
1 answer

Apache Spark EC2 Script launching slaves but no master

When using the Apache Spark EC2 script to launch a cluster I have found somewhat of a bug which is beginning to hit my pocket. When specifying the number of slaves: if you enter a number which is greater than or equal to your limit then the cluster…
monster
1
vote
1 answer

Bad SSL Key When Trying to Use spark-ec2 script to launch cluster on EC2?

Version of Apache Spark: spark-1.2.1-bin-hadoop2.4. Platform: Ubuntu. I have been using the spark-1.2.1-bin-hadoop2.4/ec2/spark-ec2 script to create temporary clusters on EC2 for testing. All was working well. Then I started to get the following error…
1
vote
0 answers

spark-ec2 and Tachyon hadoop version disparity

I want to use spark-ec2 to launch an EC2 cluster with Hadoop version 2.x, so I tried: ./spark-ec2 -k spark -i ~/.ssh/spark.pem -s 1 --hadoop-major-version=2 launch my-spark-cluster. I then found that there are errors in the Tachyon setting up…
user3684014