I have IntelliJ IDEA installed on my laptop. I am trying to do some big data Spark POCs written in Scala. My requirement is that the Spark-Scala code written in IntelliJ should run on the Spark cluster when I click Run. My Spark cluster resides in the Windows Azure cloud. How can I achieve this?
- The usual method is to generate a fat jar using the sbt-assembly plugin, copy the jar to the Spark cluster's master node, ssh to the node, and run it with spark-submit. You can definitely do the first step within IntelliJ - https://stackoverflow.com/questions/28459333/how-to-build-an-uber-jar-fat-jar-using-sbt-within-intellij-idea. Do you need the other steps to happen within IntelliJ as well? – xan Mar 15 '18 at 06:45
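For the fat-jar step, here is a minimal build.sbt sketch of the setup the comment above describes; the project name, Scala version, and Spark version are placeholders for your own, and it assumes the sbt-assembly plugin is enabled in project/plugins.sbt:

```scala
// build.sbt - minimal fat-jar setup (a sketch; names and versions are
// placeholders). Assumes project/plugins.sbt contains:
//   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.6")
name := "spark-poc"
version := "0.1"
scalaVersion := "2.11.12"

// Mark Spark as "provided" so it is not bundled into the fat jar;
// the cluster supplies it at runtime.
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0" % "provided"

// Resolve duplicate META-INF entries when merging dependency jars.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", _*) => MergeStrategy.discard
  case _                        => MergeStrategy.first
}
```

Running `sbt assembly` then produces something like target/scala-2.11/spark-poc-assembly-0.1.jar, which is the file you copy to the master node and pass to spark-submit.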
2 Answers
One way is to create a script that runs the generated jar file, and run that script.
Another way is to use the Azure Toolkit plugin.
You can use the Azure Toolkit for IntelliJ plugin to submit, run, and debug Spark applications.
Search for the plugin in IntelliJ's plugin settings and install it.
To submit and run the application, you can follow the documentation here:
https://azure.microsoft.com/en-us/blog/hdinsight-tool-for-intellij-is-ga/
Here is an example: https://learn.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-intellij-tool-plugin
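Whichever route you pick, what gets submitted is an ordinary Spark application. A minimal sketch of such a job (the object name and the wasb:// input path are placeholders for your own code and storage container):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal word-count job as a submission target (a sketch; the input
// path below is a placeholder for a file in your cluster's storage).
object WordCount {
  def main(args: Array[String]): Unit = {
    // No setMaster here: the master is supplied by spark-submit
    // or by the Azure Toolkit plugin at submission time.
    val conf = new SparkConf().setAppName("WordCount")
    val sc = new SparkContext(conf)

    val counts = sc.textFile("wasb:///example/data/input.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.take(10).foreach(println)
    sc.stop()
  }
}
```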
Hope this helps!

- Thank you Shankar... I am going through the links and it seems like this is what I was looking for. – TomG Mar 15 '18 at 11:48
- Glad that helped. I also work with Azure and Spark, but I don't use the plugins; I just use scripts to run the jobs. – koiralo Mar 15 '18 at 11:50
- I was trying to link the HDInsight cluster but got the error "Authentication Error: Socket operation on nonsocket: connect". Any idea why this happens? – TomG Mar 15 '18 at 15:51
- I am not sure about that error. The script simply writes out all the steps that we do manually when submitting the jar to the Azure cluster. – koiralo Mar 15 '18 at 15:55
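For reference, such a script can even live in Scala itself via scala.sys.process; everything below (host, user, jar path, and main class) is a placeholder for your own setup:

```scala
import scala.sys.process._

// Sketch of automating the manual submit steps; host, user, paths, and
// the main class are placeholders, and sbt/scp/ssh must be on the PATH.
object SubmitJob extends App {
  // 1. Build the fat jar (see the sbt-assembly comment on the question).
  "sbt assembly".!

  // 2. Copy the jar to the cluster's master node.
  "scp target/scala-2.11/spark-poc-assembly-0.1.jar user@master-node:/tmp/".!

  // 3. Launch it with spark-submit on the master node.
  Seq("ssh", "user@master-node",
    "spark-submit --class WordCount /tmp/spark-poc-assembly-0.1.jar").!
}
```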
Step 1: Before starting the process you have to download the Hadoop bin (winutils):
https://github.com/steveloughran/winutils/tree/master/hadoop-2.6.0/bin
and set HADOOP_HOME in your environment variables, for example C:\Hadoop\hadoop.
Step 2: Download the desired Spark version and
add the path C:\Hadoop\spark-1.6.0-bin-hadoop2.6\bin to your environment variables.
Step 3: Open cmd, go to the Spark bin folder (C:\Hadoop\spark-1.6.0-bin-hadoop2.6\bin) and run the following command: spark-class org.apache.spark.deploy.master.Master. It will print the Spark master URL, for example spark://localhost:7077.
Step 4: Open another cmd, go to the same bin folder and run: spark-class org.apache.spark.deploy.worker.Worker SparkMasterIp (the master URL from step 3).
Step 5: To check whether it is working, test with the following command: C:\Hadoop\spark-1.6.0-bin-hadoop2.6\bin\spark-shell --master spark://localhost:7077
Now you can build your jar and submit it with spark-submit from cmd.
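To confirm the local cluster is actually doing work before submitting your own jar, you can paste a small job into the spark-shell from step 5 (just a sketch; sc is the SparkContext the shell predefines):

```scala
// Run inside spark-shell: a parallelized sum that exercises
// the worker started in step 4.
val rdd = sc.parallelize(1 to 1000)
println(rdd.sum())  // expected output: 500500.0
```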
