
I want to know the best way to work with Apache Spark using IntelliJ IDEA, especially for the Scala programming language.

Please explain step-by-step if you can.

Thanks for your answers.

Omid Ebrahimi

3 Answers


There is a good tutorial on setting up Spark with Scala in IntelliJ IDEA:
Tutorial Link

Let me know if you face any issues.
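
In case the linked tutorial changes, the heart of most such setups is a minimal sbt build file that declares Spark as a dependency; IntelliJ's Scala plugin can then import the project directly. A sketch, assuming sbt and the IntelliJ Scala plugin (the project name and version numbers are illustrative; pick versions matching your cluster):

```scala
// build.sbt - minimal Spark project definition (names and versions illustrative)
name := "spark-example"

version := "0.1"

// The Scala version must match the one your Spark distribution was built against
scalaVersion := "2.10.5"

// Spark core; add spark-sql, spark-mllib, etc. the same way as needed
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.0"
```

Open the directory containing this file in IntelliJ (import it as an sbt project) and the dependencies are resolved automatically.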

Ajay Gupta
  • Thanks, I tried it. But adding Spark to the dependencies requires downloading all the Spark files for each project, which is very time consuming. Is there a faster and simpler way? This is another tutorial: http://hackers-duffers.logdown.com/posts/245018-configuring-spark-with-intellij . What is your opinion of it? – Omid Ebrahimi Oct 02 '15 at 05:46
  • @OmidEbrahimi: In my view it is much better to download all the files required for each project separately. When deploying the app, you can then be sure of having the correct version of Scala and all the dependencies the app needs, since the production environment can change in the long run. If an app you build today needs to be fixed tomorrow, you don't have to worry about removing the installation and redoing it, and you can keep multiple projects in sync with production. That tutorial looks good too, but my 2 cents: use the default IntelliJ IDEA Scala and Spark setup. – Ajay Gupta Oct 02 '15 at 06:01
  • @Boern Let me know if this works for you. I have created a simple doc for the same. – Ajay Gupta Jan 11 '17 at 03:48
  • Works, very kind. Thank you – Boern Jan 11 '17 at 08:31
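
On the download-time concern raised above, a middle ground (assuming sbt, as in the tutorials): sbt downloads each artifact once into the local Ivy cache (~/.ivy2 by default), so further projects declaring the same coordinates resolve from disk rather than from the network. If you also deploy to a cluster that already ships Spark, marking the dependency as provided keeps it out of your packaged jar:

```scala
// Spark is needed at compile time but supplied by the cluster at run time,
// so "provided" scope keeps it out of the packaged jar (version illustrative)
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.0" % "provided"
```

Note that provided-scope dependencies are typically not on the runtime classpath when you run the application from the IDE, so you may want to drop the scope for local experiments.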

There is likely no free lunch here. I am a Spark contributor in the SQL and MLlib areas and have spent untold hours dealing with IntelliJ and Spark integration. You can google "stackoverflow intellij spark" to get an idea.

Follow imAGin's suggestion to look at some of the tutorials, and use the Stack Overflow questions and answers (I have put in many myself). You WILL need to invest a lot of time to get this working, and unfortunately it is not a one-time affair: Spark changes constantly, including its dependencies and build files, so it is a difficult and moving target.

WestCoastProjects
  • Technically this is not an answer, but +1 because I agree with you! – eliasah Oct 02 '15 at 06:02
  • @javadba: It does not take much time to do the setup; it's all hyped up. Spark is made to work in standalone mode with minimal settings. – Ajay Gupta Oct 02 '15 at 06:04
  • @imAGin Well, you were fortunate .. *this time*. But do not speak about more than you are familiar with. Going back to versions 1.3.0 and earlier, there were intricate steps involved *every time* one made an update, with poor to nonexistent documentation; it was tribal knowledge and trial and error. Even now: did you try to use any non-default settings? Scala 2.11? Different Hadoop versions? A specific version of Hive? When these become important, watch the fun begin. – WestCoastProjects Oct 02 '15 at 06:05
  • @javadba: Yes, been there, done that; I know it's a bit of a pain, and that's how anyone learns this stuff. But for a starter, Spark with IntelliJ IDEA is a breeze. And thanks for the effort you guys have put in, so that we have some good tools to work with. – Ajay Gupta Oct 02 '15 at 06:17
  • @imAGin OK, point taken. Yes, if the OP uses only default settings then it may well work. At this point there do not seem to be any *required* updates to the project and settings just to get the vanilla default setup working. – WestCoastProjects Oct 02 '15 at 06:21
  1. Set up a Scala development environment with IntelliJ. See Scala - Getting Started.

    • JDK is required since Scala is a JVM language
    • sbt is the build tool
    • IntelliJ can be the IDE
  2. Add the Spark dependency to the Scala environment. See Spark - Getting Started.

    • Execute the application using spark-submit

The links provide simple working examples which you can extend to write your own application.
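
To make the shape of such an application concrete, here is a minimal sketch in the spirit of those getting-started examples (the object name, input path, and jar name are illustrative, not taken from the linked guides):

```scala
// SimpleApp.scala - minimal Spark application (sketch)
import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // The master is normally supplied by spark-submit; call .setMaster("local[*]")
    // on the conf instead if you want to run this directly from the IDE
    val conf = new SparkConf().setAppName("SimpleApp")
    val sc = new SparkContext(conf)

    // Count the lines of the input file (first program argument) that mention "spark"
    val lines = sc.textFile(args(0))
    val count = lines.filter(_.contains("spark")).count()
    println(s"Lines containing 'spark': $count")

    sc.stop()
  }
}
```

Build the jar with sbt package and submit it with something like spark-submit --class SimpleApp --master "local[*]" target/scala-2.10/spark-example_2.10-0.1.jar input.txt (the jar path follows from the illustrative build.sbt sketched earlier).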

ap-osd