I am writing Spark applications in Scala using IntelliJ IDEA, with Maven as the build tool.

I deploy them to an Azure HDInsight cluster, and I have the Azure plugin for IntelliJ installed for that.

I use Event Hubs to stream data and perform some transformations before writing the results to storage.

I am pretty new to Spark, Scala, IntelliJ and Event Hubs.

I debug the programs in 2 different ways:

  1. Build the jar (using mvn clean and mvn package) and use spark-submit to submit the application to the Spark cluster.
  2. Click the small play button to the left of the object that has the main function to execute the code.

green-play-button

I have a fair idea of what Maven does: I think it downloads the dependencies listed in pom.xml to a local location, the user's .m2 folder. These jars are then referenced during mvn package, which compiles the code against them and builds the application jar.

I would like to understand how dependencies are resolved in IntelliJ IDEA when running with the second method.

  1. I am able to run mvn clean and mvn package. Maven cleaned, ran the test cases and built the jar. However, the IDE marked some method calls in red (not found). When I ctrl+click, I land in an EventData class decompiled from bytecode, and the method is indeed missing there. Yet when I check the jar listed under External Libraries in the project pane, the method exists in that jar. The jar that lacks the method was probably in some other folder, such as .ivy.

not-found-red

  2. I am able to run mvn clean and mvn package. The IDE does not show any red marks for an unavailable value, but when I run the code with the green play button, it fails with an error saying the value was not found. I can even ctrl+click, navigate to the class and see that the value exists.

not-found-not-red

Both errors are related to Event Hubs. One suggestion I found was that the referenced jar might differ from the required version, and that I should match the Event Hubs version to my Spark version. I tried that as well, with the same results as above: it passes in Maven and fails in the IDE.

<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-eventhubs-spark_2.11</artifactId>
    <version>${spark.version}</version>
</dependency>

I think Maven uses my .m2 folder and the jars inside it to build the project, while IntelliJ uses something else (maybe Ivy) to resolve dependencies in its development environment. Can anyone help me understand and solve this?

  1. Is there a way to find out which version of a jar IntelliJ uses, and to tell it which specific version to use, apart from declaring it in pom.xml?
  2. Is there a way to tell IntelliJ to use the jars collected by Maven, so that mvn package and the IDE resolve dependencies using the same jars?

1 Answer

Does an IntelliJ IDEA Maven project use the jars downloaded by Maven to resolve dependencies?

Yes, it does.

Is there a way to know and tell IntelliJ which specific version of jar to use apart from mentioning in pom.xml?
Is there a way to tell IntelliJ to use maven collected jars so that mvn package and IDE environment resolve dependencies using same jar?

If you open a Maven project from existing sources and select Maven as the build tool, IntelliJ picks up the dependencies automatically.

There are several ways you can import a Maven project in IntelliJ IDEA. The most standard approach is to open the pom.xml file directly. You can do it in the welcome screen by clicking Open.

By doing this, all dependencies are imported, and IntelliJ also puts the jars on the classpath.

Note: when you open the project for the first time, give IntelliJ some time to finish indexing.

You can try mvn idea:idea as well, but I think that plugin is retired. The command fetches the Maven IDEA plugin and generates the IntelliJ project file (.ipr), module file (.iml) and workspace file (.iws).

Finally, if nothing works, do as described in my answer.

Update :

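mvn dependency:analyze reports which dependencies are used but undeclared and which are declared but unused (mvn dependency:list prints the complete resolved list).

mvn dependency:tree displays all the direct and transitive dependencies as a tree.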

see Resolving conflicts using the dependency tree
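
As a rough sketch of what such a fix can look like in pom.xml: if the tree shows that some other dependency drags in an older copy of the jar that contains EventData, that copy can be excluded at the point where it enters. The coordinates some.group:library-with-old-transitive and the property library.version below are hypothetical placeholders, and com.microsoft.azure:azure-eventhubs is only an assumption about which artifact carries the class; substitute whatever mvn dependency:tree actually reports for your project.

```xml
<!-- Hypothetical dependency that pulls in the older jar transitively -->
<dependency>
    <groupId>some.group</groupId>
    <artifactId>library-with-old-transitive</artifactId>
    <version>${library.version}</version>
    <exclusions>
        <!-- Assumed coordinates of the conflicting artifact;
             confirm them with mvn dependency:tree before excluding -->
        <exclusion>
            <groupId>com.microsoft.azure</groupId>
            <artifactId>azure-eventhubs</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```

An exclusion carries no version: it removes the artifact from that one path, so the version reached through the remaining paths is the one that ends up on the classpath. After editing pom.xml, reimport the Maven project in IntelliJ so that External Libraries is rebuilt from the changed pom.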

Ram Ghadiyaram
  • Yes, that was helpful and I am able to import all dependencies. My question, however, is that even after I import all the dependencies declared in Maven, I still get the error. This is probably because IntelliJ is referencing a different version of the jar than the one I specified in pom.xml. I am not able to figure out why. – Nagendra Ghimire Jul 22 '19 at 16:34
  • Which jar file is that? Whatever you declare in pom.xml, IntelliJ should refer to the same. – Ram Ghadiyaram Jul 22 '19 at 16:38
  • There might be transitive (indirect) dependencies: for example, jar1 depends on jar 1.1, but jar 1.2 is also declared in pom.xml. If that is the case, what you are describing is possible. In the normal case, IntelliJ should take all the jars from pom.xml into its classpath. – Ram Ghadiyaram Jul 22 '19 at 16:39
  • The jar file is azure-eventhubs-spark_2.11. When I look at External Libraries in the project pane, the version matches and the method exists. However, when I ctrl+click on the class, it takes me to a decompiled older version of the jar where the method does not exist. – Nagendra Ghimire Jul 22 '19 at 16:42
  • Can you [exclude the older version of the transitive jar](https://maven.apache.org/guides/introduction/introduction-to-optional-and-excludes-dependencies.html) and check whether the latest version you declared is picked up? – Ram Ghadiyaram Jul 22 '19 at 17:07
  • mvn dependency:analyze reports dependencies that are used but undeclared or declared but unused; mvn dependency:tree displays all the direct and transitive dependencies as a tree. – Ram Ghadiyaram Jul 22 '19 at 17:24