Using IntelliJ IDEA and Maven, I'm trying to read in a CSV file and convert it into a Hive table (writing it out as Parquet would also be fine for now). This is my current code:
import org.apache.spark.sql.SparkSession
import scala.io.Source
import org.apache.spark.sql.types._

object main extends App {
  val spark = SparkSession.builder.master("local").appName("my-spark-app").enableHiveSupport().getOrCreate()

  val lines = Source.fromFile("C://share_VB/file_name.csv").getLines.toArray
  //val myDF = spark.read.csv("C://share_VB/file_name.csv")
  //myDF.write.save("C://Users/my_name/ParquetFiles")

  for (line <- lines) {
    if (!line.isEmpty) {
      val testcase = line.split(",").toBuffer
      println(testcase.head)
      println(testcase(1))
      testcase.remove(0, 2)
      while (testcase.nonEmpty) {
        println(testcase.head)
        println(testcase(1))
        testcase.remove(0, 2)
      }
    }
  }
}
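For context, the two commented-out lines above are the DataFrame route I'd eventually like to take once the session actually starts; roughly something like this (just a sketch, and the header option, output path, and table name are placeholders for my setup):

// sketch only: read the CSV into a DataFrame, then persist it
val myDF = spark.read
  .option("header", "true")              // assuming the CSV has a header row
  .csv("C://share_VB/file_name.csv")

// either dump the DataFrame out as Parquet files...
myDF.write.parquet("C://Users/my_name/ParquetFiles")

// ...or save it as a Hive table (which is why I call enableHiveSupport())
myDF.write.saveAsTable("my_csv_table")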
The pom.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>seeifthisworks</groupId>
    <artifactId>seeifthisworks</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <scala.version>2.11.8</scala.version>
        <scala.compat.version>2.11</scala.compat.version>
        <spark.version>2.2.0.cloudera1</spark.version>
        <config.version>1.3.2</config.version>
        <scalatest.version>3.0.1</scalatest.version>
        <spark-testing-base.version>2.2.0_0.8.0</spark-testing-base.version>
    </properties>

    <!-- set repositories first, so that dependencies use the URL for the repos -->
    <repositories>
        <repository>
            <id>Maven</id>
            <url>http://repo1.maven.org/maven2</url>
        </repository>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </repository>
    </repositories>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.compat.version}</artifactId>
            <version>${spark.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${scala.compat.version}</artifactId>
            <version>${spark.version}</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>
</project>
It runs perfectly if I comment out the val spark = SparkSession... line. However, if I leave it in and try to run anything, I run into this error:
Error: Unable to initialize main class main
Caused by: java.lang.NoClassDefFoundError: org/apache/spark/sql/SparkSession
But it seems fairly clear that I've imported SparkSession, and Maven: org.apache.spark:spark-core_2.11:2.2.0.cloudera1 is listed in my project libraries, so in theory it should work.
Can someone help me pinpoint the problem and explain how to fix it?
EDIT: After removing <scope>provided</scope> from the Spark dependencies, I now encounter a different error:
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
at org.apache.spark.util.Utils$.getCallSite(Utils.scala:1440)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:76)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
at main$.delayedEndpoint$main$1(main.scala:7)
at main$delayedInit$body.apply(main.scala:6)
at scala.Function0.apply$mcV$sp(Function0.scala:34)
at scala.Function0.apply$mcV$sp$(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App.$anonfun$main$1$adapted(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:389)
at scala.App.main(App.scala:76)
at scala.App.main$(App.scala:74)
at main$.main(main.scala:6)
at main.main(main.scala)
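For reference, after that change the Spark dependencies in the pom.xml now look like this (spark-core is declared the same way, just without the provided scope):

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_${scala.compat.version}</artifactId>
    <version>${spark.version}</version>
</dependency>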