I am trying to use SANSA-RDF to read Turtle RDF files into Spark and build a graph from them. The following code fails to compile. What am I missing?

    import org.apache.jena.query.QueryFactory
    import org.apache.jena.riot.Lang
    import org.apache.spark.sql.SparkSession
    import net.sansa_stack.rdf.spark.io.rdf._
    import net.sansa_stack.rdf.spark.io._
    import scala.io.Source

    object SparkExecutor {
      private var ss:SparkSession = null

      def ConfigureSpark(): Unit ={

        ss = SparkSession.builder
          .master("local[*]")
          .config("spark.driver.cores", 1)
          .appName("LAM")
          .getOrCreate()

      }

      def createGraph(): Unit ={
        val filename = "xyz.ttl"
        print("Loading graph from file"+ filename)
        val lang = Lang.TTL
        val triples = ss.rdf(lang)(filename)
        val graph = LoadGraph(triples)    
      }
    }

I call `SparkExecutor` from the main function as follows:

    object main {
      def main(args: Array[String]): Unit = {
        SparkExecutor.ConfigureSpark()
        val RDFGraph = SparkExecutor.createGraph()
      }
    }

This results in the following error:

    Error: value rdf is not a member of org.apache.spark.sql.SparkSession
    val triples = ss.rdf(lang)

1 Answer

There is an implicit conversion involved, as you can see in the SANSA-RDF source code at

sansa-rdf-spark/src/main/scala/net/sansa_stack/rdf/spark/io/package.scala:159

`rdf(lang)` is not a method of `SparkSession` itself but of the implicit class `RDFReader`, so you need to import the package in which that implicit definition is available. Please try adding

    import net.sansa_stack.rdf.spark.io._

and let us know the result.
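
To illustrate the mechanism, here is a minimal, self-contained sketch of how an implicit class adds a method to a type it does not own. It does not use SANSA or Spark: `Session` and `DemoIo` are hypothetical stand-ins for `SparkSession` and the SANSA `io` package, and this `RDFReader` only mimics the shape of the real one.

```scala
// Hypothetical stand-in for a library class we cannot modify
// (plays the role of SparkSession in this sketch).
final class Session(val name: String)

// Hypothetical stand-in for a library's io package.
object DemoIo {
  // Wrapping a Session makes `rdf` callable on it, but only in code
  // where this implicit class is in scope.
  implicit class RDFReader(val session: Session) {
    def rdf(lang: String)(path: String): String =
      s"${session.name}: reading $path as $lang"
  }
}

object ImplicitDemo {
  // Without this import the call below fails to compile with
  // "value rdf is not a member of Session".
  import DemoIo._

  def describe(session: Session): String =
    // The compiler rewrites this to new RDFReader(session).rdf("TTL")("xyz.ttl").
    session.rdf("TTL")("xyz.ttl")

  def main(args: Array[String]): Unit =
    println(describe(new Session("demo")))
}
```

If the `import DemoIo._` line is removed, the compiler reports the same kind of error as in the question, which is why the import has to match the package that actually contains the implicit class in the library version on your classpath.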

  • That's correct, a Scala package object was moved. This change was made in the SNAPSHOT version; the documentation will be updated in the next release. – UninformedUser Apr 01 '18 at 09:11
  • @shuvomiah, thank you for the suggestion. I have added the import as you suggested, but I am still getting the same error. I am working with IntelliJ, and it does not seem to recognise the import (it is shown in gray). I have updated the code in the question. – basicknowledge Apr 01 '18 at 09:32
  • Actually, I pulled the latest code from GitHub. Can you please search for the rdf method from RDFReader and see in which package the class is declared? It's definitely not an IntelliJ issue. Let us know. – shuvomiah Apr 01 '18 at 15:03
  • @shuvomiah, thank you for the response. The rdf method from RDFReader is defined in the package you mentioned **net.sansa_stack.rdf.spark.io** – basicknowledge Apr 01 '18 at 16:33
  • I have now pulled the code from GitHub and built a local jar. I included this in the libraries and now my code works fine. I am sure it's a bad way of solving the problem. – basicknowledge Apr 01 '18 at 17:13
  • @sukanyanath One more time: don't use a SNAPSHOT dependency of this API, use the latest stable release. Or what's the reason for using the SNAPSHOT? – UninformedUser Apr 01 '18 at 17:59
  • @AKSW, Thank you for the response. I am not sure I understand you correctly. I am using the version **SANSA RDF Spark Bundle 0.3.0**. Do you suggest another version? – basicknowledge Apr 01 '18 at 19:22
  • @sukanyanath Great to know that. Can you please accept that as an answer? – shuvomiah Apr 02 '18 at 06:22
  • @sukanyanath In version 0.3.0 the package object is still `net.sansa_stack.rdf.spark.io.rdf`, so it works as expected. I don't know how you are using the SANSA RDF API, or which version. – UninformedUser Apr 02 '18 at 11:06