
I'm trying to run a Spark job on a DataStax cluster. The jar file was compiled and built with sbt, but the job fails with this error:

```
ERROR 2015-11-02 16:34:36 org.apache.spark.streaming.scheduler.JobScheduler: Error running job streaming job 1446482076000 ms.0
scala.reflect.internal.MissingRequirementError: object analytics.lib.database.package not found.
at scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:16) ~[scala-reflect-2.10.5.jar:na]
at scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:17) ~[scala-reflect-2.10.5.jar:na]
at scala.reflect.internal.Mirrors$RootsBase.ensureModuleSymbol(Mirrors.scala:126) ~[scala-reflect-2.10.5.jar:na]
at scala.reflect.internal.Mirrors$RootsBase.staticModule(Mirrors.scala:161) ~[scala-reflect-2.10.5.jar:na]
at scala.reflect.internal.Mirrors$RootsBase.staticModule(Mirrors.scala:21) ~[scala-reflect-2.10.5.jar:na]
at analytics.app.AbstractIncomingBuzzes$$anonfun$2$$typecreator3$1.apply(IncomingBuzzes.scala:96) ~[analytics-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231) ~[scala-reflect-2.10.5.jar:na]
at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231) ~[scala-reflect-2.10.5.jar:na]
at com.datastax.spark.connector.mapper.ColumnMapper$$typecreator1$1.apply(ColumnMapper.scala:54) ~[spark-cassandra-connector_2.10-1.4.0.jar:1.4.0]
at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231) ~[scala-reflect-2.10.5.jar:na]
at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231) ~[scala-reflect-2.10.5.jar:na]
at com.datastax.spark.connector.mapper.TupleColumnMapper.<init>(TupleColumnMapper.scala:12) ~[spark-cassandra-connector_2.10-1.4.0.jar:1.4.0]
at com.datastax.spark.connector.mapper.ColumnMapper$.tuple1ColumnMapper(ColumnMapper.scala:54) ~[spark-cassandra-connector_2.10-1.4.0.jar:1.4.0]
at analytics.app.AbstractIncomingBuzzes$$anonfun$2.analytics$app$AbstractIncomingBuzzes$$anonfun$$eachRdd$1(IncomingBuzzes.scala:96) ~[analytics-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at analytics.app.AbstractIncomingBuzzes$$anonfun$2$$anonfun$apply$mcV$sp$1.apply(IncomingBuzzes.scala:80) ~[analytics-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at analytics.app.AbstractIncomingBuzzes$$anonfun$2$$anonfun$apply$mcV$sp$1.apply(IncomingBuzzes.scala:80) ~[analytics-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:631) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:631) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:42) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:40) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:40) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:399) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:40) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at scala.util.Try$.apply(Try.scala:161) ~[scala-library-2.10.5.jar:na]
at org.apache.spark.streaming.scheduler.Job.run(Job.scala:34) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:193) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:193) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:193) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57) ~[scala-library-2.10.5.jar:na]
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:192) ~[spark-streaming_2.10-1.4.1.1.jar:1.4.1.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_80]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_80]
```

Any ideas what is wrong here? The fixes suggested in MissingRequirementError with spark and why-does-sbt-build-fail-with-missingrequirementerror did not help.

  • Do you get this error building with SBT or do you get it on spark submit? Can you post some code -- a minimum verifiable example of the issue? – phact Nov 04 '15 at 16:15
  • @phact SBT works fine; I get the error when I run spark-submit on the generated jar. There is a custom package `analytics.lib.database.package` that holds the table definitions. The class at `IncomingBuzzes.scala:96` imports this package, and spark-submit fails here: `val withEventInfo = databaseOps.withCassandraJoinType( input keyBy (_.eventId), Keyspace.api, Table.Api.event, "event_id" )((_, buzz, eventInfo: EventAnalyticsStats) => (buzz, eventInfo)).cache`. `Keyspace.api` and `Table.Api.event` come from `analytics.lib.database.package` (see the consolidated sketch below these comments). – ivan partsianka Nov 04 '15 at 22:48
  • Line 96 is `)((_, buzz, eventInfo: EventAnalyticsStats) =>` – ivan partsianka Nov 04 '15 at 22:50
  • ```def withCassandraJoinType[ K : ClassTag, T : ClassTag, U <: Serializable : ClassTag, R : ClassTag]( input: RDD[(K, T)], keyspace: String, table: String, joinOn: ColumnRef)( f: (K, T, U) => R)( implicit rwf: RowWriterFactory[Tuple1[K]], rrf: RowReaderFactory[U]) : RDD[R] ``` – ivan partsianka Nov 04 '15 at 22:53
  • Did you define EventAnalyticsStats? – phact Nov 05 '15 at 04:32
  • Sure. It is a class whose companion object uses `com.datastax.spark.connector.SomeColumns`: `object EventAnalyticsStats { val columnSelector = SomeColumns(... ) }`. Otherwise it would not compile with SBT. – ivan partsianka Nov 05 '15 at 16:34
  • Were you able to solve the problem ? Still facing the issue with Spark - 1.6.2 and datastax 1.6.0-M1 – Aastha Jan 20 '17 at 06:13
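
A minimal sketch that puts the pieces from the comments together might look like the following. Only the names quoted above (`withCassandraJoinType`, `Keyspace.api`, `Table.Api.event`, `EventAnalyticsStats`, `columnSelector`, and the call site) come from the question; the `Buzz` type, the concrete column names, and the method body are assumptions added for illustration.

```
// Minimal sketch assembled from the comments above; names not quoted in the
// question (Buzz, the column names, the method body) are assumptions.
import scala.reflect.ClassTag

import org.apache.spark.rdd.RDD
import com.datastax.spark.connector._
import com.datastax.spark.connector.rdd.reader.RowReaderFactory
import com.datastax.spark.connector.writer.RowWriterFactory

// Stand-ins for the objects defined in analytics.lib.database.package (assumed shape).
object Keyspace { val api = "api" }
object Table { object Api { val event = "event" } }

// Assumed element type of the input RDD, keyed by eventId below.
case class Buzz(eventId: String, text: String)

// Row class read back from Cassandra; the concrete columns are assumptions.
case class EventAnalyticsStats(eventId: String, buzzCount: Long)
object EventAnalyticsStats {
  val columnSelector = SomeColumns("event_id", "buzz_count")
}

object DatabaseOps {
  // Signature exactly as posted in the comments; the body is omitted here.
  def withCassandraJoinType[K: ClassTag, T: ClassTag, U <: Serializable: ClassTag, R: ClassTag](
      input: RDD[(K, T)], keyspace: String, table: String, joinOn: ColumnRef)(
      f: (K, T, U) => R)(
      implicit rwf: RowWriterFactory[Tuple1[K]], rrf: RowReaderFactory[U]): RDD[R] = ???

  // Call site corresponding to IncomingBuzzes.scala:96 as quoted in the comments.
  def joinWithEventInfo(input: RDD[Buzz]): RDD[(Buzz, EventAnalyticsStats)] =
    withCassandraJoinType(
      input keyBy (_.eventId), Keyspace.api, Table.Api.event, "event_id"
    )((_, buzz, eventInfo: EventAnalyticsStats) => (buzz, eventInfo)).cache
}
```

Note that the stack trace above shows the error being thrown while `ColumnMapper.tuple1ColumnMapper` evaluates a `TypeTag` at runtime, which is where `analytics.lib.database.package` gets looked up via reflection on the cluster.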

0 Answers