
I am working on code where I dynamically define a class at runtime by reading its source from a .scala file, like so:

val src = Source.fromFile("C:\\Users\\acer\\Desktop\\classes\\artport.scala").mkString  // read the file containing the class code
val tb = universe.runtimeMirror(getClass.getClassLoader).mkToolBox()
val clazz = tb.compile(tb.parse(src))().asInstanceOf[Class[_]]  // compile and evaluate to get the Class object
val ctor = clazz.getDeclaredConstructors()(0)

Then I instantiate the class and build a DataFrame from it like this:

val df = rddtoinsert.map {
  case (v) => v.split(",")
}.map { payload => // instance of the dynamic class
  ctor.newInstance(
    payload(0).toDouble: java.lang.Double, payload(1).toDouble: java.lang.Double,
    payload(2).toDouble: java.lang.Double, payload(3).toDouble: java.lang.Double,
    payload(4).toDouble: java.lang.Double, payload(5).toDouble: java.lang.Double,
    payload(6).toDouble: java.lang.Double, payload(7).toDouble: java.lang.Double,
    payload(8).toDouble: java.lang.Double, payload(9).toDouble: java.lang.Double)
}.toDF(typedCols: _*)

When I execute it, I get:

value toDF is not a member of org.apache.spark.rdd.RDD[Any]
[error] possible cause: maybe a semicolon is missing before `value toDF'?
[error]               }).toDF(typedCols: _*)

I found that to resolve this, the class has to be defined outside of the main method. However, I need mine to be defined inside it, because I can't know which class I will be using before my function executes.
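For what it's worth, the `Any` typing can be reproduced without Spark at all: `java.lang.reflect.Constructor#newInstance` is declared to return `Object`, so the element type of the mapped collection carries no useful static type. The `Point` class and sample data below are invented stand-ins for illustration:

```scala
// Hypothetical stand-in for a dynamically compiled class.
case class Point(x: Double, y: Double)

// getDeclaredConstructors()(0) is typed Constructor[_], and its
// newInstance(...) returns Object, not Point.
val ctor = classOf[Point].getDeclaredConstructors()(0)

val instances = Seq("1.0,2.0", "3.0,4.0")
  .map(_.split(","))
  .map(p => ctor.newInstance(p(0).toDouble: java.lang.Double, p(1).toDouble: java.lang.Double))
// `instances` has no static element type beyond Object/Any; Spark's toDF
// implicit needs a concrete element type with an Encoder, which Any lacks.
```

The same inference happens inside the `map` over the RDD, which is why the error message reports `RDD[Any]`.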

Any help would be appreciated, thanks

Avishek Bhattacharya
Ahlam AIS

1 Answer


`toDF` is provided by an implicit conversion. You need to import it:

 import spark.implicits._

Also, your RDD appears to be of type `RDD[Any]`. To call `toDF` you need an `RDD[Row]` together with an explicitly defined schema. See for example this answer:
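A minimal sketch of that approach, building an `RDD[Row]` and passing an explicit schema to `createDataFrame` instead of relying on the `toDF` implicit. The column names and sample data here are invented stand-ins for the question's `typedCols` and `rddtoinsert`, and all columns are assumed to be doubles:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{DoubleType, StructField, StructType}

val spark = SparkSession.builder().master("local[*]").appName("dynamic-df").getOrCreate()

// Stand-ins for the question's values:
val typedCols = Seq("a", "b", "c")
val rddtoinsert = spark.sparkContext.parallelize(Seq("1.0,2.0,3.0", "4.0,5.0,6.0"))

// Describe the columns explicitly, since no case class is known at compile time.
val schema = StructType(typedCols.map(StructField(_, DoubleType, nullable = false)))

// Produce Row objects rather than reflective instances typed as Any.
val rowRdd = rddtoinsert.map(_.split(",")).map(p => Row.fromSeq(p.map(_.toDouble).toSeq))

val df = spark.createDataFrame(rowRdd, schema)
```

Because the schema is built from `typedCols` at runtime, this works even when the concrete class is not known until execution, which was the constraint in the question.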

Assaf Mendelson
  • How can I do this? How come it works if I create an instance of a case class that is declared before the main method? – Ahlam AIS Jan 29 '18 at 17:56
  • See the linked answer. If you use a case class, it is converted implicitly because Spark can derive the schema from it. – Assaf Mendelson Jan 30 '18 at 06:10
  • When I follow the linked answer I still get this error: value toDF is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] – Ahlam AIS Feb 05 '18 at 20:55