
I have a DataFrame[SimpleType], where SimpleType is a class with 16 fields. I need to convert it into a DataFrame[ComplexType].

I only have the schema of ComplexType (it has more than 400 fields); there is no case class for this type. I know how the necessary fields map between the two types, but I don't know how to apply that mapping to go from DataFrame[SimpleType] to DataFrame[ComplexType]. The remaining fields should be left as nulls. Does anyone know the most efficient way to do this?

Thanks

edit

class SimpleType{
field1
field2
field3
field4
.
.
.
field16
}

I have a DataFrame that contains this simple type, and I also have the schema of the complex type. I want to convert this DataFrame[SimpleType] into a DataFrame[ComplexType].

Tomasz

1 Answer


It's quite simple:

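import org.apache.spark.sql.functions.{col, lit}
import scala.reflect.runtime.universe._
// Needed for the encoder used by .as[ComplexType]; assumes a SparkSession named `spark`.
import spark.implicits._

// Returns the field names of case class T (its case accessors).
def classAccessors[T: TypeTag]: List[String] =
  typeOf[T].members.collect {
    case m: MethodSymbol if m.isCaseAccessor => m.name.toString
  }.toList

val typeComplexFields = classAccessors[ComplexType]

// Keep columns that already exist in simpleDF; fill every other ComplexType field with null.
val newDataFrame = simpleDF
  .select(typeComplexFields.map { c =>
    if (simpleDF.columns.contains(c)) col(c) else lit(null).as(c)
  }: _*)
  .as[ComplexType]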

Credit also goes to the author of the answer to "Scala. Get field names list from case class"; I copied the field-name function from there, with modifications.

T. Gawęda
  • @Tomasz I've undeleted my answer, please check it again if it was not working previously :) – T. Gawęda Sep 06 '17 at 13:24
  • Thanks for the answer. Right, I don't have a case class for ComplexType, since it has more than 400 fields. I've only got the schema for ComplexType, and the field names are different. – Tomasz Sep 06 '17 at 16:28
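
For the situation described in the last comment (only a schema, no case class, and different field names), the same select-with-nulls idea can probably be driven by the schema itself. The following is only a sketch under assumptions: the complex schema is available as a Spark StructType, and the known field mapping is given as a Map from target field name to source column name. The names SchemaProjection, conformTo and targetToSource are hypothetical, not from the thread.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, lit}
import org.apache.spark.sql.types.StructType

// Hypothetical helper, not part of Spark.
object SchemaProjection {
  // Builds one output column per field of the target schema:
  //  - mapped fields are read from the source column and cast to the target type,
  //  - unmapped fields become nulls of the target type.
  def conformTo(
      df: DataFrame,
      target: StructType,
      targetToSource: Map[String, String]): DataFrame = {
    val columns: Seq[Column] = target.fields.toSeq.map { field =>
      targetToSource.get(field.name) match {
        case Some(source) => col(source).cast(field.dataType).as(field.name)
        case None         => lit(null).cast(field.dataType).as(field.name)
      }
    }
    df.select(columns: _*)
  }
}
```

It would be called as SchemaProjection.conformTo(simpleDF, complexSchema, Map("targetName" -> "field1", ...)). The result is an untyped DataFrame whose column names and types follow the schema; without a case class there is nothing to pass to .as[...]. Since this is a single select, it should add only one projection to the plan, even with 400+ fields.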