
I am developing a Java POJO class that contains a java.time.LocalDate member variable named date.

import java.io.Serializable;
import java.time.LocalDate;

import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

import com.fasterxml.jackson.annotation.JsonFormat;
import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
import com.fasterxml.jackson.datatype.jsr310.deser.LocalDateDeserializer;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data
@AllArgsConstructor
@NoArgsConstructor
public class EntityMySQL implements Serializable {
    
    @JsonFormat(pattern="yyyy-MM-dd")
    @JsonDeserialize(using = LocalDateDeserializer.class)
    private LocalDate date;
    
    private float value;
    
    private String id;
    
    private String title;

    private static StructType structType = DataTypes.createStructType(new StructField[] {

              DataTypes.createStructField("date", DataTypes.DateType, false),  // this line throws Exception
              DataTypes.createStructField("value", DataTypes.FloatType, false),
              DataTypes.createStructField("id", DataTypes.StringType, false),
              DataTypes.createStructField("title", DataTypes.StringType, false)
    });
}

As you can see, the type of the date member variable is java.time.LocalDate, but in the static structType variable I set the type of the date field to DataTypes.DateType. When I bind the POJO class to a Spark data frame, it throws the error below:

Caused by: java.lang.RuntimeException: java.time.LocalDate is not a valid external type for schema of date
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.StaticInvoke_0$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_0$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
    at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Serializer.apply(ExpressionEncoder.scala:210)

When I change the date member variable's type to java.util.Date, DataTypes.DateType is the correct configuration and there are no errors. But when using java.time.LocalDate, the code does not work and throws the exception above. If I have to create a custom date type, kindly inform me how. Any ideas?

Joseph Hwang

1 Answer


java.time.LocalDate is not supported as an external type for DateType in Spark (at least up to Spark 2.x); even if you try to write an Encoder for it, it will not work.

I advise you to convert the java.time.LocalDate to another supported type, such as java.sql.Timestamp, java.sql.Date, an epoch value, or a date-time string.
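For example, a minimal sketch of the conversion using the standard java.sql.Date.valueOf(LocalDate) bridge method (available since Java 8), which produces the external type Spark's DataTypes.DateType expects:

```java
import java.time.LocalDate;

public class LocalDateConversion {

    // Convert java.time.LocalDate to java.sql.Date, which Spark
    // accepts as the external type for DataTypes.DateType.
    public static java.sql.Date toSqlDate(LocalDate localDate) {
        return java.sql.Date.valueOf(localDate);
    }

    public static void main(String[] args) {
        LocalDate ld = LocalDate.of(2021, 3, 15);
        java.sql.Date sqlDate = toSqlDate(ld);
        System.out.println(sqlDate); // prints 2021-03-15
    }
}
```

In your POJO you could keep the java.sql.Date field for Spark and do this conversion in the setter (or wherever the Jackson-deserialized LocalDate arrives), so the schema and the bean type line up.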

itIsNaz