I'm creating a Dataset<Person> like so:

    Dataset<Person> personDs = sparkSession.read().json("people.json").as(Encoders.bean(Person.class));
where Person is:

    public class Person {
        private String name;
        private String placeOfBirth;

        // Getters and setters
        ...
    }
If my input data only contains a name ({"name": "bob"}), I get this error:

    org.apache.spark.sql.AnalysisException: cannot resolve 'placeOfBirth' given input columns: [name]
Is there any way to tell Spark that placeOfBirth (or any other field) may be absent from the input and should simply be null?
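For context, the workaround I'm currently experimenting with is to pass the bean's schema to the reader explicitly, so that schema inference is skipped and columns missing from the JSON come back as null. This is only a sketch (it assumes Encoder.schema() returns the StructType I want and that an explicit schema is enough to make the missing field nullable), and I'd like to know whether it, or something else, is the supported approach:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;

public class ReadWithExplicitSchema {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession.builder()
                .master("local[*]")
                .appName("nullable-bean-fields")
                .getOrCreate();

        Encoder<Person> encoder = Encoders.bean(Person.class);

        // Supplying the bean's schema up front means the reader does not
        // infer the schema from the data, so a field absent from the JSON
        // (e.g. placeOfBirth) should become a null column instead of
        // failing to resolve during .as(...).
        Dataset<Person> personDs = sparkSession.read()
                .schema(encoder.schema())
                .json("people.json")
                .as(encoder);

        personDs.show();
    }
}
```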