I am trying to read data from a CSV file using Scala and Spark, but the column values come back as null.
I read the file with an explicit schema so that the data is easy to query:

    import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

    private val myData = sparkSession.read.schema(createDataSchema).csv("data/myData.csv")

    // Builds the schema that is passed to the CSV reader above.
    def createDataSchema: StructType = {
      StructType(
        Array(
          StructField("data_index", StringType, nullable = false),
          StructField("property_a", IntegerType, nullable = false),
          StructField("property_b", IntegerType, nullable = false)
          // some other columns
        )
      )
    }

Querying data:

    // The $"..." column syntax needs sparkSession.implicits._ in scope.
    val myProperty = myData.select($"property_b")
    myProperty.collect()

I expect collect() to return the actual values of the column, but every returned row contains null instead. Why?
When I print the schema, nullable is reported as true instead of false, even though I set it to false.
I am using Scala 2.12.9 and Spark 2.4.3.
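
One way to narrow this down is sketched below. It is a guess at a reproduction, not a confirmed diagnosis: the object name `CsvNullDiagnosis`, the temporary sample file, and the assumption that the real file starts with a header row are all hypothetical. The sketch reads the same file three ways using the `header` and `mode` options of the CSV reader, so that rows which do not fit the schema can be told apart from rows that do.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object CsvNullDiagnosis {
  // Same shape as the schema in the question, limited to the three named columns.
  val schema: StructType = StructType(Array(
    StructField("data_index", StringType, nullable = false),
    StructField("property_a", IntegerType, nullable = false),
    StructField("property_b", IntegerType, nullable = false)
  ))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("csv-null-diagnosis")
      .getOrCreate()

    // Hypothetical sample file: one header row followed by one data row.
    val path = Files.createTempFile("myData", ".csv")
    Files.write(
      path,
      "data_index,property_a,property_b\nx1,1,2\n".getBytes(StandardCharsets.UTF_8)
    )

    // 1) Default settings (PERMISSIVE mode, no header handling): rows that cannot
    //    be parsed into the schema types come back with null fields.
    val permissive = spark.read.schema(schema).csv(path.toString)
    permissive.printSchema() // shows how the reader reports nullability
    permissive.show()

    // 2) Treat the first line as a header instead of as data.
    val withHeader = spark.read
      .option("header", "true")
      .schema(schema)
      .csv(path.toString)
    withHeader.show()

    // 3) FAILFAST raises an exception on a malformed record instead of
    //    silently producing nulls, which points at the offending input.
    val strict = spark.read
      .option("header", "true")
      .option("mode", "FAILFAST")
      .schema(schema)
      .csv(path.toString)
    strict.show()

    spark.stop()
  }
}
```

Pointing the third read at data/myData.csv instead of the sample file should show whether the nulls come from records that do not match the declared types or delimiter. If it throws, the exception message names the record that failed to parse.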