I am attempting to index a DataFrame with the schema below into Elasticsearch using the elasticsearch-hadoop connector.
|-- ROW_ID: long (nullable = false)
|-- SUBJECT_ID: long (nullable = false)
|-- HADM_ID: long (nullable = true)
|-- CHARTDATE: date (nullable = false)
|-- CATEGORY: string (nullable = false)
|-- DESCRIPTION: string (nullable = false)
|-- CGID: integer (nullable = true)
|-- ISERROR: integer (nullable = true)
|-- TEXT: string (nullable = true)
When writing this DataFrame to Elasticsearch, the "CHARTDATE" field is being written as a long. According to the documentation for the connector I am using (linked below), DateType
fields in Spark should be written as string-formatted dates in Elasticsearch. Since I was hoping to build some visualizations in Kibana that leverage the date fields, having them written as longs is proving problematic.
https://www.elastic.co/guide/en/elasticsearch/hadoop/6.4/spark.html
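I realize I could work around this by formatting the date as a string myself before writing. A rough sketch of what I mean, using Spark's date_format on the CHARTDATE column from the schema above (the "yyyy-MM-dd" pattern is just an example):

import org.apache.spark.sql.functions.{col, date_format}

// Possible workaround: serialize CHARTDATE as an ISO-8601 string so
// Elasticsearch's date detection can treat it as a date field.
val stringDateDF = castedDF.withColumn("CHARTDATE", date_format(col("CHARTDATE"), "yyyy-MM-dd"))

That seems to defeat the purpose of es.mapping.date.rich, though, so I would prefer the connector handle DateType fields natively as the documentation describes.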
Code used to produce the issue:
import org.elasticsearch.spark.sql._

val elasticOptions = Map(
  "es.nodes" -> esIP,
  "es.port" -> esPort,
  "es.mapping.id" -> primaryKey,
  "es.index.auto.create" -> "yes",
  "es.nodes.wan.only" -> "true",
  "es.write.operation" -> "upsert",
  "es.net.http.auth.user" -> esUser,
  "es.net.http.auth.pass" -> esPassword,
  "es.spark.dataframe.write.null" -> "true",
  "es.mapping.date.rich" -> "true"
)

castedDF.saveToEs(index, elasticOptions)
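For context, castedDF is just the source DataFrame with its columns cast to the schema shown above; the CHARTDATE cast looks roughly like this (rawDF is a placeholder name for the DataFrame as loaded, and the date pattern is an assumption about my source data):

import org.apache.spark.sql.functions.{col, to_date}

// Cast the raw string column to a Spark DateType so it matches the schema above.
val castedDF = rawDF.withColumn("CHARTDATE", to_date(col("CHARTDATE"), "yyyy-MM-dd"))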
Is there a step I am missing to have these values written as ES dates?