I have a Kinesis Firehose delivery stream that delivers data to S3. However, in the data file the JSON objects have no separator between them, so the file looks something like this:
{
"key1" : "value1",
"key2" : "value2"
}{
"key1" : "value1",
"key2" : "value2"
}
In Apache Spark I am reading the data file like this:
df = spark.read.schema(schema).json(path, multiLine=True)
This reads only the first JSON object in the file; the rest are ignored because there is no separator.
How can I resolve this issue in Spark?
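One workaround I have been considering is to split the concatenated objects myself before handing them to Spark, e.g. with `json.JSONDecoder.raw_decode`, which parses one object at a time and reports where it ended. This is just a sketch of the splitting step in plain Python (the `split_concatenated_json` helper name is mine); I assume it could then be applied per file via something like `spark.sparkContext.wholeTextFiles(path)`:

```python
import json

def split_concatenated_json(text):
    """Split a string of back-to-back JSON objects into a list of dicts."""
    decoder = json.JSONDecoder()
    objects = []
    idx = 0
    text = text.strip()
    while idx < len(text):
        # raw_decode parses one JSON value starting at idx and
        # returns (parsed_object, index_just_past_it)
        obj, end = decoder.raw_decode(text, idx)
        objects.append(obj)
        # skip any whitespace (e.g. newlines) between objects
        while end < len(text) and text[end].isspace():
            end += 1
        idx = end
    return objects

# Example with data shaped like the Firehose output above
data = '{"key1": "value1", "key2": "value2"}{"key1": "value1", "key2": "value2"}'
records = split_concatenated_json(data)
```

Would this be a reasonable approach, or is there a way to make `spark.read.json` handle this directly? (I know the cleaner long-term fix is probably to make Firehose append a newline delimiter to each record.)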