I'm using Spark Structured Streaming to process data from a streaming source, with a file sink that writes the results into HDFS.
The problem is that the output files get names like part-00012-8d701427-8289-41d7-9b4d-04c5d882664d-c000.txt, which makes it impossible to tell which files were produced during the last hour.
Is it possible to customize the output file name to something like timestamp_xxx? Or, alternatively, can I write each batch to a different output path?
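To illustrate the second idea, this is roughly what I have in mind (a sketch, not working code): a hypothetical helper that builds an hourly-bucketed path per micro-batch, which I imagine could be plugged into foreachBatch. The helper name, the base path, and the directory layout are all my own invention.

```python
from datetime import datetime, timezone

def batch_output_path(base: str, batch_id: int, ts: datetime) -> str:
    """Hypothetical helper: build an hourly-bucketed directory for one
    micro-batch. Spark would still name the part files itself, but
    grouping each batch under a timestamped directory would make it
    easy to find everything written in the last hour."""
    return f"{base}/{ts.strftime('%Y%m%d_%H')}/batch_{batch_id}"

# Sketch of how I imagine wiring this into the streaming query
# (assumed, not tested; 'df' would be my streaming DataFrame):
#
# query = (df.writeStream
#            .foreachBatch(lambda batch_df, batch_id:
#                batch_df.write.mode("append").text(
#                    batch_output_path("hdfs:///data/out", batch_id,
#                                      datetime.now(timezone.utc))))
#            .option("checkpointLocation", "hdfs:///data/checkpoints")
#            .start())

# Example of the directory layout this would produce:
print(batch_output_path("hdfs:///data/out", 12,
                        datetime(2024, 1, 2, 13, 5, tzinfo=timezone.utc)))
```

Would something along these lines work, or is there a built-in way to control the file names or paths directly?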