My paths have the format s3://my-bucket/timestamp=yyyy-mm-dd HH:MM:SS/, e.g. s3://my-bucket/timestamp=2021-12-12 12:19:27/. The MM:SS part is not predictable, and I want to read the data for a given hour. I tried the following:
df = spark.read.parquet("s3://my-bucket/timestamp=2021-12-12 12:*:*/")
df = spark.read.parquet("s3://my-bucket/timestamp=2021-12-12 12:[00,01-59]:[00,01-59]/")
but both give the error pyspark.sql.utils.IllegalArgumentException: java.net.URISyntaxException.
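To make the intended selection concrete, here is a minimal plain-Python sketch of the filter I am after. It assumes the partition directory names have already been listed by some other means; the `paths` argument and the helper name `paths_for_hour` are hypothetical and not part of any Spark API.

```python
from datetime import datetime

PREFIX = "timestamp="
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def paths_for_hour(paths, hour):
    """Return the paths whose timestamp= partition falls within `hour`.

    `hour` is a string such as "2021-12-12 12" (hypothetical input format).
    """
    wanted = datetime.strptime(hour, "%Y-%m-%d %H")
    selected = []
    for path in paths:
        # Last path component, e.g. "timestamp=2021-12-12 12:19:27"
        name = path.rstrip("/").rsplit("/", 1)[-1]
        if not name.startswith(PREFIX):
            continue
        try:
            ts = datetime.strptime(name[len(PREFIX):], TS_FORMAT)
        except ValueError:
            # Skip directories that do not match the expected format
            continue
        # Keep the path if it belongs to the requested hour
        if ts.replace(minute=0, second=0) == wanted:
            selected.append(path)
    return selected
```

The resulting list could then be passed as explicit paths, e.g. spark.read.parquet(*selected), but I have not verified whether explicit paths containing colons avoid the URISyntaxException.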