I wrote code for a daily batch job in Apache Beam (Python, running on Dataflow). I'm trying to make it run over a date range. Currently it runs fine for yesterday, but running it from a start date to an end date is difficult. Could you suggest a method for that? Here is my code snippet for the yesterday run.
start_date = '20180101'
end_date = '20190101'
p = beam.Pipeline(options=options)
read = (
    p
    | 'BQRead: ' >> BQReader(
        query=test_query.format(date=date))
)
transformed = (
    read
    | 'Transform 1 ' >> beam.ParDo(Transform1())
)
transformed | 'BQWrite' >> BQWriter(table + date, table_schema)
I tried the following, but it's not working:
start_date = datetime.strptime('20190101', "%Y%m%d")
end_date = datetime.strptime('20190110', "%Y%m%d")
dates = list(rrule.rrule(rrule.DAILY, dtstart=start_date, until=end_date))
for date in dates:
    ds_nd = date.strftime('%Y%m%d')
    p = beam.Pipeline(options=options)
    read = (
        p
        | 'BQRead: ' >> BQReader(
            query=test_query.format(date=ds_nd))
    )
    transformed = (
        read
        | 'Transform 1 ' >> beam.ParDo(Transform1())
    )
    transformed | 'BQWrite' >> BQWriter(table + ds_nd, table_schema)
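For reference, the date strings the loop is meant to produce can also be generated with the standard library alone, which makes it easy to check the range separately from the pipeline. This is only a minimal sketch: `daily_date_strings` is a hypothetical helper name (not part of Beam or dateutil), and it assumes both dates are inclusive and use the same `%Y%m%d` format as above.

```python
from datetime import datetime, timedelta


def daily_date_strings(start, end, fmt="%Y%m%d"):
    """Return every date from start to end (both inclusive) as a formatted string.

    Hypothetical helper for illustration; start and end are strings in `fmt`.
    """
    first = datetime.strptime(start, fmt)
    last = datetime.strptime(end, fmt)
    if last < first:
        raise ValueError("end date is before start date")
    days = (last - first).days
    return [(first + timedelta(days=i)).strftime(fmt) for i in range(days + 1)]


if __name__ == "__main__":
    for ds_nd in daily_date_strings("20190101", "20190110"):
        # Each ds_nd is what would be passed to test_query.format(date=ds_nd)
        # and appended to the table name in the loop above.
        print(ds_nd)
```

Each value corresponds to one `ds_nd` in my loop, so the same strings would feed both the query and the table suffix.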