Is it possible to generate an array of dates formatted like the following? datearray=["20190901","20190902",...,"20190930"]
I want to input a date range and have the array generated automatically,
using Databricks Python.
I noticed that Sreeram's answer uses pandas, which does not take advantage of Databricks capabilities.
So I am suggesting a more Databricks-native way of doing this:
spark.sql("SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month) AS Date").show()
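This returns a single row whose Date column holds an array like [2018-01-01, 2018-02-01, 2018-03-01]. For daily steps, as in the question, use interval 1 day instead of interval 1 month.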
You can then convert it using
from pyspark.sql.functions import to_date
You can use pandas for this task, like this:
import pandas as pd
start = '20190101'
end = '20190501'
# One entry per day, both endpoints included, formatted as YYYYMMDD strings
pd.date_range(start=start, end=end, freq='D').strftime('%Y%m%d').tolist()
If you would rather specify a number of days than an end date, you can do this:
start = '20190101'
days = 100
# Both endpoints are included, so this yields days + 1 dates
[str(x).replace('-', '').split()[0] for x in pd.date_range(start=pd.Timestamp(start), end=pd.Timestamp(start) + pd.Timedelta(days=days), freq='1D')]
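If you need neither pandas nor Spark, the same list can be built with the standard library alone. This is a minimal sketch, assuming the range should include both endpoints; date_strings is a hypothetical helper name, not something from either answer above.

```python
from datetime import datetime, timedelta

def date_strings(start, end, fmt="%Y%m%d"):
    """Return every date from start to end (inclusive) as a string in fmt.

    Hypothetical helper: start and end are strings in the same format
    as the output, e.g. "20190901".
    """
    first = datetime.strptime(start, fmt).date()
    last = datetime.strptime(end, fmt).date()
    n_days = (last - first).days
    # range(n_days + 1) includes the end date; an end before start gives []
    return [(first + timedelta(days=i)).strftime(fmt) for i in range(n_days + 1)]

datearray = date_strings("20190901", "20190930")
```

Here datearray starts at "20190901" and ends at "20190930", matching the format in the question. This runs on the driver only, so it is best suited to small ranges.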