[Screenshots: Timestamp column before, and after applying @pissall's code]
I have a Timestamp column sampled at 0.5 Hz, which results in millions of rows. I would like to reduce the data size by keeping the timestamps at an hourly resolution, i.e. 24 observations per day. I have already reduced the data by filtering on year, month and day, but it is still very large, so I now want to reduce it to an hourly basis.
I am working on Databricks and using PySpark.
I used the following command to reduce the data from several years to a single day:
df = df.filter(df.Timestamp.between('2019-09-03 00:00:00','2019-09-04 00:00:00'))
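To make the goal concrete, here is a minimal sketch of one possible hourly reduction: truncate each timestamp to the start of its hour and aggregate within each hour. The column name `value`, the choice of the mean as the aggregate, and both function names are assumptions for illustration only; the question does not say which columns exist or how rows within an hour should be combined. The Spark variant also assumes `Timestamp` is a timestamp column (or a string Spark can cast to one).

```python
from collections import defaultdict
from datetime import datetime


def hourly_mean(rows):
    """Pure-Python illustration of the idea.

    rows: iterable of (timestamp, value) pairs.
    Returns a dict mapping the start of each hour to the mean value in it.
    """
    buckets = defaultdict(list)
    for ts, value in rows:
        # Truncate the timestamp to the start of its hour.
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return {h: sum(v) / len(v) for h, v in sorted(buckets.items())}


def hourly_mean_spark(df, ts_col="Timestamp", value_col="value"):
    """Same idea on a PySpark DataFrame (hypothetical column names)."""
    # Imported lazily so the pure-Python part runs without Spark installed.
    from pyspark.sql import functions as F

    return (
        df.groupBy(F.date_trunc("hour", F.col(ts_col)).alias("hour"))
        .agg(F.avg(value_col).alias("mean_value"))
        .orderBy("hour")
    )
```

With the one-day filter above, grouping by the truncated hour would give at most one row per hour that actually occurs in the data. If a single representative observation per hour is wanted instead of an average, a different aggregate such as `F.first` or `F.min` could be swapped in.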
I would appreciate your help. Thanks