I have a Spark DataFrame representing the power consumption (in kW) of a particular device at particular moments (timestamped). I would like to calculate the energy consumption in kWh, that is, integrate over this dataset for a given time interval. How can I accomplish this using Spark?

user3084736
-
How do you want to estimate the energy consumption? What have you tried? – David Oct 19 '16 at 13:46
-
I would like to calculate the sum of the areas of the trapezoids between each pair of consecutive datapoints (the area under the straight line through the two points). I'm still searching and haven't tried anything yet. – user3084736 Oct 19 '16 at 13:55
-
Window functions should be able to help; see https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html. Here's an example of how to get lagged values (lead values would be similar): http://stackoverflow.com/questions/34295642/spark-add-new-column-to-dataframe-with-value-from-previous-row?rq=1 – David Oct 19 '16 at 14:53
-
Thank you! It solved my problem. – user3084736 Oct 20 '16 at 13:48