I have a Spark DataFrame representing the power consumption (in kW) of a particular device at particular timestamped moments. I would like to calculate the energy consumption in kWh, that is, the integral of this data over a given time interval. How can I accomplish this using Spark?

user3084736
  • How do you want to estimate the energy consumption? What have you tried? – David Oct 19 '16 at 13:46
  • I would like to sum the areas of the trapezoids between each pair of consecutive data points (the area under the line segment joining them). I'm still searching and haven't tried anything yet. – user3084736 Oct 19 '16 at 13:55
  • Window functions should be able to help: https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html. Here's an example of how to get lagged values (lead values work similarly): http://stackoverflow.com/questions/34295642/spark-add-new-column-to-dataframe-with-value-from-previous-row?rq=1 – David Oct 19 '16 at 14:53
  • Thank you! It solved my problem. – user3084736 Oct 20 '16 at 13:48
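
For anyone landing here later, the trapezoid approach described in the comments might look roughly like the sketch below. It is not the asker's actual solution, only one way the lag-based window idea could be applied. The column names `ts` and `kw` and both function names are hypothetical, and the timestamp column is assumed to be of Spark's timestamp type. The pure-Python function is included only as a reference to check the Spark version against on a small sample.

```python
from datetime import datetime


def trapezoid_kwh(points):
    """Reference version: integrate (timestamp, kW) pairs into kWh."""
    pts = sorted(points)
    total = 0.0
    # Each pair of consecutive points forms one trapezoid.
    for (t0, p0), (t1, p1) in zip(pts, pts[1:]):
        hours = (t1 - t0).total_seconds() / 3600.0
        total += (p0 + p1) / 2.0 * hours
    return total


def spark_trapezoid_kwh(df, ts_col="ts", power_col="kw"):
    """Same calculation on a Spark DataFrame using lag() over a time-ordered window.

    ts_col and power_col are hypothetical column names; adjust to your schema.
    """
    # Imported here so the reference function above runs without Spark installed.
    from pyspark.sql import Window, functions as F

    # No partitionBy: every row goes through a single partition.
    # With many devices, partition by a device id column instead.
    w = Window.orderBy(ts_col)

    # Casting a timestamp to long gives whole seconds since the epoch.
    secs = F.col(ts_col).cast("long")
    hours = (secs - F.lag(secs).over(w)) / 3600.0
    area = (F.col(power_col) + F.lag(power_col).over(w)) / 2.0 * hours

    # The first row has no predecessor, so its area is null and sum() skips it.
    # The window expression is materialised as a column before aggregating.
    areas = df.withColumn("_area_kwh", area)
    return areas.agg(F.sum("_area_kwh").alias("kwh")).first()["kwh"]
```

To restrict the result to a given time interval, filter the DataFrame on the timestamp column before calling the function. Note that readings just outside the interval are then ignored rather than interpolated to the interval boundaries.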

0 Answers