I am trying to calculate the moving average of price over the last six months in PySpark.
Currently my table has a column holding each date lagged by six months.
id  dates       lagged_6month  price
1   2017-06-02  2016-12-02     14.8
1   2017-08-09  2017-02-09     16.65
2   2017-08-16  2017-02-16     16
2   2018-05-14  2017-11-14     21.05
3   2017-09-01  2017-03-01     16.75
Desired results:
id  dates       avg6mprice
1   2017-06-02  20.6
1   2017-08-09  21.5
2   2017-08-16  16.25
2   2018-05-14  25.05
3   2017-09-01  17.75
Sample code
from pyspark.sql import Window
from pyspark.sql import functions as F
from pyspark.sql.functions import col
df = sqlContext.table("price_table")
# Fails here: columns are passed as the rangeBetween bounds.
w = Window.partitionBy([col('id')]).rangeBetween(col('dates'), col('lagged_6month'))
rangeBetween does not seem to accept columns as arguments in the window specification.
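To make the intended per-row window explicit, here is a rough plain-Python sketch of the computation (not PySpark). The function name `trailing_avg` is hypothetical, and treating both window endpoints as inclusive is an assumption. It is only meant to pin down the logic: for each row, average the prices of all rows with the same id whose date falls between that row's lagged_6month and dates.

```python
from datetime import date

# Rows copied from the sample table: (id, dates, lagged_6month, price).
rows = [
    (1, date(2017, 6, 2), date(2016, 12, 2), 14.8),
    (1, date(2017, 8, 9), date(2017, 2, 9), 16.65),
    (2, date(2017, 8, 16), date(2017, 2, 16), 16.0),
    (2, date(2018, 5, 14), date(2017, 11, 14), 21.05),
    (3, date(2017, 9, 1), date(2017, 3, 1), 16.75),
]


def trailing_avg(rows):
    """Hypothetical reference: per-row mean price over [lagged_6month, dates].

    Only rows with the same id are considered, and both endpoints are
    treated as inclusive (an assumption, not something the table states).
    """
    out = []
    for row_id, day, lagged, _price in rows:
        # Collect prices of same-id rows whose date lies inside this row's window.
        window = [
            other_price
            for other_id, other_day, _other_lagged, other_price in rows
            if other_id == row_id and lagged <= other_day <= day
        ]
        # The window always contains the row itself, so it is never empty.
        out.append((row_id, day, sum(window) / len(window)))
    return out


for row_id, day, avg in trailing_avg(rows):
    print(row_id, day, round(avg, 2))
```

Run on only the five rows shown here, the averages will not reproduce the desired table, which presumably draws on more price history than this excerpt. What I am looking for is the PySpark equivalent of the `lagged <= other_day <= day` condition, where the bounds come from columns of the current row.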