I want to calculate RSI using .NET for Spark.
Formula for RSI is:
RSI = 100 - 100/(1 + R)S
RS = Average Gain / Average Loss
The first average gain and average loss are 14-period averages:
First Average Gain = Sum of Gains over the past 14 periods / 14.
First Average Loss = Sum of Losses over the past 14 periods / 14
All the next calculations are based on the prior averages and the current gain loss:
Average Gain = [(previous Average Gain) x 13 + current Gain] / 14.
Average Loss = [(previous Average Loss) x 13 + current Loss] / 14.
The data is in DataFrame rsiCalcPos5
and it looks like this:
+--------------------+-----+------+----+-----+------------------+-------------------+----------+------------------+--------------------+-------------------+-------------------+------------------+
| TimeSeriesType|Year0|Month0|Day0|Hour0| avg(Value)| Timestamp| UnixTime| nextValue| deltaValue| gain| loss| gain1|
+--------------------+-----+------+----+-----+------------------+-------------------+----------+------------------+--------------------+-------------------+-------------------+------------------+
|Current Available...| 2021| 3| 3| 9| 219.8235294117647|2021-03-03 09:00:00|1614758400|218.59733449857987| -1.2261949131848269| 0.0| 1.2261949131848269| 0.0|
|Current Available...| 2021| 3| 3| 10|218.59733449857987|2021-03-03 10:00:00|1614762000|185.59442632671212| -33.002908171867745| 0.0| 33.002908171867745| 0.0|
|Current Available...| 2021| 3| 3| 11|185.59442632671212|2021-03-03 11:00:00|1614765600| 190.5523781944545| 4.957951867742366| 4.957951867742366| 0.0|1.6526506225807889|
|Current Available...| 2021| 3| 3| 12| 190.5523781944545|2021-03-03 12:00:00|1614769200|187.88173813444055| -2.6706400600139375| 0.0| 2.6706400600139375|1.2394879669355916|
|Current Available...| 2021| 3| 3| 13|187.88173813444055|2021-03-03 13:00:00|1614772800| 187.6245558053521|-0.25718232908846517| 0.0|0.25718232908846517|0.9915903735484732|
|Current Available...| 2021| 3| 3| 14| 187.6245558053521|2021-03-03 14:00:00|1614776400|186.56644553819817| -1.0581102671539213| 0.0| 1.0581102671539213|0.8263253112903944|
|Current Available...| 2021| 3| 3| 15|186.56644553819817|2021-03-03 15:00:00|1614780000|186.66761484852796| 0.10116931032979437|0.10116931032979437| 0.0|0.7227315968674516|
|Current Available...| 2021| 3| 3| 16|186.66761484852796|2021-03-03 16:00:00|1614783600|165.79466929911155| -20.872945549416414| 0.0| 20.872945549416414|0.6323901472590201|
|Current Available...| 2021| 3| 3| 17|165.79466929911155|2021-03-03 17:00:00|1614787200|178.60478239401849| 12.810113094906939| 12.810113094906939| 0.0|1.9854704747754555|
|Current Available...| 2021| 3| 3| 18|178.60478239401849|2021-03-03 18:00:00|1614790800| 215.3916108565386| 36.786828462520106| 36.786828462520106| 0.0| 5.465606273549921|
|Current Available...| 2021| 3| 3| 19| 215.3916108565386|2021-03-03 19:00:00|1614794400|221.27369459516595| 5.882083738627358| 5.882083738627358| 0.0| 5.503467861284233|
|Current Available...| 2021| 3| 3| 20|221.27369459516595|2021-03-03 20:00:00|1614798000|231.88854705635575| 10.614852461189798| 10.614852461189798| 0.0| 5.92941657794303|
|Current Available...| 2021| 3| 3| 21|231.88854705635575|2021-03-03 21:00:00|1614801600|238.82354991634134| 6.9350028599855875| 6.9350028599855875| 0.0| 6.006769368869381|
|Current Available...| 2021| 3| 3| 22|238.82354991634134|2021-03-03 22:00:00|1614805200|240.02948909258865| 1.2059391762473126| 1.2059391762473126| 0.0| 5.663852926539233|
|Current Available...| 2021| 3| 3| 23|240.02948909258865|2021-03-03 23:00:00|1614808800|240.92351533915001| 0.8940262465613671| 0.8940262465613671| 0.0| null|
|Current Available...| 2021| 3| 4| 0|240.92351533915001|2021-03-04 00:00:00|1614812400|239.63160854893138| -1.2919067902186328| 0.0| 1.2919067902186328| null|
|Current Available...| 2021| 3| 4| 1|239.63160854893138|2021-03-04 01:00:00|1614816000|240.48959521094642| 0.8579866620150369| 0.8579866620150369| 0.0| null|
|Current Available...| 2021| 3| 4| 2|240.48959521094642|2021-03-04 02:00:00|1614819600|192.37784787942516| -48.11174733152126| 0.0| 48.11174733152126| null|
|Current Available...| 2021| 3| 4| 3|192.37784787942516|2021-03-04 03:00:00|1614823200|192.96993537510536| 0.5920874956802038| 0.5920874956802038| 0.0| null|
|Current Available...| 2021| 3| 4| 4|192.96993537510536|2021-03-04 04:00:00|1614826800|193.60104726861024| 0.6311118935048796| 0.6311118935048796| 0.0| null|
+--------------------+-----+------+----+-----+------------------+-------------------+----------+------------------+--------------------+-------------------+-------------------+------------------+
I have already calculated gain
and loss
and first average gain (gain1 = 5.663852926539233
, since the time interval for calculation of RSI is 14).
Now I have problems with calculating other average gains, from row 15 on. The formula is recursive and I am not sure how to implement it. So far, I tried window function, but I don't get the right result.
WindowSpec windowRSI3 = Microsoft.Spark.Sql.Expressions.Window
.PartitionBy("TimeSeriesType")
.OrderBy("Year0", "Month0", "Day0", "Hour0");
DataFrame rsiCalcPos6 = rsiCalcPos5.WithColumn("avgGainj", When(Col("gain1").IsNull(),
(Lag(Col("gain1"), 1, 0).Multiply(13 / 14).Minus((Col("gain").Multiply(-1 / 14))
.Over(windowRSI3)))).Otherwise(Col("gain1")));
Here, I get an exception:
org.apache.spark.sql.AnalysisException: Expression '(gain#175 * cast(0 as double))' not supported within a window function.
The recursive formula I want to use needs to calculate one avgGainj
at a time and use this result when calculating the next one avgGain(j+1)
.
Any suggestions will be appreciated. Thanks!