I am super new to PySpark and I am trying to get the difference between consecutive values within the same id. The DataFrame is loaded from a CSV file.
For example, my dataset looks like this:
+---+-----+
| id|value|
+---+-----+
|  1|   65|
|  1|   66|
|  1|   65|
|  2|   68|
|  2|   71|
+---+-----+
and I want something like this:
+---+-----+----------+
| id|value|prev_value|
+---+-----+----------+
|  1|   65|      null|
|  1|   66|        65|
|  1|   65|        66|
|  2|   68|      null|
|  2|   71|        68|
+---+-----+----------+
so that it is easy to calculate the difference between consecutive values.