Question:
In a dataframe, I want to create a new column as the indices of the next smaller value of an existing column.
For example, the data looks like this. It is already arranged in item, day
.
item day val
1 1 2 3
2 1 4 2
3 1 5 1
4 2 1 1
5 2 3 2
6 2 5 3
First I would like to use group_by(item)
in dplyr
to select the sub-dataframe of each item.
Then for row 1, I look down the rows and find that row 2 has a smaller val
. This is what I want, so I record the day
corresponding to that row. Similar for row 2.
Note that for row 3 and 6, they are the last rows of corresponding sub-dataframes, so there is no next smaller value. For row 4 and 5, there is no smaller val
when I look down the rows.
The dataframe with the new column should look like this.
item day val next.smaller.day
1 1 2 3 4
2 1 4 2 5
3 1 5 1 -1
4 2 1 1 -1
5 2 3 2 -1
6 2 5 3 -1
I wonder if there is any way of using dplyr
to implement this, or any codes in r
other than a for loop.
I found a thread asking the algorithm of this question. Given an array, find out the next smaller element for each element . It is relevant, and the proposed algorithm beats mine in terms of time complexity, but I still find it hard to implement in my scenario.
Thank you!
Update:
Here is another example to re-illustrate what I'm looking for.
item day val next.smaller.day
1 1 2 2 5
2 1 4 3 5
3 1 5 1 -1
4 2 1 3 3
5 2 3 1 -1
6 2 5 2 -1