I would like to process data row by row. Suppose we have collected an "mpg" value for two "cyl" groups over 4 days. For each row, I would like to derive the minimum mpg seen so far within its cyl group, up to and including that day (a running minimum per group).
Original Data
**day,cyl,mpg**
- 1,4,34.4
- 2,4,21.3
- 3,4,23.3
- 4,4,25.0
- 1,3,23.0
- 2,3,27.0
- 3,3,18.3
- 4,3,17.3
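To make this reproducible, the data above can be built as a base R data frame. This is only a minimal sketch; the name `df` is my own choice, not something required by the problem.

```r
# Sample data from the table above.
# The object name `df` is an arbitrary choice for illustration.
df <- data.frame(
  day = c(1, 2, 3, 4, 1, 2, 3, 4),
  cyl = c(4, 4, 4, 4, 3, 3, 3, 3),
  mpg = c(34.4, 21.3, 23.3, 25.0, 23.0, 27.0, 18.3, 17.3)
)
print(df)
```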
Expected Output
**day,cyl,mpg,min_mpg**
- 1,4,34.4,34.4
- 2,4,21.3,21.3
- 3,4,23.3,21.3
- 4,4,25.0,21.3
- 1,3,23.0,23.0
- 2,3,27.0,23.0
- 3,3,18.3,18.3
- 4,3,17.3,17.3
I have considered a few approaches:
- A for loop (which is not the most efficient option).
- APPLY with a SHIFT function: keep the minimum value from the previous row in a global variable and reset it to NA for each group. I was unable to retain the minimum mpg value in a global variable.
- APPLY with a SHIFT function: for each row, shift by "-1" repeatedly all the way up to row #1, which amounts to putting a loop inside the APPLY call. This option would probably do a lot of redundant processing.
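For reference, the for-loop option could look roughly like the sketch below. It assumes the rows of each cyl group are contiguous and already sorted by day, as in my sample data; `df` is just an illustrative name.

```r
# Sample data, rebuilt here so the snippet runs on its own.
df <- data.frame(
  day = c(1, 2, 3, 4, 1, 2, 3, 4),
  cyl = c(4, 4, 4, 4, 3, 3, 3, 3),
  mpg = c(34.4, 21.3, 23.3, 25.0, 23.0, 27.0, 18.3, 17.3)
)

# Assumption: rows of each cyl group are contiguous and ordered by day.
df$min_mpg <- NA_real_
for (i in seq_len(nrow(df))) {
  if (i == 1 || df$cyl[i] != df$cyl[i - 1]) {
    # First row of a group: the running minimum restarts at this row's mpg.
    df$min_mpg[i] <- df$mpg[i]
  } else {
    # Otherwise compare this row's mpg with the previous row's running minimum.
    df$min_mpg[i] <- min(df$mpg[i], df$min_mpg[i - 1])
  }
}
print(df)
```

This produces the min_mpg column shown in the expected output, but it visits every row explicitly, which is what I would like to avoid.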
I tried the rowShift function described in the blog post "Use a value from the previous row in an R data.table calculation", but my requirement is that the shift needs to be dynamic.
Is there a vectorized option available, or is a traditional for loop the only option? I would prefer a solution using base R (with either a data frame or a data table).