Create function based on condition of another column R

Question

I have a df (sample below) and would like to create a loop that applies a specific sequence (set by the user in R) based on conditions in column "x9". I want to be able to set the sequence myself so I can try different sequences on this data frame; I explain more below.

I have a df of losses and wins for an algorithm. On the first instance of a win I want to take the value in "x9" and divide it by the sequence value. I want to keep iterating through the sequence values until a loss occurs. Once a loss occurs (specifically, when "x9" < 0), the sequence restarts.

I would like to create the two columns shown in my example, "Risk Control" and "Sequence". Ideally the function would iterate through the entire data frame so I can compare the column "x9" to "Risk Control".

Sample Data:

structure(list(x1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), x2 = c("2016.01.04 01:05", 
"2016.01.04 01:12", "2016.01.04 01:13", "2016.01.04 01:17", "2016.01.04 01:20", 
"2016.01.04 01:23", "2016.01.04 01:25", "2016.01.04 01:30", "2016.01.04 01:31", 
"2016.01.04 01:59"), x3 = c("buy", "close", "buy", "close", "buy", 
"close", "buy", "t/p", "buy", "close"), x4 = c(1, 1, 2, 2, 3, 
3, 4, 4, 5, 5), x5 = c(8.46, 8.46, 8.6, 8.6, 8.69, 8.69, 8.83, 
8.83, 9, 9), x6 = c(1.58873, 1.58955, 1.5887, 1.58924, 1.58862, 
1.58946, 1.58802, 1.58902, 1.58822, 1.58899), x7 = c(1.57873, 
1.57873, 1.5787, 1.5787, 1.57862, 1.57862, 1.57802, 1.57802, 
1.57822, 1.57822), x8 = c(1.58973, 1.58973, 1.5897, 1.5897, 1.58962, 
1.58962, 1.58902, 1.58902, 1.58922, 1.58922), x9 = c(0, 478.69, 
0, 320.45, 0, 503.7, 0, 609.3, 0, 478.19), x10 = c(30000, 30478.69, 
30478.69, 30799.14, 30799.14, 31302.84, 31302.84, 31912.14, 31912.14, 
32390.33), `Risk Control` = c(NA, 478.69, NA, 320.45, NA, 251.85, 
NA, 304.65, NA, 159.3966667), ...12 = c(NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA), Sequence = c(NA, 1, NA, 1, NA, 2, NA, 2, NA, 
3)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
), spec = structure(list(cols = list(x1 = structure(list(), class = c("collector_double", 
"collector")), x2 = structure(list(), class = c("collector_character", 
"collector")), x3 = structure(list(), class = c("collector_character", 
"collector")), x4 = structure(list(), class = c("collector_double", 
"collector")), x5 = structure(list(), class = c("collector_double", 
"collector")), x6 = structure(list(), class = c("collector_double", 
"collector")), x7 = structure(list(), class = c("collector_double", 
"collector")), x8 = structure(list(), class = c("collector_double", 
"collector")), x9 = structure(list(), class = c("collector_double", 
"collector")), x10 = structure(list(), class = c("collector_double", 
"collector")), `Risk Control` = structure(list(), class = c("collector_double", 
"collector")), ...12 = structure(list(), class = c("collector_logical", 
"collector")), Sequence = structure(list(), class = c("collector_double", 
"collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), delim = ","), class = "col_spec"))

In short I need assistance in:

1. Constructing a sequence to apply to my df. I would like to be able to alter this sequence to try different sequences.

2. Taking the values in "x9" and creating a new column that applies the sequence values set. Each value in "x9" is divided by the corresponding sequence number.

3. Constructing a loop that iterates through the entire df to apply this to all of the values in the data frame.

In the example above I have manually created "Risk Control" and the sample "Sequence". The sequence in the example is 1,1,2,2,3,3,4. The sequence in the sample uses each number twice before iterating to the next number. Once a loss is achieved in "x9" the sequence restarts.

I would appreciate any help with this function and loop. Thank you

I am open to any solutions suggested; those were just some of my thoughts, as I am still early in learning the language — DBT, Nov 12 '20 at 18:11
I have updated the data in the sample data provided, I hope this is helpful — DBT, Nov 12 '20 at 18:18
I just want to be clear on terminology: "when a loss is achieved" - you mean when `x9` is < 0? "*Once a loss is achieved the sequence will restart*" so the `x9` value **after the first loss** is the one that restarts the sequence? And lastly, we don't care about any of the other columns, right? Input is `x9` (in its particular order, so maybe `x1` sorta matters), desired output is `Risk Control` and `Sequence`? — Gregor Thomas, Nov 12 '20 at 18:19
Oh, and any special handling if there are multiple losses in a row? The sequence keeps resetting, so they all just keep getting divided by the first value of the sequence, right? — Gregor Thomas, Nov 12 '20 at 18:20
I think it's necessary to describe the data. Apparently something to do with stock prices, purchases and I don't know if close means sell or what t/p means. — markhogue, Nov 12 '20 at 18:30
Thank you for the clarification. So yes, if a loss occurs in x9 (x9 < 0), the sequence will restart. If there are multiple losses in a row, we will use the greatest value in the sequence for the next 3 occurrences before the sequence restarts. Meaning, if we have a sequence occurring and then there is a loss in x9 of -100 followed by -100, we will start the sequence at 4 for the next 3 occurrences before the sequence fully restarts. We want to wait for 3 consecutive wins after consecutive losses — DBT, Nov 12 '20 at 18:33
That is correct, we just care about "x9" really. The other columns are just price data. x1 = unique ID, x2=Date/Time, x3=position (buy or sell) t/p (take profit), s/l (stop loss), x4= tradeID x5=sizing, x6=current price, x7=take profit price, x8=stop loss price, x9= win or loss — DBT, Nov 12 '20 at 18:36
I got stuck and may post a separate question to figure it out, but done for the day. My problem is filling in all the sequential data. First, had to simplify the data to only the rows with non-zero x9. I can easily make the first row sequence 1 and for any negative x9 row, but after getting the next row's sequence to 2, I'm stuck. This looks promising [link](https://stackoverflow.com/questions/48868104/recursive-function-using-dplyr) — markhogue, Nov 12 '20 at 22:37
This does look promising. That is some of the same issues I ran into with trying to fill the sequential data in the df. Thank you for the research and help I will investigate this link, reviewing more and follow up if I find a solution — DBT, Nov 13 '20 at 04:44

score 1 · Accepted Answer · answered Nov 16 '20 at 14:18

Starting with the input data only (not the desired columns):

library(dplyr)  # provides %>%, select(), filter() and left_join()
df1 <- df %>% select(1:10)

Reduce this to only the rows where x9 is not zero. This may not be what is intended, and the user may prefer to key off an x3 event, but hopefully it is illustrative.

df1 <- df1 %>% filter(x9 != 0)

Initialize the seq column and insert dummy data (a 1 marks a row where the count starts over; the NAs are filled in next).

df1$seq <- c(1, NA, 1, NA, NA)

Fill in the NAs (thanks to Allan Cameron for this answer to my post, link):

df1$seq <- unlist(sapply(diff(c(which(!is.na(df1$seq)), nrow(df1) + 1)), seq))
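To make the fill-in step easier to follow, here is a minimal base R sketch of the same idea on a toy vector. The `x9` values are made up, and treating "the first row plus every loss row" as a start marker is an assumption rather than something the dummy data above encodes. It uses `sequence()` in place of `unlist(sapply(..., seq))`, which should also avoid `sapply()` simplifying its result to a matrix when every run happens to have the same length.

```r
# Toy x9 values for closed trades only (made up; the third one is a loss).
x9 <- c(478.69, 320.45, -150, 609.3, 478.19)

# Assumption: a run starts on the first row and on every loss row.
starts <- which(seq_along(x9) == 1 | x9 < 0)     # 1 3

# Length of each run = gap between consecutive start positions,
# with one extra position past the end to close the last run.
run_lengths <- diff(c(starts, length(x9) + 1))   # 2 3

# sequence() concatenates 1:n for each run length.
seq_col <- sequence(run_lengths)
seq_col  # 1 2 1 2 3
```

With a real data frame, `starts` would take the place of the hand-typed dummy column.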

Apply user's rule 2:

df1$risk_control <- df1$x9 / df1$seq

# A tibble: 5 x 12
     x1 x2            x3       x4    x5    x6    x7    x8    x9    x10   seq risk_control
  <dbl> <chr>         <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl> <int>        <dbl>
1     2 2016.01.04 0~ close     1  8.46  1.59  1.58  1.59  479. 30479.     1         479.
2     4 2016.01.04 0~ close     2  8.6   1.59  1.58  1.59  320. 30799.     2         160.
3     6 2016.01.04 0~ close     3  8.69  1.59  1.58  1.59  504. 31303.     1         504.
4     8 2016.01.04 0~ t/p       4  8.83  1.59  1.58  1.59  609. 31912.     2         305.
5    10 2016.01.04 0~ close     5  9     1.59  1.58  1.59  478. 32390.     3         159.
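The result above divides by a running count (1, 2, 1, 2, 3). If the divisor should instead come from a user-defined sequence such as the 1,1,2,2,3,3,4 described in the question, a loop along the following lines might work. This is only a sketch: `apply_sequence` and `steps` are hypothetical names, and it assumes that a loss row itself takes the first sequence value, that rows with x9 equal to 0 are skipped, and that the last sequence value is reused once the sequence runs out. The extra rule for consecutive losses mentioned in the comments is not implemented.

```r
# Hypothetical helper: walk through `steps`, restarting whenever x9 < 0.
apply_sequence <- function(x9, steps = c(1, 1, 2, 2, 3, 3, 4)) {
  step_used <- rep(NA_real_, length(x9))
  pos <- 0L  # index into `steps` used by the most recent trade
  for (i in seq_along(x9)) {
    if (is.na(x9[i]) || x9[i] == 0) next     # no closed trade on this row
    if (x9[i] < 0) {
      pos <- 1L                              # assumption: a loss restarts at step 1
    } else {
      pos <- min(pos + 1L, length(steps))    # a win advances; hold at the last step
    }
    step_used[i] <- steps[pos]
  }
  data.frame(Sequence = step_used, risk_control = x9 / step_used)
}

# x9 from the sample data in the question
x9 <- c(0, 478.69, 0, 320.45, 0, 503.7, 0, 609.3, 0, 478.19)
res <- apply_sequence(x9)
res$Sequence  # NA 1 NA 1 NA 2 NA 2 NA 3
```

On the sample data this gives the same values as the hand-made "Sequence" and "Risk Control" columns in the question. Trying a different sequence is a matter of passing another vector, e.g. `apply_sequence(x9, steps = c(1, 2, 3))`.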

Recombining this with the original data can be performed if desired with:

df2 <- dplyr::left_join(df[, -c(11:13)], df1)

# A tibble: 10 x 12
      x1 x2           x3       x4    x5    x6    x7    x8    x9    x10   seq risk_control
   <dbl> <chr>        <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl> <int>        <dbl>
 1     1 2016.01.04 ~ buy       1  8.46  1.59  1.58  1.59    0  30000     NA          NA 
 2     2 2016.01.04 ~ close     1  8.46  1.59  1.58  1.59  479. 30479.     1         479.
 3     3 2016.01.04 ~ buy       2  8.6   1.59  1.58  1.59    0  30479.    NA          NA 
 4     4 2016.01.04 ~ close     2  8.6   1.59  1.58  1.59  320. 30799.     2         160.
 5     5 2016.01.04 ~ buy       3  8.69  1.59  1.58  1.59    0  30799.    NA          NA 
 6     6 2016.01.04 ~ close     3  8.69  1.59  1.58  1.59  504. 31303.     1         504.
 7     7 2016.01.04 ~ buy       4  8.83  1.59  1.58  1.59    0  31303.    NA          NA 
 8     8 2016.01.04 ~ t/p       4  8.83  1.59  1.58  1.59  609. 31912.     2         305.
 9     9 2016.01.04 ~ buy       5  9     1.59  1.58  1.59    0  31912.    NA          NA 
10    10 2016.01.04 ~ close     5  9     1.59  1.58  1.59  478. 32390.     3         159.
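If you would rather not rely on the join matching on every shared column, the new columns can also be copied back by key. The sketch below uses base R `match()` on `x1`, assuming `x1` uniquely identifies each row; the two small data frames are made-up stand-ins for `df` and `df1`.

```r
# Made-up stand-ins for the full data and the filtered, computed rows.
full   <- data.frame(x1 = 1:4, x9 = c(0, 478.69, 0, 320.45))
closed <- data.frame(x1 = c(2, 4),
                     seq = c(1L, 2L),
                     risk_control = c(478.69, 160.225))

# Position of each full row in `closed` (NA where there is no match).
idx <- match(full$x1, closed$x1)

# Indexing by NA yields NA, so unmatched rows stay empty.
full$seq          <- closed$seq[idx]
full$risk_control <- closed$risk_control[idx]
full
```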

This is what I was looking for. Appreciate Allan Cameron and you Mark for your help! — DBT, Nov 16 '20 at 23:59
