
I am working on a project in R which is fairly code-heavy, at least compared to my previous R projects. The code applies multiple ifelse statements to the data in existing columns and stores the results in a new column. Because the data is on a 5-minute timeframe, I have to write a new line of code for every 5-minute slice of time. The data runs from 09:30 to 16:00, so that is a lot of lines of code, around 75 by my calculations. Example of my data:

    Date                 Open      High      Low       Close     doy
1   2015-09-21 09:30:00  164.6700  164.7100  164.3700  164.5300  264
2   2015-09-21 09:35:00  164.5300  164.9000  164.5300  164.6400  264
3   2015-09-21 09:40:00  164.6600  164.8900  164.6000  164.8900  264
4   2015-09-21 09:45:00  164.9100  165.0900  164.9100  164.9736  264
5   2015-09-21 09:50:00  164.9399  165.0980  164.8200  164.8200  264

This data is then filtered into a table like this:

data <- structure(list(doy = c(264, 265, 266, 267, 268, 271, 272, 11,12, 13), Date = structure(c(1442824200, 1442910600, 1442997000,1443083400, 1443169800, 1443429000, 1443515400, 1452504600, 1452591000,1452677400), class = c("POSIXct", "POSIXt"), tzone = ""), Or_High = c(164.71,162.96, 163.38, 161.37, 163.91, 162.06, 160.22, 164.5, 165.23,165.84), OR_Low = c(164.37, 162.62, 162.98, 161.06, 163.57, 161.66,159.7, 164.06, 164.84, 165.4), HOD = c(165.56, 163.36, 163.38,162.24, 164.43, 162.06, 160.96, 164.5, 165.78, 165.84), LOD = c(165.22,163.1, 162.98, 161.95, 164.24, 161.66, 160.75, 164.06, 165.56,165.4), Close = c(164.92, 163.02, 162.58, 161.85, 162.94, 159.84,160.19, 163.83, 165.02, 161.38), Range = c(0.340000000000003,0.260000000000019, 0.400000000000006, 0.29000000000002, 0.189999999999998,0.400000000000006, 0.210000000000008, 0.439999999999998, 0.219999999999999,0.439999999999998), `A-val` = c(NA, NA, NA, NA, NA, NA, NA, 0.0673439999999994,0.0659639999999996, 0.0729499999999996), `A-up` = c(NA, NA, NA,NA, NA, NA, NA, 164.567344, 165.295964, 165.91295), `A-down` = c(NA,NA, NA, NA, NA, NA, NA, 163.992656, 164.774036, 165.32705), `09:35` = structure(c(NA,NA, NA, NA, NA, NA, NA, 0, 0, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `09:40` = structure(c(NA, NA, NA, NA, NA,NA, NA, -1, 1, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL,"Low")), `09:45` = structure(c(NA, NA, NA, NA, NA, NA, NA,0, 1, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")),`09:50` = structure(c(NA, NA, NA, NA, NA, NA, NA, -1, 1,0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `09:55` = structure(c(NA,NA, NA, NA, NA, NA, NA, -1, 0, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:00` = structure(c(NA, NA, NA, NA,NA, NA, NA, -1, 0, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:05` = structure(c(NA, NA, NA, NA,NA, NA, NA, -1, 0, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:10` = structure(c(NA, NA, NA, NA,NA, NA, NA, -1, 0, 0), .Dim = c(10L, 
1L), .Dimnames = list(NULL, "Low")), `10:15` = structure(c(NA, NA, NA, NA,NA, NA, NA, -2, 0, -1), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:20` = structure(c(NA, NA, NA, NA,NA, NA, NA, 0, 0, -1), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:25` = structure(c(NA, NA, NA, NA,NA, NA, NA, -2, -1, -1), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:30` = structure(c(NA, NA, NA, NA,NA, NA, NA, 0, 0, -1), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:35` = structure(c(NA, NA, NA, NA,NA, NA, NA, 0, 0, -1), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:40` = structure(c(NA, NA, NA, NA,NA, NA, NA, 0, -1, -2), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:45` = structure(c(NA, NA, NA, NA,NA, NA, NA, 0, -1, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:50` = structure(c(NA, NA, NA, NA,NA, NA, NA, -1, -1, -2), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:55` = structure(c(NA, NA, NA, NA,NA, NA, NA, -1, -1, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low"))), .Names = c("doy", "Date", "Or_High","OR_Low", "HOD", "LOD", "Close", "Range", "A-val", "A-up", "A-down","09:35", "09:40", "09:45", "09:50", "09:55", "10:00", "10:05","10:10", "10:15", "10:20", "10:25", "10:30", "10:35", "10:40","10:45", "10:50", "10:55"), row.names = c(1L, 2L, 3L, 4L, 5L,6L, 7L, 78L, 79L, 80L), class = "data.frame") 

This is what the lines of code look like:

data[,14] <- ifelse(df %>% filter(hour(Date) == 09 & minute(Date) == 45) %>% select(Low) > data[,10], 1, ifelse(df %>% filter(hour(Date) == 09 & minute(Date) == 45) %>% select(High) < data[,11], -1, 0))

Then the next line of code would look like this:

data[,15] <- ifelse(df %>% filter(hour(Date) == 09 & minute(Date) == 50) %>% select(Low) > data[,10], 1, ifelse(df %>% filter(hour(Date) == 09 & minute(Date) == 50) %>% select(High) < data[,11], -1, 0))

And the next like this, and so on:

data[,16] <- ifelse(df %>% filter(hour(Date) == 09 & minute(Date) == 55) %>% select(Low) > data[,10], 1, ifelse(df %>% filter(hour(Date) == 09 & minute(Date) == 55) %>% select(High) < data[,11], -1, 0))

As you can see, with each new line of code only certain parts change, such as the hours, the minutes and the column references. Perhaps the example below will make it clearer.

Example:

colnames(data)[14] <- "09:45"
colnames(data)[15] <- "09:50" 
colnames(data)[16] <- "09:55"
colnames(data)[17] <- "10:00"
colnames(data)[18] <- "10:05"

In this code, would there be any way to change the [#col ref#] and the times without editing each line of code by hand? I realise that copy and paste can be used with Notepad, but that still means having to write the individual changes. My main concern is not the time taken to write this but rather the risk of errors from human input.
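To make the idea concrete, here is a minimal sketch of how the time labels might be generated rather than typed. The start and end times and the starting column index (14) are assumptions taken from the example above, and the final assignment is left commented out because it depends on `data` already having those columns.

```r
# Hypothetical sketch: build the 5-minute labels instead of typing each one.
# Start/end times and the first column index (14) are assumptions from the example.
slots <- seq(
  from = as.POSIXct("2015-09-21 09:45:00", tz = "UTC"),
  to   = as.POSIXct("2015-09-21 16:00:00", tz = "UTC"),
  by   = "5 min"
)
labels <- format(slots, "%H:%M")  # "09:45", "09:50", ...

# Column positions that would receive these names
cols <- seq(from = 14, length.out = length(labels))

# With the real data frame this single line would replace the colnames() lines:
# colnames(data)[cols] <- labels
```

Because the labels and the column positions come from the same sequence, they cannot drift out of step the way hand-edited lines can.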

If anyone has any tips or tricks as to how this can be done, or another way of achieving the same result without repeating multiple ifelse statements in the structure of my existing code, I would be most grateful for your help. This question is related to a previous question I posted here, which may add clarity about what I am trying to achieve.

Thanks.

redbaron1981
  • 4
    I don't fully understand your question, but I would recommend creating a function with parameters if you find yourself having to do very similar things over and over again. – Tim Biegeleisen Jan 17 '16 at 10:37
  • 3
    In your `ifelse` code you are using column numbers which don't exist in your data. Please fix your example. It will help a lot if you provide the following elements in your question: input, desired output & what you have tried so far. – Jaap Jan 17 '16 at 11:09
  • You mentioned using Notepad for writing code; personally I really like working with Notepad++. It is a very powerful code editor and has nice functionality for copy-pasting and replacing. There is third-party software called "NppToR" which I use in combination with Notepad++. – vanao veneri Jan 17 '16 at 11:30
  • 1
    Still not clear what you are trying to do. Please read [Ask] and how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610) – Jaap Jan 17 '16 at 11:51
  • The real answer to your question is, "Don't." What you're doing is extremely error-prone and difficult to change and maintain. Just don't do it. Future you will hate past you if you do this. – Joshua Ulrich Jan 17 '16 at 16:52
  • @Joshua Ulrich, can you recommend a better way of achieving the same result? Would it be better to write a function instead? – redbaron1981 Jan 17 '16 at 18:50

1 Answer


As vanao veneri mentioned, it is better to use a text editor for writing bulk code quickly.

I found that Sublime Text 3 with the Text Pastry add-on did exactly what I needed, using its Insert Nums command.
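For comparison, the function approach suggested in the comments could avoid generating the lines at all. The following is only a minimal sketch under stated assumptions: `df` and `data` are toy stand-ins with made-up values, `classify_slot()` is a hypothetical helper, and rows are matched by calendar day (a `day` column that the original table does not have) rather than by row position as in the question's code.

```r
# Toy stand-ins for the question's `df` (5-minute bars) and `data` (one row per day).
# All values are made up for illustration.
df <- data.frame(
  Date = as.POSIXct(c("2016-01-11 09:45:00", "2016-01-11 09:50:00",
                      "2016-01-12 09:45:00", "2016-01-12 09:50:00"), tz = "UTC"),
  High = c(164.60, 163.90, 165.50, 165.20),
  Low  = c(164.58, 163.70, 165.35, 164.90)
)
data <- data.frame(
  day      = as.Date(c("2016-01-11", "2016-01-12")),  # assumed helper column
  `A-up`   = c(164.567344, 165.295964),
  `A-down` = c(163.992656, 164.774036),
  check.names = FALSE
)

# Hypothetical helper: 1 if the bar's Low is above A-up, -1 if its High is
# below A-down, 0 otherwise (NA when a day has no bar at that time).
classify_slot <- function(slot, bars, daily) {
  bars_at_slot <- bars[format(bars$Date, "%H:%M") == slot, ]
  idx  <- match(daily$day, as.Date(bars_at_slot$Date))  # align bars to days
  low  <- bars_at_slot$Low[idx]
  high <- bars_at_slot$High[idx]
  ifelse(low > daily[["A-up"]], 1,
         ifelse(high < daily[["A-down"]], -1, 0))
}

# One loop replaces one hand-written line per time slot.
slots <- c("09:45", "09:50")
for (slot in slots) {
  data[[slot]] <- classify_slot(slot, df, data)
}
```

Referring to columns by name (`"A-up"`, `"A-down"`, and the slot label) instead of by number means that adding or reordering columns does not silently change what each line compares.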

Thanks for the help.

redbaron1981