
Trying to keep it as simple as possible:

Consider this simple Excel formula. Let's presume that I'm currently in cell C2 and that it holds this formula: `=if(A2=1,B2,C1)`

I'm stuck at the referencing part. Is there any way to do this in R?

Henry

  • Read about indexing of matrix elements. The R function you want to use is `ifelse()`. – jogo Jan 26 '16 at 08:20
  • Please read the info on [Ask] and how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). – Jaap Jan 26 '16 at 08:24

1 Answer


You have to remember that R is not a spreadsheet such as Excel, so R does not refer to cell IDs in the same way. However, the following code reproduces your situation:

library(dplyr)
# Columns A, B and C mimic the spreadsheet columns (10 rows of example data)
example_data <- data.frame(A = sample(c(0, 1), 10, replace = TRUE), B = 6:15, C = 21:30)
# c(NA, C[1:(length(C) - 1)]) refers to the value of C one row up, i.e. the "C1" reference
example_data %>% mutate(new_column = ifelse(A == 1, B, c(NA, C[1:(length(C) - 1)])))

Here, example_data is a so-called data.frame, which is equivalent to the contents of the spreadsheet. The mutate() call then creates a new column using the same logic as the Excel formula you provided.

Have a look at the different things going on here and try to understand what happens. If you get stuck, I would recommend reading up on some R tutorials and then coming back to this example.
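To make the indexing trick concrete, here is a minimal standalone sketch (using the same values as the C column above) showing how `c(NA, C[1:(length(C) - 1)])` shifts a vector down by one position, which is what reproduces the reference to the cell directly above:

C <- 21:30
shifted_C <- c(NA, C[1:(length(C) - 1)])  # row i now sees the value C had in row i - 1; row 1 gets NA
rbind(original = C, shifted = shifted_C)  # print both vectors side by side for comparison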

Paul Hiemstra

  • Why can't you just do it using `transform`, as in `transform(example_data, new_column = ifelse(A == 1, B, c(NA, C[1:(length(C) - 1)])))`? What's `dplyr` used for? Or, if you're already using `dplyr`, you could probably just do `example_data %>% mutate(new_column = ifelse(A == 1, B, lag(C)))` – David Arenburg Jan 26 '16 at 08:41
  • @DavidArenburg The lag suggestion is indeed a good one, thanks. Other than that, the use of `dplyr` here is a personal preference, and it fits nicely with the intended use of `dplyr`. If I were to tackle this problem, this is the code I would use, so this is also the answer I provide. Indeed, `transform` is also a good alternative, but so are other approaches such as `data.table`. I chose to use `dplyr` for a lot of my data munging, rather than `transform`, `tapply`, `by`, etc. If you think a particular approach is preferable to using `dplyr`, just post a competing answer. – Paul Hiemstra Jan 26 '16 at 08:59
  • I just think that introducing unnecessary dependencies in order to solve a simple task is simply bad practice (I guess you disagree). Of course, when it comes to "by" operations, I agree that `dplyr` or `data.table` can add some value. We discussed this [on Meta](http://meta.stackoverflow.com/questions/311771/when-is-it-ok-to-post-an-answer-that-requires-a-new-r-package/) not long ago. – David Arenburg Jan 26 '16 at 09:07
  • I agree with Bill the Lizard's answer in your link to Meta. This is a well-established package from a reputable source who has written just about every popular R package there is. In addition, it is literally two seconds of work to install it. It is my personal preference not to use the base R solutions, and this is what I would do when faced with this problem. Again, if you have a different preference, please feel free to post a competing answer. – Paul Hiemstra Jan 26 '16 at 09:17
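For reference, the two alternatives raised in the comments above can be written as short runnable sketches; both assume the same `example_data` as defined in the answer:

# Base R alternative suggested in the comments: transform() needs no extra packages
transform(example_data, new_column = ifelse(A == 1, B, c(NA, C[1:(length(C) - 1)])))

# dplyr alternative: lag(C) expresses "the value of C one row up" directly
library(dplyr)
example_data %>% mutate(new_column = ifelse(A == 1, B, lag(C)))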