I am confused about the conditional if statement in R.

What I want:

Let's say there are two variables: Data$Export and Data$Sales.

Only if both Data$Export and Data$Sales have the value 0 in a row do I want that row removed from the dataset. Alternatively, as I thought, I could set any variable in that row to NA, so that the row is removed by "Data <- na.omit(Data)" anyway.

Therefore, I thought of the following construction:

for (i in 1:nrow(Data)) {
  if ((Data$Sales[i] == 0) & (Data$Export[i] == 0)) {
    Data$Sales[i] <- NA
  }
}
Data <- na.omit(Data)

However, this does not work; it fails with the error: missing value where TRUE/FALSE needed

Thank you in advance for any help I may receive.

shiny
  • Please, try to provide a reproducible example https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – shiny Dec 23 '17 at 23:10
  • Please, consult `help("if")`. The description of the `cond` parameter in `if(cond) expr` says: *A length-one logical vector that is not NA. Conditions of length greater than one are currently accepted with a warning, but only the first element is used.* `if` is one of the basic control-flow constructs but not meant to work element-wise on a vector. There are vectorized equivalents as shown [here](https://stackoverflow.com/a/47956966/3817004) or `ifelse()`. – Uwe Dec 24 '17 at 09:06
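
To illustrate the point about `if` needing a single non-NA logical value, here is a minimal sketch. The toy `Data` frame and the `cond` variable are assumptions made up for the example; the NA in Sales stands in for whatever is missing in the real dataset. Wrapping the condition in `isTRUE()` is one possible way to let the loop from the question run, not necessarily the best one.

```r
# Hypothetical toy data; the NA in Sales mimics a missing value in the real Data
Data <- data.frame(Sales = c(0, 5, NA), Export = c(0, 0, 0))

# Comparing NA with a number gives NA, not TRUE or FALSE
cond <- (Data$Sales == 0) & (Data$Export == 0)
print(cond)

# if (cond[3]) ... would stop with "missing value where TRUE/FALSE needed".
# isTRUE() treats NA as FALSE, so the loop from the question runs through.
for (i in seq_len(nrow(Data))) {
  if (isTRUE(cond[i])) {
    Data$Sales[i] <- NA
  }
}
print(Data)
```

Note that a later `na.omit(Data)` would also drop the row whose Sales value was missing to begin with, which may or may not be what is wanted.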

2 Answers

Data2 <- Data[Data$Export != 0 | Data$Sales != 0, ]

Or, to set the rows where both values are 0 to NA:

 Data[Data$Export == 0 & Data$Sales == 0, ] <- NA
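
If the columns can contain NA (which the error message in the question suggests), a logical row index that contains NA returns all-NA rows. A sketch of one way around this, using `which()` to keep only the positions where the condition is definitely TRUE. The toy data and the name `drop` are assumptions for illustration, and the sketch assumes rows with a missing value should be kept.

```r
# Hypothetical toy data, not the asker's real dataset
Data <- data.frame(Sales = c(0, 5, 0, NA), Export = c(0, 0, 3, 0))

# Rows to drop: both columns are exactly 0; which() discards NA results
drop <- which(Data$Sales == 0 & Data$Export == 0)

# Guard against an empty index, because Data[-integer(0), ] selects no rows
Data2 <- if (length(drop) > 0) Data[-drop, ] else Data
print(Data2)
```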
moodymudskipper

I think you don't need a conditional if statement to do this. Using the diamonds data.frame (from ggplot2), you can remove the rows that have 0 in both variables y and z as shown below. Thanks to the comments from @Moody, @Uwe and the OP, it should be | instead of &.

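library(dplyr)
library(ggplot2)  # provides the diamonds dataset

# keep rows where at least one of y and z is non-zero
diamonds1 <- diamonds %>%
  dplyr::filter(y != 0 | z != 0)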

The same can be applied to your data frame:

data1 <- Data %>%
  dplyr::filter(Sales != 0 | Export != 0)
shiny
  • You need | not & here to do what OP wants. And I would suggest you use subset rather than filter so you don't need to load dplyr – moodymudskipper Dec 23 '17 at 23:48
  • @Moody_Mudskipper Thanks. The OP mentioned "Only if **both** Data$Export & Data$Sales for a row has the value '0'". Also, the OP used **&** in the code in the question. So, I thought & should be used. – shiny Dec 23 '17 at 23:55
  • So in the case there is only one zero, the row should be kept; in your case it wouldn't be. OP used & with == because he was describing the rows to remove. To describe the rows to keep you need | with !=. But your answer was accepted so maybe it's what OP wanted after all :). – moodymudskipper Dec 24 '17 at 08:40
  • I'm seconding @Moody_Mudskipper. According to [De Morgan's laws](https://en.wikipedia.org/wiki/De_Morgan%27s_laws), `!(A & B) = (!A | !B)`. – Uwe Dec 24 '17 at 08:57
  • Hello! I ended up using " | " instead of & to obtain the desired result. Thank you all very much for the help. – Anouk van Rooijen Dec 25 '17 at 01:07
  • Thanks Anouk. I updated the answer thanks to your comment, @Uwe and Moody_Mudskipper – shiny Dec 25 '17 at 22:56
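
The & versus | discussion in the comments above can be checked directly. The following is a small sketch on made-up data (the `Data` frame and the names `remove_row` and `keep_row` are assumptions for illustration) showing that negating the "remove" condition gives the "keep" condition, together with the base R `subset()` variant suggested in the comments.

```r
# Hypothetical toy data covering every zero / non-zero combination
Data <- data.frame(Sales = c(0, 0, 7, 7), Export = c(0, 2, 0, 2))

remove_row <- Data$Sales == 0 & Data$Export == 0   # rows to remove
keep_row   <- Data$Sales != 0 | Data$Export != 0   # rows to keep

# De Morgan: negating the "remove" condition gives the "keep" condition
print(identical(!remove_row, keep_row))

# subset() is part of base R, so no extra package is needed
Data2 <- subset(Data, Sales != 0 | Export != 0)
print(Data2)
```

Only the first row, where both values are 0, is dropped; rows with a single zero are kept.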