
I am running the code below. It runs without errors, but it does not show me any output:

for (name in tita$name){
  if (tita$sex == 'female' && tita$embarked == 'S' && tita$age > 33.00)
  {
    print (name)
  }
}

RStudio just shows me ******. When I check the dataset, it does contain females older than 33 who embarked from S, but this statement does not show me any result. When I change the value from 33 to 28, the same code does show me results. Why is that?

I am using the following dataset:

https://biostat.app.vumc.org/wiki/pub/Main/DataSets/titanic3.csv

AHF
  • I think you want `&` here and not `&&`, if you want to be checking each row. – Jon Spring Mar 17 '21 at 23:02
  • If I use `&` instead of `&&` then I get this error `the condition has length > 1 and only the first element will be used` and also why it works then for age >25? – AHF Mar 17 '21 at 23:05
  • Also it looks like you're evaluating entire vectors. I agree with @JonSpring that you want to use `&`, but only the first element of each logical vector will be used. – LMc Mar 17 '21 at 23:05
  • Yes I wanted to traverse through all the rows – AHF Mar 17 '21 at 23:09
  • More info on `&` vs. `&&`: https://stackoverflow.com/a/6559049/6851825 – Jon Spring Mar 17 '21 at 23:11
  • `subset` or `dplyr::filter` would be a better approach than a loop. For example: `library(dplyr); filter(titanic3, sex == "female", age > 33.00, embarked == "S")` – neilfws Mar 17 '21 at 23:15

1 Answer


I think you're mixing loops and vectorization where you shouldn't. As I mentioned in the comments, your conditions are vectorized (each comparison returns one value per row of `tita`), but it looks like you're trying to evaluate one element at a time in a loop.
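To see why changing the threshold changes the outcome, here is a minimal sketch. The data frame `toy` and its values are made up for illustration and are not the real titanic3 data. It assumes the behaviour of older R versions, where `&&` silently used only the first element of each vector; as far as I know, recent R versions raise an error for this instead.

```r
# Toy data frame (hypothetical values, not the real titanic3 data)
toy <- data.frame(
  name     = c("Ann", "Beth", "Carl"),
  sex      = c("female", "female", "male"),
  embarked = c("S", "S", "C"),
  age      = c(29, 40, 50),
  stringsAsFactors = FALSE
)

# Each comparison is vectorized: it returns one TRUE/FALSE per row
toy$age > 33
toy$age > 28

# Using only the first element is the same as testing row 1 alone
first_row_33 <- toy$sex[1] == "female" && toy$embarked[1] == "S" && toy$age[1] > 33
first_row_28 <- toy$sex[1] == "female" && toy$embarked[1] == "S" && toy$age[1] > 28

first_row_33  # FALSE: the if body never runs, so nothing is printed
first_row_28  # TRUE: the if body runs on every iteration, so every name is printed
```

So the original loop never looks at the row that belongs to `name`. Whether anything is printed depends only on whether the first row passes the test.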

You should do either:

# loop through elements
for (i in seq_along(tita$name)){
  if (tita$sex[i] == 'female' & tita$embarked[i] == 'S' & tita$age[i] > 33.00){
    print(tita$name[i])
  }
}

OR use vectorization (this will be faster and is recommended):

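# TRUE where a row meets all three conditions (NA where a value is missing)
conditions <- tita$sex == 'female' & tita$embarked == 'S' & tita$age > 33.00
# which() keeps only the TRUE positions, so rows with NA are dropped
names <- tita$name[which(conditions)]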

Here `conditions` is a logical vector of TRUE and FALSE values, TRUE where all the conditions are met. We can use it to subset in R. For more information on what I mean by vectorization please see this link.
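As a rough illustration of what the logical vector looks like, here is a small sketch on a made-up data frame. The name `toy` and all its values are assumptions for the example, not the real dataset. It also shows base R's `subset`, which was suggested in the comments on the question.

```r
# Toy data frame (hypothetical values, not the real titanic3 data)
toy <- data.frame(
  name     = c("Ann", "Beth", "Carl", "Dora"),
  sex      = c("female", "female", "male", "female"),
  embarked = c("S", "S", "C", "S"),
  age      = c(29, 40, 50, 61),
  stringsAsFactors = FALSE
)

# One logical value per row: TRUE only where all three tests pass
conditions <- toy$sex == "female" & toy$embarked == "S" & toy$age > 33
conditions

# Index the name column with the logical vector
matching_names <- toy$name[conditions]
matching_names

# subset() applies the same filter and returns the whole matching rows
matching_rows <- subset(toy, sex == "female" & embarked == "S" & age > 33)
matching_rows
```

Indexing returns just the names, while `subset` keeps every column, which is handy if you want to inspect the matching passengers.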

LMc
  • And if we need to deal with `NA` in the dataset, do we then use `na.omit` within the `if` condition? – AHF Mar 17 '21 at 23:14
  • No, you need something that will return `TRUE` or `FALSE`. So you should use `is.na`. Otherwise, you can use `na.omit` to subset your dataset before you start your looping or vectorization operations. – LMc Mar 17 '21 at 23:17
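The two options from the last comment can be sketched as follows. The data frame `toy` and its values are made up for illustration. Note that `na.omit` drops a row if any column is `NA`, not only `age`, so on a real dataset it may remove more rows than expected.

```r
# Toy data frame with a missing age (hypothetical values)
toy <- data.frame(
  name     = c("Ann", "Beth", "Carl"),
  sex      = c("female", "female", "female"),
  embarked = c("S", "S", "S"),
  age      = c(40, NA, 20),
  stringsAsFactors = FALSE
)

# A comparison against NA yields NA, not FALSE
conditions <- toy$sex == "female" & toy$embarked == "S" & toy$age > 33
conditions

# Indexing with NA produces an NA entry in the result
toy$name[conditions]

# Option 1: use is.na() so the condition is always TRUE or FALSE
keep <- !is.na(toy$age) & conditions
names_is_na <- toy$name[keep]

# Option 2: drop incomplete rows first, then filter as usual
complete <- na.omit(toy)
names_na_omit <- complete$name[
  complete$sex == "female" & complete$embarked == "S" & complete$age > 33
]

names_is_na
names_na_omit
```

In the loop version, an `NA` condition makes `if` stop with an error, so the same `!is.na(...)` check belongs inside the `if` test there.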