I'm trying to find the largest number of people who did not survive in a data frame that I am working on. I used a for loop to iterate through the rows, but my if condition does not seem to be working: the loop reports the largest number as 89, when it is actually 670.

most_lost <- 0
for (i in 1:dim(Titanic)[1]) {
  if (Titanic$Survived[i] == "No")  {
    if (Titanic$Freq[i] > most_lost) {
      most_lost <- Titanic$Freq[i]
    }
    print(most_lost)
  }
}

This is the output of the printed most_lost

[1] 0
[1] 0
[1] "35"
[1] "35"
[1] "35"
[1] "35"
[1] "35"
[1] "35"
[1] "35"
[1] "35"
[1] "387"
[1] "670"
[1] "670"
[1] "670"
[1] "89"
[1] "89"

Here is the table I'm working with

[image of the data table]

    Fyi, if you are using the built-in data set `DF = data.frame(Titanic)`, it would be helpful to answerers to say so... I guess that's not the case (with your "35"), so you should/could provide code to reproduce your data. For guidance on that: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/28481250#28481250 – Frank Oct 15 '18 at 21:17

2 Answers

Could you please check the data types in your table, e.g., is Freq really numeric? With the example data below, your code works for me. As a side note, please do not post your data as an image; post the output of dput(data) instead, which makes it easier for others to import your data and check its structure. You might edit your question accordingly.
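To make the type check concrete, here is a minimal sketch. It assumes the data resembles the built-in `Titanic` table with `Freq` stored as text (an assumption; the name `df` is only for illustration). It shows how to inspect the column type and why "89" can beat "670" when both are strings.

```r
# Assumption: the data looks like the built-in Titanic table,
# but with Freq stored as character instead of numeric.
df <- as.data.frame(Titanic, stringsAsFactors = FALSE)
df$Freq <- as.character(df$Freq)

# Inspect the column type before comparing values
freq_class <- class(df$Freq)
print(freq_class)

# Strings are compared character by character, so "8" vs "6" decides it
text_cmp <- "89" > "670"
print(text_cmp)

# Numbers are compared by value
num_cmp <- 89 > 670
print(num_cmp)
```

If `class()` reports "character" (or "factor"), comparisons with `>` will not behave like numeric comparisons.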

In any case, I would like to highlight that for the task you describe you should not use a loop but simply subset your table, since looping becomes slow for such tasks on larger data sets. An example is at the end of the code below.

# data.frame() keeps Freq numeric; building the table via cbind() with a
# character column would silently turn Freq into strings
Titanic <- data.frame(Survived = rep("No", 8), Freq = c(1,2,5,0,2,3,1,1), stringsAsFactors = FALSE)
#   Survived Freq
# 1       No    1
# 2       No    2
# 3       No    5
# 4       No    0
# 5       No    2
# 6       No    3
# 7       No    1
# 8       No    1
most_lost <- 0
for (i in 1:dim(Titanic)[1]) {
  if (Titanic$Survived[i] == "No")  {
    if (Titanic$Freq[i] > most_lost) {
      most_lost <- Titanic$Freq[i]
    }
    print(most_lost)
  }
}
# [1] 1
# [1] 2
# [1] 5
# [1] 5
# [1] 5
# [1] 5
# [1] 5
# [1] 5

# Without a loop: subset Freq to the "No" rows and take the maximum
max(Titanic[Titanic$Survived == "No", "Freq"])
# [1] 5
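If Freq turns out to be stored as text, one possible fix is to convert the column before taking the maximum. The sketch below assumes data shaped like the built-in `Titanic` table with a character `Freq` column; `df` is a hypothetical name used only here.

```r
# Assumption: data shaped like the built-in Titanic table,
# with Freq stored as character to mimic the problem.
df <- as.data.frame(Titanic, stringsAsFactors = FALSE)
df$Freq <- as.character(df$Freq)

# Convert the text column to numeric, then subset and take the maximum
df$Freq <- as.numeric(df$Freq)
most_lost <- max(df$Freq[df$Survived == "No"])
print(most_lost)
# [1] 670
```

If the column were a factor rather than character, `as.numeric(as.character(df$Freq))` would be needed, because `as.numeric()` on a factor returns the internal level codes. Entries that cannot be parsed become `NA` with a warning; `max(..., na.rm = TRUE)` skips them.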
Manuel Bickel
    Re not using a loop, if they want the iterative "max so far", there's also `cummax(Titanic$Freq)` – Frank Oct 15 '18 at 21:20
    You're right, Freq was actually strings. My problem was fixed after converting the Freq column to numeric. Subsetting worked too, thanks! – Anonymous Oct 15 '18 at 21:24

If I'm understanding correctly, you don't need a for loop.

max(Titanic$Freq[Titanic$Survived == "No"])

This line is subsetting the Freq column by rows where the Survived column is "No" and then finding the max value of the subsetted Freq column.
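The one-liner can be broken into steps to see what each part does. This sketch assumes a numeric `Freq` column and uses the built-in `Titanic` table as stand-in data (an assumption; `df`, `lost` and `lost_freq` are illustrative names). It also shows `cummax()`, mentioned in a comment above, which gives the running maximum that the original loop was printing.

```r
# Assumption: stand-in data from the built-in Titanic table (Freq is numeric here)
df <- as.data.frame(Titanic, stringsAsFactors = FALSE)

# Step 1: logical vector, TRUE for rows where Survived is "No"
lost <- df$Survived == "No"

# Step 2: keep only the Freq values of those rows
lost_freq <- df$Freq[lost]

# Step 3: largest value among them
largest <- max(lost_freq)
print(largest)

# Running maximum, comparable to what the loop printed at each step
running <- cummax(lost_freq)
print(running)

# The full row that holds the maximum
worst_row <- df[lost, ][which.max(lost_freq), ]
print(worst_row)
```

`which.max()` returns the position of the first maximum, so ties resolve to the earliest matching row.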