I have the following data, a list of data frames in which the last column is only sometimes present, and I'm trying to write an `if` statement around that. When the column is absent, or when it holds an NA value, I want the condition to return TRUE. As my example below shows, this works for position [[1]] but returns logical(0) for position [[2]]. Does anyone know why this happens, and how I can get the statement to evaluate to TRUE or FALSE for position 2 as well?

example data

df
[[1]]
      Score  didbreak   break
1        6    FALSE     <NA>
2        6    FALSE      9
3        5    FALSE     <NA>

[[2]]
      Score  didbreak   
1        3    FALSE     
2        4    FALSE      

[[3]]
      Score  didbreak   
1        9    FALSE     
2        8    FALSE      
3        8    FALSE     

The if statement used (the part that's an issue):

if(nrow(df[[xx]])==3 & !is.na(df[[xx]]$Score[3]) & (is.na(df[[xx]]$break[3]) | is.null(df[[xx]]$break[3]))) { }

This works for position [[1]], where row 3 of the break column is NA, but for position [[2]] it returns logical(0) because there is no break column. The is.null part works fine, however.

(Ignore the brief names etc.; I've edited it down here.)

Konrad Rudolph
Joe
  • You should generally never use `&` and `|` inside an `if` statement; this is always an error. See https://stackoverflow.com/a/22251946/1968. – Konrad Rudolph Mar 01 '23 at 12:36
  • Can you post that data in `dput` format? Please edit the question with the output of `dput(df)`. – Rui Barradas Mar 01 '23 at 12:41
  • What is your desired output? From your example 3 x TRUE? or list(c(TRUE, FALSE, TRUE), c(FALSE, FALSE), c(FALSE, FALSE, FALSE))? (or something else?) – Merijn van Tilborg Mar 01 '23 at 13:09
  • Your code is relying on indexing vectors beyond their dimension (`v[3]` from v of length 2). Is there a way to check the `nrow` of the dataframe? This might be a lot cleaner. – Arthur Mar 01 '23 at 13:26

1 Answer

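Following @Konrad Rudolph's comment, you'll want to use `&&`/`||` rather than `&`/`|` inside `if` (see "What is the difference between || and | in R?"). `if` needs a single TRUE or FALSE, and the vectorised `&`/`|` return logical(0) as soon as one operand has length zero, which is what `is.na()` gives for the missing column.

For example: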

lapply(example, \(df) {
  # `&&` stops at the first FALSE, so row 3 is only inspected when there are three rows
  if (nrow(df) == 3 && !is.na(df$Score[3]) &&
      (is.na(df$break1[3]) || is.null(df$break1[3]))) "Conditions satisfied"
})

Output:

[[1]]
[1] "Conditions satisfied"

[[2]]
NULL

[[3]]
[1] "Conditions satisfied"
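
To see where the logical(0) in the original `&`/`|` version comes from, here is a minimal sketch on a stand-in for element [[2]]. The names `df2`, `missing_col` and `vectorised` are only for illustration and are not from the question.

```r
# Stand-in for element [[2]]: there is no break1 column.
df2 <- data.frame(Score = c(3, 4), didbreak = c(FALSE, FALSE))

# A missing column comes back as NULL, and indexing NULL still gives NULL.
missing_col <- df2$break1[3]
is.null(missing_col)   # TRUE
is.na(missing_col)     # logical(0): zero-length, neither TRUE nor FALSE

# The vectorised `|` keeps that zero length, so the combined test has length 0.
vectorised <- is.na(missing_col) | is.null(missing_col)
length(vectorised)     # 0
```

Passing a zero-length condition like this to `if` is what produces the failure described in the question.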

Data:

example <- list(data.frame(Score = c(6, 6, 5), didbreak = c(FALSE, FALSE, FALSE), break1 = c(NA, 9, NA)),
                data.frame(Score = c(3, 4), didbreak = c(FALSE, FALSE)),
                data.frame(Score = c(9, 8, 8), didbreak = c(FALSE, FALSE, FALSE)))

Because `break` is a reserved word in R, I've used break1 as the column name to avoid confusion.
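
If you would rather not depend on how `is.na()` behaves on a missing column, one option is to test for the column's existence first. The following is a sketch under that assumption; `missing_or_na` is a hypothetical helper name, not something from base R or from the question.

```r
# Hypothetical helper: TRUE when `col` is absent from `df` or is NA in row `row`.
missing_or_na <- function(df, col, row) {
  # `||` short-circuits, so is.na() only runs when the column exists.
  !(col %in% names(df)) || is.na(df[[col]][row])
}

example <- list(
  data.frame(Score = c(6, 6, 5), didbreak = c(FALSE, FALSE, FALSE), break1 = c(NA, 9, NA)),
  data.frame(Score = c(3, 4), didbreak = c(FALSE, FALSE)),
  data.frame(Score = c(9, 8, 8), didbreak = c(FALSE, FALSE, FALSE))
)

# One TRUE/FALSE per data frame; vapply() enforces a length-one logical result.
result <- vapply(example, function(df) {
  nrow(df) == 3 && !is.na(df$Score[3]) && missing_or_na(df, "break1", 3)
}, logical(1))

result
# [1]  TRUE FALSE  TRUE
```

Using `vapply(..., logical(1))` instead of `lapply()` means a zero-length result would raise an error immediately rather than passing through unnoticed.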

harre