I have a problem that I was hoping someone could help with.
I have a very large data frame in R (close to 20,000,000 observations) with about 43 columns. In four of those columns, I need to find whether the minimum value, provided it is below 200, occurs more than once in a row. If more than one column in a row shares that minimum, I need to flag the row as TRUE in a new flag column. Please note that these columns include NA values, and NAs should not be used (when NA is present in the columns being compared, the result should be NA).
For simplicity, let's say this is what my data look like:
head(mydata)
t1 a1 a2 a3 a4
34 NA NA NA NA
26 10 15 250 150
34 20 20 100 30
35 5 5 10 5
25 45 100 3 45
31 400 310 500 310
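In case it helps anyone reproduce this, the sample rows above can be read into a data frame as follows. This is only a sketch for the six rows shown, not the real data, and it assumes the columns are plain numeric columns with missing values written as NA.

```r
# Read the six sample rows into a data frame.
# Blank lines in the text are skipped and the string "NA" is read as a missing value.
mydata <- read.table(header = TRUE, text = "
t1 a1 a2 a3 a4
34 NA NA NA NA
26 10 15 250 150
34 20 20 100 30
35 5 5 10 5
25 45 100 3 45
31 400 310 500 310
")

# Inspect the column types and the first rows.
str(mydata)
head(mydata)
```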
The goal is to look at the values in columns a1 to a4 in each row and check whether the minimum value, provided it does not exceed 200, occurs in more than one of those columns. If it does, return TRUE; if not, FALSE.
The expected result would look like this:
head(mydata)
t1 a1 a2 a3 a4 flag
34 NA NA NA NA NA
26 10 15 250 150 FALSE
34 20 20 100 30 TRUE
35 5 5 10 5 TRUE
25 45 100 3 45 FALSE
31 400 310 500 310 FALSE
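To make the rule concrete, here is a minimal sketch of the logic I am after, written with vectorised functions instead of a per-row apply() since the real data has about 20 million rows. The helper name flag_tied_min and its arguments are made up for illustration. It assumes "below 200" means strictly less than 200, and that a row gets NA only when all four columns are NA (NAs in partly filled rows are ignored).

```r
# Hypothetical helper: the name and arguments are illustrative only.
# Assumes the minimum must be strictly below `limit` to count.
flag_tied_min <- function(df, cols = c("a1", "a2", "a3", "a4"), limit = 200) {
  # Row-wise minimum across the chosen columns; with na.rm = TRUE,
  # pmin() ignores NAs and returns NA only when every value in the row is NA.
  row_min <- do.call(pmin, c(df[cols], na.rm = TRUE))

  # For each row, count how many columns equal that minimum (NA cells count as 0).
  n_at_min <- Reduce(`+`, lapply(df[cols], function(x) !is.na(x) & x == row_min))

  # TRUE when the minimum is below the limit and is shared by more than one column.
  flag <- row_min < limit & n_at_min > 1

  # Rows with no usable minimum are reported as NA rather than FALSE.
  flag[is.na(row_min)] <- NA
  flag
}

# The six sample rows from the question.
mydata <- data.frame(
  t1 = c(34, 26, 34, 35, 25, 31),
  a1 = c(NA, 10, 20, 5, 45, 400),
  a2 = c(NA, 15, 20, 5, 100, 310),
  a3 = c(NA, 250, 100, 10, 3, 500),
  a4 = c(NA, 150, 30, 5, 45, 310)
)

mydata$flag <- flag_tied_min(mydata)
print(mydata)
```

If exactly 200 should also count, the comparison would be `<=` instead of `<`. If any NA in a1 to a4 should make the whole row NA, passing na.rm = FALSE to pmin() in this sketch would do that.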
Thank you in advance.