Why does nrow/NROW not give a result without a decimal expression in R?

I tried to count the rows that meet a condition in R, but nrow/NROW did not give me a result.

I also tried the summarise approach, but with that I have to write the threshold with decimal places.

I rounded the column using df$WAGE_RATE <- round(df$WAGE_RATE, digits = 0)

> class(h1b$WAGE_RATE)
"numeric"


> nrow(df$WAGE_RATE < '1000000')
NULL


> nrow(df$WAGE_RATE < '1000000.00')
NULL


> summarise(df, ct = sum(as.numeric(WAGE_RATE < '100000')))
# A tibble: 1 x 1
     ct
  <dbl>
1     0


> summarise(df, ct = sum(as.numeric(WAGE_RATE < '100000.00')))
# A tibble: 1 x 1
     ct
  <dbl>
1  9052
M--
Gourav

1 Answer

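First of all, if you are dealing with a numeric column, you should not compare it to a character string (e.g. '1000000'). When one side of the comparison is a character, R converts the numeric values to strings and compares them as text, so '100000' and '100000.00' are different thresholds and can give different counts. Use an unquoted number such as 1000000 instead.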
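
To illustrate why the quotes matter, here is a minimal sketch with a made-up value (not the asker's data). The variable names are hypothetical and chosen only for this example.

```r
# Hypothetical value, for illustration only
wage <- 5

# Numeric vs. numeric: compared as numbers
num_cmp <- wage < 10                   # TRUE

# Numeric vs. character: 5 is coerced to "5" and compared as text;
# "5" sorts after "10" because "5" comes after "1"
chr_cmp <- wage < "10"                 # FALSE

# If the threshold arrives as text, convert it before comparing
fixed_cmp <- wage < as.numeric("10")   # TRUE

print(c(num_cmp, chr_cmp, fixed_cmp))
```

The same text-based ordering applies to every element of a column, which is why a quoted threshold can silently return a wrong count instead of an error.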

Second, when you do a comparison, you get a logical vector of TRUE/FALSE values, one per row. Look at the example below:

mtcars$mpg < 22
 # [1]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE
 # [12] TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE  
 # [23] TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE

Because TRUE is treated as 1 and FALSE as 0 in arithmetic, summing that vector gives the number of rows that satisfy the condition (if you count the TRUE entries above, you'll see 23 of them).

sum(mtcars$mpg < 22)
 # [1] 23
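
Applied to a column like the one in the question, this might look as follows. This is only a sketch: the data frame below is invented, with the column name borrowed from the question, and it assumes missing wages should be ignored.

```r
# Hypothetical data frame; column name from the question, values invented
df <- data.frame(WAGE_RATE = c(50000, 120000, 99999, NA, 100000))

# Compare against a number, not a quoted string
below <- df$WAGE_RATE < 100000   # TRUE FALSE TRUE NA FALSE

# Count the rows that satisfy the condition; na.rm = TRUE skips missing wages
n_below <- sum(below, na.rm = TRUE)

# The mean of a logical vector is the share of TRUE values among non-missing rows
share_below <- mean(below, na.rm = TRUE)

print(n_below)
print(share_below)
```

Without na.rm = TRUE, a single NA in the column would make the sum itself NA.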

But if you want to use nrow, you need to provide it with a data.frame; called on a plain vector, such as the result of a comparison, it returns NULL, which is what you saw. For that, you need to subset your data based on the condition. You can read more about that here: Extract a subset of a dataframe based on a condition involving a field. Here is the solution:

nrow(mtcars[mtcars$mpg < 22,])
 # [1] 23
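
As a rough sketch of how nrow, NROW and sum differ on a comparison result, and of one pitfall with missing values when subsetting, consider the following. The small data frame and its id column are invented for illustration.

```r
# A comparison yields a plain logical vector, which has no dim attribute
cond <- mtcars$mpg < 22

print(nrow(cond))   # NULL, because nrow() returns dim(x)[1]
print(NROW(cond))   # the vector's length (32), not the number of TRUEs
print(sum(cond))    # the number of TRUEs (23)

# Hypothetical data with a missing wage (values invented)
df <- data.frame(id = 1:3, WAGE_RATE = c(50000, 120000, NA))

# Logical indexing keeps an all-NA row wherever the condition is NA
n_index <- nrow(df[df$WAGE_RATE < 100000, ])

# which() and subset() both drop the NA case
n_which  <- nrow(df[which(df$WAGE_RATE < 100000), ])
n_subset <- nrow(subset(df, WAGE_RATE < 100000))

print(c(n_index, n_which, n_subset))
```

If the column can contain NA, which() or subset() is the safer choice when counting with nrow.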
M--