
In this example I need to drop all rows with NA values. I tried:

drop <- is.na(df[,c(3,4,5)])

Error in df[, c(3, 4, 5)] : incorrect number of dimensions

My data frame has 5 columns.

I am not trying to select columns by column name.

I also tried:

df[complete.cases(df[ , 3:5]),]

This gives the same error: incorrect number of dimensions.

Youssef
  • Please provide a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) so that we can understand what your error is and how to help you. It's quite hard to follow what you are describing without such an example. – Adam Quek Jul 17 '22 at 12:14
  • Please check `class(df)`. I suspect your `df` is a vector, which doesn't have a `dim` attribute. For example, after `df <- c(1:10, NA)`, calling `is.na(df[, c(3, 4, 5)])` gives `Error in df[, c(3, 4, 5)] : incorrect number of dimensions`. If that is the case, you need `is.na(df[3:5])`. – akrun Jul 17 '22 at 17:14

1 Answer


Dropping missing values from vectors

The errors suggest that your data are a vector, not a data.frame. A vector has no rows or columns (it has no dim attribute), so indexing it with [ , ] throws an error. To illustrate, below I create a vector, reproduce the errors, and demonstrate how to drop missing values from it.

# Create vector, show it's a vector
vec <- c(NA,1:4)
vec 
#> [1] NA  1  2  3  4
is.vector(vec)
#> [1] TRUE

# Reproduces your errors for both methods
is.na(vec[ ,2:3])
#> Error in vec[, 2:3]: incorrect number of dimensions

vec[complete.cases(vec[ , 2:3]), ]
#> Error in vec[, 2:3]: incorrect number of dimensions

# Remove missing values from the vector
vec[!is.na(vec)]
#> [1] 1 2 3 4
vec[complete.cases(vec)]
#> [1] 1 2 3 4
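
As a rough sketch of how a vector can end up where a data.frame was expected: selecting a single column with [ , j] drops the result to a plain vector unless drop = FALSE is given. The object names (dat, one_col, kept) are made up for illustration, and whether this is what happened to your df is only an assumption.

```r
# Hypothetical data, used only for illustration
dat <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))

# A data.frame has two dimensions (rows, columns)
dim(dat)

# Selecting one column with [ , j] drops the result to a plain vector
one_col <- dat[, 1]
dim(one_col)   # NULL: indexing one_col[, 1] would fail with
               # "incorrect number of dimensions"

# drop = FALSE keeps the data.frame structure for a single column
kept <- dat[, 1, drop = FALSE]
class(kept)
```

If dim() returns NULL for your object, the [row, column] syntax will not work on it, and the vector approach above is the one to use.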

I'll additionally show you below how to check if your data object is a data.frame and how to omit rows with missing values in case it is.

Create data and check it's a data.frame

# Create an example data.frame
set.seed(123)
N <- 10

df <- data.frame(
  x1 = sample(c(NA_real_, 1, 2, 3), N, replace = TRUE),
  x2 = sample(c(NA_real_, 1, 2, 3), N, replace = TRUE),
  x3 = sample(c(NA_real_, 1, 2, 3), N, replace = TRUE)
)

print(df)
#>    x1 x2 x3
#> 1   2  3 NA
#> 2   2  1  3
#> 3   2  1 NA
#> 4   1 NA NA
#> 5   2  1 NA
#> 6   1  2  2
#> 7   1  3  3
#> 8   1 NA  1
#> 9   2  2  2
#> 10 NA  2  1

# My hunch is that you are not using a data.frame. You can check as follows:
class(df)
#> [1] "data.frame"

Approaches to removing rows with missing values from data.frames

Your first approach returns a logical matrix indicating whether each value in the specified columns is missing. You can then use rowSums() to count the missing values in each row and keep only the rows that have none, as shown below.

# Example: shows whether values are missing for second and third columns
miss <- is.na(df[ ,2:3])
print(miss)
#>          x2    x3
#>  [1,] FALSE  TRUE
#>  [2,] FALSE FALSE
#>  [3,] FALSE  TRUE
#>  [4,]  TRUE  TRUE
#>  [5,] FALSE  TRUE
#>  [6,] FALSE FALSE
#>  [7,] FALSE FALSE
#>  [8,]  TRUE FALSE
#>  [9,] FALSE FALSE
#> [10,] FALSE FALSE

# We can sum all of these values by row (`TRUE` = 1, `FALSE` = 0 in R) and keep only
# those rows that sum to 0 to remove missing values. Notice that the row names 
# retain the original numbering.
df[rowSums(miss) == 0, ]
#>    x1 x2 x3
#> 2   2  1  3
#> 6   1  2  2
#> 7   1  3  3
#> 9   2  2  2
#> 10 NA  2  1

Your second approach is to use complete.cases. This also works and produces the same result as the first approach.

# Keep only rows with no missing values in the second and third columns
complete_rows <- df[complete.cases(df[ ,2:3]), ]
complete_rows
#>    x1 x2 x3
#> 2   2  1  3
#> 6   1  2  2
#> 7   1  3  3
#> 9   2  2  2
#> 10 NA  2  1
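
The two ideas above (check that the object is a data.frame, then filter with complete.cases) can be combined in a small helper. This is only a sketch: drop_na_rows and example_df are hypothetical names invented here, not part of base R or of your code.

```r
# Hypothetical helper (not part of base R): keep rows with no NA in `cols`
drop_na_rows <- function(data, cols = seq_along(data)) {
  # Fail early with a clear message instead of "incorrect number of dimensions"
  if (!is.data.frame(data)) {
    stop("`data` must be a data.frame, not an object of class ", class(data)[1])
  }
  # drop = FALSE keeps a data.frame even when a single column is selected
  keep <- complete.cases(data[, cols, drop = FALSE])
  data[keep, , drop = FALSE]
}

# Small example data, made up for this sketch
example_df <- data.frame(
  id = 1:4,
  x  = c(1, NA, 3, 4),
  y  = c("a", "b", NA, "d")
)

# Rows 2 and 3 each have an NA in columns 2:3, so rows 1 and 4 are kept
drop_na_rows(example_df, 2:3)
```

Calling the helper on a vector stops with an explicit message about the class, which is easier to diagnose than the dimension error.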

A third approach is to use na.omit(). However, it drops every row that has a missing value in any column and does not let you specify columns, so use complete.cases instead if you need to filter on specific columns.

na.omit(df)
#>   x1 x2 x3
#> 2  2  1  3
#> 6  1  2  2
#> 7  1  3  3
#> 9  2  2  2
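
If you want to see which rows na.omit() removed, the result carries an "na.action" attribute recording them. A minimal sketch, using a made-up example_df rather than the data above:

```r
# Small example data, made up for this sketch
example_df <- data.frame(x = c(1, NA, 3), y = c("a", "b", NA))

cleaned <- na.omit(example_df)

# The removed rows are stored on the result as the "na.action" attribute
dropped <- attr(cleaned, "na.action")
as.integer(dropped)   # positions of the rows that were removed
```

This is handy for checking how much data was lost before carrying on with the cleaned object.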

A fourth approach is to use drop_na() from the tidyr package. Its appeal is that you can select columns by index as well as by unquoted column name. Unlike the approaches above, it also resets the row names.

library(tidyr)
drop_na(df, 2:3)
#>   x1 x2 x3
#> 1  2  1  3
#> 2  1  2  2
#> 3  1  3  3
#> 4  2  2  2
#> 5 NA  2  1
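
For completeness, here is a sketch of the same call using unquoted column names instead of indices. The example_df data are made up, and the requireNamespace() guard is only there so the snippet does nothing if tidyr is not installed.

```r
# Small example data, made up for this sketch
example_df <- data.frame(
  id = 1:4,
  x  = c(1, NA, 3, 4),
  y  = c("a", "b", NA, "d")
)

if (requireNamespace("tidyr", quietly = TRUE)) {
  # Select the columns to check by unquoted name ...
  by_name <- tidyr::drop_na(example_df, x, y)
  # ... or by position; both select the same columns here
  by_index <- tidyr::drop_na(example_df, 2:3)
  print(by_name)
}
```

Names tend to be safer than positions if the column order of your data might change.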
socialscientist