What is the most efficient way to remove empty columns in a datatable in R

Question

I have some import issues with a file that create an empty column at the end, such as:

library(data.table)
library(tidyverse)
MWE <- data.table(var1=c(1,2),var2=c(3,4),var3=c(NA,NA))

Now I can remove it easily, as I know the empty column is the last one:

MWE2 <- MWE[,c(length(MWE)):=NULL]

But I wondered what I would do if I wanted to remove an arbitrary empty column without knowing its position. A quick search here and on the data.table page gave me a lot of examples of how to:

remove empty lines in datatables, through na.omit
remove empty columns in a dataframe, for instance here

But I did not find solutions for removing empty columns in data.tables. What are the options, and which is the fastest?

Here is a relevant post with dplyr solutions: https://stackoverflow.com/questions/49374887/piping-the-removal-of-empty-columns-using-dplyr — LC-datascientist, Nov 09 '20 at 22:21

akrun · Accepted Answer · 2020-11-09T22:24:38.430

6

We could check if all values are NA in a column, get the column name and assign those to NULL

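# names of the columns in which every value is NA
nm1 <- MWE[, names(which(sapply(.SD, function(x) all(is.na(x)))))]
# or
# nm1 <- MWE[, names(which(!colSums(!is.na(.SD))))]
# remove those columns by reference (MWE is modified in place)
MWE[, (nm1) := NULL]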
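As a minimal sketch, the first approach could be wrapped in a small helper. The name `drop_empty_cols` is hypothetical (it is not part of data.table), and the `length()` guard is a conservative choice so the helper does not depend on how `:=` treats an empty vector of column names.

```r
library(data.table)

# Hypothetical helper (not a data.table function):
# drops all-NA columns from dt by reference
drop_empty_cols <- function(dt) {
  # names of the columns in which every value is NA
  nm1 <- names(which(vapply(dt, function(x) all(is.na(x)), logical(1))))
  # only touch dt when there is something to drop
  if (length(nm1) > 0) dt[, (nm1) := NULL]
  invisible(dt)
}

MWE <- data.table(var1 = c(1, 2), var2 = c(3, 4), var3 = c(NA, NA))
drop_empty_cols(MWE)
names(MWE)
# "var1" "var2"
```

Because `:=` works by reference, there is no need to assign the result back to `MWE`.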

Or with Filter

# returns a new data.table without the all-NA columns; MWE itself is unchanged
MWE[, Filter(function(x) any(!is.na(x)), .SD)]

Or using select

library(dplyr)
# where() needs dplyr >= 1.0.0; this also returns a new object
MWE %>%
  select(where(~ any(!is.na(.))))
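The question also asks which option is fastest. One rough way to check this on your own data is to time the by-reference version against the `Filter` version with `system.time`. This is only a sketch: the table `big`, its size, and its column names are made up for illustration, and the timings will depend on your data and machine.

```r
library(data.table)

set.seed(1)
n <- 1e5
# made-up test table: three filled columns and two all-NA columns
big <- data.table(x = rnorm(n), y = rnorm(n), empty1 = NA_real_,
                  z = sample(n), empty2 = NA)

# := modifies by reference, so work on a copy to keep big intact
by_ref <- copy(big)
t_ref <- system.time({
  nm1 <- names(which(sapply(by_ref, function(x) all(is.na(x)))))
  if (length(nm1) > 0) by_ref[, (nm1) := NULL]
})

# Filter builds a new data.table instead of modifying big
t_filter <- system.time(
  filtered <- big[, Filter(function(x) any(!is.na(x)), .SD)]
)

# both approaches should keep the same columns
identical(names(by_ref), names(filtered))

# elapsed seconds for each approach
c(by_reference = t_ref[["elapsed"]], filter = t_filter[["elapsed"]])
```

For more stable numbers, repeat each call several times or use a dedicated benchmarking package.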

edited Nov 09 '20 at 22:24

answered Nov 09 '20 at 22:12

akrun


How do we handle the first version for cases where `nm1` is empty (no columns have all `NA`)? Do we just wrap `MWE[, (nm1) := NULL]` in a check for `length(nm1) > 0`? – Therkel Nov 11 '22 at 09:08
@Therkel It should also work if it returns nothing: `dt1 <- as.data.table(head(iris)); nm1 <- intersect('hello', names(dt1)); dt1[, (nm1) := NULL]` – akrun Nov 11 '22 at 17:11
