How to remove empty columns in R?

Question

I have a CSV file

Identity,Number,Data,Result,Add,,,,,,,,,,,,
1,,,,4,55,,92,,,,,,,,,62,
3,,,,7,43,,12,,,,,,,,,74,
7,,,,3,58,,52,,,,,,,,,64,
0,,,,6,10,,22,,,,,,,,,96,
3,,,,8,13,,92,,,,,,,,,22,

Desired Output

Identity,Number,Data,Result,Add
1,4,55,92,62
3,7,43,12,74
7,3,58,52,64
0,6,10,22,96
3,8,13,92,22

americo · Answer 1 · 2014-02-28T21:58:09.347

15

After you've imported your data (using the method the other answerer suggested) run this command, substituting mydf for whatever you decide to call your data frame:

#Remove empty columns
mydf <- Filter(function(x)!all(is.na(x)), mydf)

edited Feb 28 '14 at 21:58

answered Feb 28 '14 at 21:26

americo

1,013
8
17

For some reasons, this does not work with the example from this post https://www.codingprof.com/3-easy-ways-to-remove-empty-columns-in-r – Tung Jun 30 '22 at 14:35

score 3 · Answer 2 · answered Nov 24 '14 at 05:42

3

Convert the empty cell to NA and then remove the NAs

mydata[mydata == ""] <- NA
mydata<- mydata[,colSums(is.na(mydata)) < <a desired filter value>]

answered Nov 24 '14 at 05:42

Pramit

1,373
1
18
27

score 2 · Answer 3 · answered Oct 23 '17 at 17:41

2

A more complex removal logic is like this:

mydf <- Filter(function(x)!all(is.na(x) || is.null(x) || x == "" || x == 0), mydf)

answered Oct 23 '17 at 17:41

R Keene

71
2

score 2 · Answer 4 · answered Nov 29 '18 at 21:39

Try remove_empty from janitor

Using the input from Thomas's post:

> mydf

  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18
1  1 NA NA NA  4 55 NA 92 NA  NA  NA  NA  NA  NA  NA  NA  62  NA
2  3 NA NA NA  7 43 NA 12 NA  NA  NA  NA  NA  NA  NA  NA  74  NA
3  7 NA NA NA  3 58 NA 52 NA  NA  NA  NA  NA  NA  NA  NA  64  NA
4  0 NA NA NA  6 10 NA 22 NA  NA  NA  NA  NA  NA  NA  NA  96  NA
5  3 NA NA NA  8 13 NA 92 NA  NA  NA  NA  NA  NA  NA  NA  22  NA

> mydf %>% remove_empty("cols")

1  1  4 55 92  62
2  3  7 43 12  74
3  7  3 58 52  64
4  0  6 10 22  96
5  3  8 13 92  22

score 1 · Answer 5 · answered Apr 25 '16 at 01:34

1

If we have limited columns like you have in this case, below would be neat workaround.

myDF<-myDF[,c(1,2,3,4,5)]

answered Apr 25 '16 at 01:34

Chirag

1,478
16
20

score 0 · Answer 6 · answered Feb 28 '14 at 20:18

Load the data file, excluding the header row:

mydf <- 
read.csv(text='Identity,Number,Data,Result,Add
1,,,,4,55,,92,,,,,,,,,62,
3,,,,7,43,,12,,,,,,,,,74,
7,,,,3,58,,52,,,,,,,,,64,
0,,,,6,10,,22,,,,,,,,,96,
3,,,,8,13,,92,,,,,,,,,22,', header=FALSE, skip=1)

Then load just the header row:

n <- read.csv(text='Identity,Number,Data,Result,Add', header=TRUE)

Then use the names from n to assign names to the full data, excluding NA columns:

setNames(mydf[,sapply(mydf, function(x) all(!is.na(x)))],names(n))

Note, if you have this saved as a file, the full code should be:

mydf <- read.csv('file.csv', header=FALSE, skip=1)
n <- read.csv('file.csv', header=TRUE, nrows=1)
setNames(mydf[,sapply(mydf, function(x) all(!is.na(x)))],names(n))

I have edited my question, now there are equal number of commas in the header as well, so basically it's being read as 16 cols instead of 5, how to solve the problem, and thanks for the help. — user3188390, Feb 28 '14 at 20:38

Clarius · Answer 7 · 2017-03-19T22:21:58.463

0

How about you keep only those columns which name is not blank?

> mydf <- mydf[,names(mydf)!=""]

edited Mar 19 '17 at 22:21

answered Mar 19 '17 at 19:21

Clarius

1,183
10
10

score 0 · Answer 8 · answered May 15 '23 at 19:18

Here's a fancy function I sometimes use for fun. Note that it is not robust - it searches the complete environment for a .data data set, which dplyr puts there if running select() so it depends on their internal framework (which might change):

empty_cols <- function(x) {
  for (frm in sys.frames()) {
    if (is.data.frame(frm$.data)) {
      df <- frm$.data
      return(which(sapply(df, function(col) all(is.na(col)))))
    }
  }
  # found nothing, return NULL to return no columns:
  NULL
}

Now select using:

library(dplyr)
my_df %>% select(!empty_cols())

Would've been great if we could just use dplyr::pick(everything()) in this function, but they have limited the use to mutate(), filter() and group_by()...

How to remove empty columns in R?

8 Answers8

Linked