-3

I have a dataset that looks the following:

ID1      x1   x2
200      10   NA
200      12   NA
200      13   NA
200      NA   17
200      NA   16
200      NA   19

I would like to remove the part of the column that has NA, then I will have an output like this

ID1   x1   x2
200   10   17
200   12   16
200   13    19

How can I do this in R?

research111
  • 347
  • 5
  • 18
  • Values would always have to shift up, if ID1 is the same. so the values are in the correct columns, but I would like need to replace tha NA in x2 by the values below if ID1 is the same. I'm not sure if that's possible – research111 Feb 12 '16 at 17:50
  • Should the 2 in column x1 be a 13? – talat Feb 12 '16 at 18:09
  • 2
    In case the 2 is there by mistake you could try `library(dplyr); df %>% group_by(ID1) %>% mutate_each(funs(.[order(is.na(.), na.last = FALSE)])) %>% ungroup() %>% filter(rowSums(!is.na(.[-1])) > 0)` – talat Feb 12 '16 at 18:15
  • thanks, this worked! – research111 Feb 12 '16 at 20:09
  • https://stackoverflow.com/questions/4862178/remove-rows-with-nas-missing-values-in-data-frame here is a better linked example in my opinion. It uses `complete.cases`, which is pretty start forward, and better yet, easy to remember – Dave Gruenewald Jul 28 '17 at 15:10
  • @DaveGruenewald The example you have linked to deals with *removing complete rows* from the dataframe. The Q asks to remove `NAs` *columnwise*, i.e., to shift non-NA values upwards. – Uwe Jul 31 '17 at 14:22
  • *If* the number of `NA`s is equal in both columns *and* all `NA`s are assembled at start or end of each column, then `na.trim()` from the `zoo` package could be used: `library(data.table); data.table(DF)[, lapply(.SD, zoo::na.trim), by = ID1]` – Uwe Jul 31 '17 at 14:25

2 Answers2

3

We can do this with data.table. But, we have to understand that the OP provided a minimal example where the NA elements are the same in both the columns 'x1' and 'x2'. If it is not, this will not work.

We convert the 'data.frame' to 'data.table' (setDT(df1)), grouped by 'ID', we loop over the other columns and remove the NA elements with na.omit.

library(data.table)
setDT(df)[, lapply(.SD, na.omit) , by = ID1]
#   ID1 x1 x2
#1: 200 10 17
#2: 200 12 16
#3: 200 13 19
akrun
  • 874,273
  • 37
  • 540
  • 662
1

Base R:

# remove NAs and leave in list of vectors
xclean <- apply(x[2:3],MARGIN = 2,na.omit)

# chance to write code to handle different length vectors

# put in dataframe
data.frame(ID1 = 200,xclean)
ARobertson
  • 2,857
  • 18
  • 24