2

I have a data frame like below:

how do I remove na and use below value to go up?

Thanks

  id name.america name.europe name.asia
   1            a        <NA>      <NA>
   2         <NA>           b      <NA>
   3         <NA>        <NA>         c
   4            d        <NA>      <NA>

Change to:

  id name.america name.europe name.asia
   1            a           b         c 
   2            d           
  • 1
    how did your data end up in this format? It feels odd to take a value for id=2, and move it to id=1. Id's should be your identifier – Jonny Phelps Dec 05 '18 at 16:56

3 Answers3

3

We can loop through the columns and remove the NA, then make the lengths of the list elements same by appending NA at the end after getting the max length of the list element. Based on that, subset the 'id' column of the dataset and append with the output

lst <- lapply(df1[-1], na.omit)
lst1 <- lapply(lst, `length<-`, max(lengths(lst)))
out <- data.frame(lst1)
out1 <- cbind(id = df1$id[seq_len(nrow(out))], out)
out1
#  id name.america name.europe name.asia
#1  1            a           b         c
#2  2            d        <NA>      <NA>

If we need NA to be changed to blanks ("") - not recommended

out1[is.na(out1)] <- ""

data

df1 <- structure(list(id = 1:4, name.america = c("a", NA, NA, "d"), 
name.europe = c(NA, "b", NA, NA), name.asia = c(NA, NA, "c", 
NA)), class = "data.frame", row.names = c(NA, -4L))
akrun
  • 874,273
  • 37
  • 540
  • 662
2

tidyverse-based solution

require(tidyverse)

df1 %>% 
    gather(key = "name", value = "val", -id) %>% 
    na.omit() %>% 
    select(-id) %>%
    group_by(name) %>% 
    mutate(id = 1:n()) %>% 
    spread(key = name, value = val)

Results

# A tibble: 2 x 4
     id name.america name.asia name.europe
  <int> <chr>        <chr>     <chr>      
1     1 a            c         b          
2     2 d            NA        NA  

Notes

  • If desired you can re-order columns with select or that variable prior to transformation.
  • NAs are left as such. If desired, you can use tidyr::replace_na to insert some string or space. I would discourage you from doing that.

Data

Taken from @akrun's answer above.

df1 <- structure(
    list(
        id = 1:4,
        name.america = c("a", NA, NA, "d"),
        name.europe = c(NA, "b", NA, NA),
        name.asia = c(NA, NA, "c",
                      NA)
    ),
    class = "data.frame",
    row.names = c(NA, -4L)
)
Konrad
  • 17,740
  • 16
  • 106
  • 167
0
df1[, -1] <- lapply(df1[,-1], function(x) c(na.omit(x), rep("",length(x)-length(na.omit(x)))))
df1[1:max(colSums(!(df1[,-1]==""))),]

#  id name.america name.europe name.asia
#1  1            a           b         c
#2  2            d                      
Andre Elrico
  • 10,956
  • 6
  • 50
  • 69