
I have a vector of values and a data frame. I can look up each element of the vector in a specific column of the data frame with the following command:

lapply(l, function(x) df[which(df$col1 == x), "col2"])

How can I get NA for values that do not occur in my data frame?

For example:

df:   col1  col2
      1     a
      1     b
      2     c

l=c(1,3)

output:  col1   col2
         1      a,b
         3      NA
David Arenburg
Zaynab
  • Not very clear to me. Post some inputs and the desired output. – nicola May 09 '18 at 07:20
  • Please provide example data for your `l` and `df` objects as well as the expected output. – LAP May 09 '18 at 07:20
  • When I use `unlist` on the output of this function, all `character(0)` values are removed. I want to get NA for these values after using `unlist`. – Zaynab May 09 '18 at 07:22
  • Please provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – zx8754 May 09 '18 at 07:32
  • example is provided – Zaynab May 09 '18 at 07:42
  • You are probably coming from Python. A `list` in Python and a `list` in R are different things. In your case you have a numeric vector. Also, `list` is a bad name for it – David Arenburg May 09 '18 at 08:00

3 Answers


Using data.table, you can do this efficiently by running a binary join on `l` (your vector):

library(data.table)
setDT(df)[.(l), # join between `df` & `l`
          on = .(col1), # using `col1`
          .(col2 = toString(col2)), # paste the values in `col2` (you can add `unique`)
          by = .EACHI] # do this per each value in `l`
#    col1 col2
# 1:    1 a, b
# 2:    3   NA
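
For comparison, the same left-join idea can be sketched in base R. This is only an illustration under assumed example data that mirrors the question, not part of the original answer. One detail worth knowing: `toString(NA)` returns the string "NA", so a join that collapses values this way likely yields text rather than a missing value for unmatched keys, whereas `merge(..., all.x = TRUE)` fills unmatched rows with a real NA.

```r
# Assumed example data, mirroring the question
df <- data.frame(col1 = c(1, 1, 2), col2 = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
l <- c(1, 3)

# Collapse col2 per key first, then left-join the lookup keys onto the result
agg <- aggregate(col2 ~ col1, data = df, FUN = toString)
res <- merge(data.frame(col1 = l), agg, by = "col1", all.x = TRUE)

# TRUE only for keys that had no match in df
is.na(res$col2)
```

Aggregating before the join keeps one row per key, so the unmatched key in `l` simply has nothing to join against.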
David Arenburg

DATA:

df <- structure(list(col1 = c(1L, 1L, 2L), col2 = c("a", "b", "c")), .Names = c("col1","col2"), class = "data.frame", row.names = c(NA, -3L))
l <- c(1, 3)

CODE:

library(magrittr)
lapply(l, function(x){
    res<-df[[2]][df[[1]]==x] %>% paste(collapse=",")
    if(res=="") res = NA
    return(cbind(x,res))
    }) %>% do.call(rbind,.)

Result:

     x   res  
[1,] "1" "a,b"
[2,] "3" NA  
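
Because `cbind` on mixed types produces a character matrix, the key above comes back as "1" rather than 1. A possible variant, sketched here with a hypothetical helper name `collapse_or_na`, builds a data frame instead and uses `vapply` with `NA_character_` so the column types stay predictable:

```r
# Assumed example data, same as in the DATA section above
df <- data.frame(col1 = c(1L, 1L, 2L), col2 = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
l <- c(1, 3)

# Hypothetical helper: collapse the matches for one key, or return a typed NA
collapse_or_na <- function(x) {
  vals <- df$col2[df$col1 == x]
  if (length(vals) == 0) NA_character_ else paste(vals, collapse = ",")
}

# vapply guarantees exactly one character value per key
res <- data.frame(col1 = l,
                  col2 = vapply(l, collapse_or_na, character(1)),
                  stringsAsFactors = FALSE)
res
```

Checking `length(vals) == 0` before pasting avoids comparing the result against an empty string afterwards.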
Andre Elrico

A function that returns TRUE if its argument is not a zero-length vector such as integer(0) or character(0) (what these have in common is that their length is zero):

non.zero.vec <- function(x) length(x) > 0

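Any list containing such zero-length elements can be converted to a vector with NA in their place using

zero2na <- function(vec) sapply(vec, function(x) ifelse(non.zero.vec(x), x, NA))

## e.g. (a list is needed here, because c() silently drops zero-length elements)
zero2na(list(1, 2, integer(0))) ## [1]  1  2 NA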
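
Applied to the question's own `lapply` result, the idea might look like the following sketch. The data is assumed from the question, and `hits` is a name introduced here for illustration. It shows why `unlist` loses the unmatched key and one way to keep a placeholder for it:

```r
# Assumed example data, mirroring the question
df <- data.frame(col1 = c(1, 1, 2), col2 = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
l <- c(1, 3)

# One list element per key; the unmatched key yields character(0)
hits <- lapply(l, function(x) df$col2[df$col1 == x])

# Zero-length elements contribute nothing when flattened
flat_before <- unlist(hits)

# Replace every zero-length element with a typed NA, then flatten again
hits[lengths(hits) == 0] <- NA_character_
flat_after <- unlist(hits)
```

After the replacement, `flat_after` has one more element than `flat_before`, namely the NA that stands in for the key 3.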

Finally, this function does exactly what you want:

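lookup <- function(df, key.col, val.col, keys) {
  # row indices matching each key (integer(0) when a key is absent)
  idxs <- lapply(keys, function(x) which(df[, key.col] == x))
  # matching values, or a single NA when nothing matched
  lookups <- lapply(idxs, function(vec) if (length(vec) > 0) df[vec, val.col] else NA)
  # collapse each match set to one string; check the length first so that
  # `if` always receives a single TRUE/FALSE (a longer condition is an error in recent R)
  lookupstrings <- unlist(lapply(lookups,
    function(v) if (length(v) == 1 && is.na(v)) "NA" else paste(v, collapse = ", ")))
  res.df <- data.frame(unlist(keys), lookupstrings)
  colnames(res.df) <- c(key.col, val.col)
  res.df
}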

df <- data.frame(col1 = c(1,1,2), col2 = c("a", "b", "c"))
lookup(df, "col1", "col2", c(1, 2, 3))

## output:

  col1 col2
1    1 a, b
2    2    c
3    3   NA
Gwang-Jin Kim