I have a data frame of ecological data in which some entries fall below what chemists call the LOQ (limit of quantification). These measurements are reported as "less than LOQ", for example "<1". I want to replace each of these values with half of its LOQ. I could probably find code to remove the "<", but then I would no longer know which entries to divide by 2.

#creating df 
x1 <- c(1,2,"<1")
x2 <- c(3,"<4",3)
x3 <- c(1,2,3)
df <- data.frame(x1,x2,x3)
df

x1 x2 x3
1  1  3  1
2  2 <4  2
3 <1  3  3

I want the results to be as:

##### result #######
x1 <- c(1,2,0.5)
x2 <- c(3,2,3)
x3 <- c(1,2,3)
result <- data.frame(x1,x2,x3)

   x1 x2 x3
1 1.0  3  1
2 2.0  2  2
3 0.5  3  3

So, basically, the < sign is stripped and the value that follows it is divided by 2, while all other values stay unchanged. Any ideas on how to do this?

Yung Gud

5 Answers

Use the fact that a matrix can be indexed in either one or two dimensions.

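m <- as.matrix(df)                             # character matrix
isLT <- function(t) substr(t, 1, 1) == '<'
islt <- which(isLT(m))                         # linear indices of the "<" cells
# nchar(x), not length(x): keep every character after the "<"
delLT <- function(x) substr(x, 2, nchar(x))
m[islt] <- delLT(m[islt])
mode(m) <- 'numeric'
m[islt] <- m[islt] / 2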
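
To illustrate the indexing idea this answer relies on, here is a minimal sketch. The matrix `m_demo` and the names `is_lt` and `idx` are made up for the illustration and are not part of the answer. It shows that `which()` returns linear, column-major positions by default, and that the same cells can also be addressed by row and column.

```r
# Hypothetical 3 x 2 character matrix, filled column by column.
m_demo <- matrix(c("1", "2", "<1", "3", "<4", "3"), nrow = 3)

# Logical matrix marking the cells that start with "<".
is_lt <- substr(m_demo, 1, 1) == "<"
dim(is_lt) <- dim(m_demo)  # make sure the matrix shape is kept

# Linear (one-dimensional) positions of the flagged cells.
idx <- which(is_lt)

# The same cells as row/column pairs (two-dimensional form).
rc <- which(is_lt, arr.ind = TRUE)

# Both forms address the same elements.
m_demo[idx]
m_demo[rc]
```

Because a single linear index vector is enough to address cells in any column, the answer can strip the "<" and halve the values without looping over columns.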

Using base.


    x1 <- c(1,2,"<1")
    x2 <- c(3,"<4",3)
    x3 <- c(1,2,3)
    df <- data.frame(x1,x2,x3, stringsAsFactors = FALSE)  # Important: keep strings as character, not factor

    extract_n_divide <- function(x) {
      # Strip a leading "<" and convert; as.numeric() also handles decimals such as "<0.5"
      extract_number <- as.numeric(sub("^<", "", x))
      # Halve only the entries that were flagged with "<"; return numbers in both branches
      ifelse(grepl("^<", x), extract_number/2, extract_number)
    }

    as.data.frame(lapply(df, extract_n_divide))

fran496

Another solution using tidyverse:

library(tidyverse)

x1 <- c(1,2,"<1")
x2 <- c(3,"<4",3)
x3 <- c(1,2,3)
df <- data.frame(x1,x2,x3)


mutate_LOQ <- function(x){
  x <- as.character(x)
  case_when(
    # nchar(x), not length(x): keep every character after the "<"
    substr(x, 1, 1) == '<' ~ as.numeric(substr(x, 2, nchar(x)))/2,
    TRUE ~ as.numeric(x)
  )
}

df %>% 
  mutate_all(mutate_LOQ)

Regards Paweł

Pawel Stradowski
  • Thanks! used this and it works, my df is much larger and with different data types and NA:s so I just updated the last line to apply the function to a subset of the columns. – Yung Gud Aug 19 '19 at 15:56
  • I suspected you have more columns, but for prototyping mutate_all just works :-) For your final solution, you probably need mutate_if, or just explicitly mutate the needed columns. – Pawel Stradowski Aug 19 '19 at 17:34
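
As a rough sketch of the "only some columns" case discussed in the comments above, the following uses base R rather than mutate_if. The data frame `df2`, the helper `half_loq`, and the vector `loq_cols` are hypothetical names invented for this illustration, and the helper assumes that NA entries should simply stay NA.

```r
# Hypothetical data: a label column, a measurement column with an NA,
# and a column that is already numeric.
df2 <- data.frame(
  site = c("a", "b", "c"),
  x1   = c("1", "<1", NA),
  x3   = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: halve "<value" entries, convert the rest as-is.
half_loq <- function(x) {
  x <- as.character(x)
  below <- !is.na(x) & startsWith(x, "<")  # NA entries are never flagged
  out <- as.numeric(sub("^<", "", x))      # strip "<", then convert
  out[below] <- out[below] / 2
  out
}

# Only touch the columns known to hold measurements.
loq_cols <- c("x1", "x3")
df2[loq_cols] <- lapply(df2[loq_cols], half_loq)
df2
```

Listing the measurement columns explicitly leaves label columns such as `site` untouched, which is the same goal the comments describe for mutate_if or an explicit mutate.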

Here is a one-liner:

df[] <- lapply(df, function(x) sapply(parse(text = sub("^<(.*)","\\1/2", x)), eval))
df
#>    x1 x2 x3
#> 1 1.0  3  1
#> 2 2.0  2  2
#> 3 0.5  3  3
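
To make the one-liner above easier to follow, here is a small sketch of its two steps on a single column. The names `x`, `exprs`, and `vals` are chosen for illustration only. The `sub()` call rewrites "<1" into the text "1/2", and `parse()` plus `eval()` then evaluate each string as R code.

```r
x <- c("1", "2", "<1")

# Step 1: turn "<value" into the text "value/2"; other entries are unchanged.
exprs <- sub("^<(.*)", "\\1/2", x)
exprs

# Step 2: parse each string as an R expression and evaluate it.
vals <- sapply(parse(text = exprs), eval)
vals
```

Note that `eval(parse(...))` runs whatever text is in the column, so this approach is only reasonable when the data are trusted.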

And a more verbose but possibly more efficient solution:

mat   <- as.matrix(df)
ind   <- startsWith(mat, "<")
mat   <- sub("^<","",mat)
mode(mat) <- "numeric"
mat[ind] <- mat[ind]/2
df <- as.data.frame(mat)
df
#>    x1 x2 x3
#> 1 1.0  3  1
#> 2 2.0  2  2
#> 3 0.5  3  3
moodymudskipper

This would be a tidyverse solution to your problem:

library(tidyverse)
x1 <- c(1,2,"<1")
x2 <- c(3,"<4",3)
x3 <- c(1,2,3)
df <- tibble(x1,x2,x3)

vec_loq <- function(vec){
  s <- str_detect(vec, "<|>")
  vec[s] <- vec[s] %>% 
    str_remove("<|>") %>% 
    as.numeric() %>% 
    {. / 2}
  as.numeric(vec)
}

map_dfc(df, vec_loq)
shs
  • Beware of mixing data types: the vectors are strings, and replacing parts of them will not change the type (meaning the resulting column vectors will still be strings, not numbers). – Konrad Rudolph Aug 19 '19 at 13:29
  • You're right. It is better to convert the variables to numeric. I have edited the code accordingly – shs Aug 19 '19 at 13:34
  • Thanks, this works well for the example df but got the error "object cannot be coerced to type 'double'" So I went with Pavel's solution below. – Yung Gud Aug 19 '19 at 15:58
  • That's interesting. It works on my device. Did it fail on the example or on your own data? I tried to modify the example data to replicate your error, but didn't manage to. The only things that I could get was `NA`s when I introduced other characters or a list column. Do you know what kind of variable it was that it failed on? – shs Aug 19 '19 at 16:14