4

I would like to know whether an equals sign (=) is permitted inside the recodes argument of the recode function in the car package.

For instance, the following fails:

library(car)
n <- c(0, 10, 20, 21, 60, 70)
r <- recode(n, " 0:20 = '<= 20' ; 20:70 = '> 20' ")
# Error in recode(n, " 0:20 = '<= 20' ; 20:70 = '> 20' ") : 
# in recode term:  0:20 = '<= 20' 
# message: Error in parse(text = strsplit(term, "=")[[1]][2]) : 
#  <text>:1:2: unexpected INCOMPLETE_STRING
# 1:  '<
# ^

Removing the = from <= 20 works fine:

r <- recode(n, " 0:20 = '< 20' ; 20:70 = '> 20' ")
table(r)
# r
# < 20 > 20 
#    3    3 

Because I'm taking the recodes argument as user input, I'm hoping for a solution that does not require explicit escape characters, as that would be burdensome for users.

I'm running R version 3.2.3 (2015-12-10) -- "Wooden Christmas-Tree"

Ben Bolker

3 Answers

2

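car::recode is always going to be a pain here: it parses the recode string by splitting each term on "=" (see the strsplit(term, "=") call in your error message), so it breaks if a "spurious" equals sign appears anywhere in a term, even inside a quoted label.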
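
To see why the quoted label does not protect the equals sign, here is a minimal base-R sketch of the split that the error message reports. It only mimics the strsplit and parse calls shown in that message; it is not car's full implementation.

```r
# One recode term from the question
term <- " 0:20 = '<= 20' "

# Splitting on every "=" yields three pieces instead of two
pieces <- strsplit(term, "=")[[1]]
print(pieces)

# The second piece is an unterminated string literal, so parsing it fails
result <- tryCatch(
  parse(text = pieces[2]),
  error = function(e) "parse error"
)
print(result)
```

The second piece is what recode tries to parse as the new value, which is where the INCOMPLETE_STRING error comes from.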

For your particular application cut works well:

n <- c(0, 10, 20, 21, 60, 70)
cut(n, breaks = c(-1, 20, Inf), labels = c("<= 20", "> 20"))

plyr::revalue is useful for one-to-one mappings (also see plyr::mapvalues):

library(plyr)
x <- factor(c("a","b","c"))
revalue(x, c("a" = ">= 20"))

I don't know of a good off-the-shelf many-to-one solution:

x <- factor(letters[1:8])
oldvals <- list(c("a","b","c"),c("d","e"),c("f","g","h"))
newvals <- c("new1","new2","new3")
for (i in seq_along(oldvals)) {
    m <- which(levels(x) %in% oldvals[[i]])
    if (length(m)>0) 
       levels(x)[m] <- rep(newvals[i],length(m))
}

This might get a bit ugly if the new/old codes overlap in some pathological way ...
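
One base-R option that may cover the many-to-one case is assigning a named list to levels(x): each name is a new level and each element lists the old levels it absorbs. This is a sketch using the same example data as above; every old level is listed on purpose, and the names can be arbitrary strings, including ones that contain "=".

```r
x <- factor(letters[1:8])

# Each name is a new level; each element holds the old levels merged into it
levels(x) <- list(
  new1 = c("a", "b", "c"),
  new2 = c("d", "e"),
  new3 = c("f", "g", "h")
)

print(levels(x))
print(table(x))
```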

Ben Bolker
2

Given I'm using recode in a context where I'm taking the recodes argument as user input

I'm not sure what that means, but this is fairly end user-friendly:

map_em = function(
  n, 
  recs = readline(prompt = "enter map like key = value, key2 = value2: \n")
){
    m = eval(parse(text = sprintf("list(%s)", recs)))
    s = stack(m)
    s$ind[ match(n, s$values) ]  # stack() names its columns "values" and "ind"
}

# usage example
map_em(n)
# enter map like key = value, key2 = value2: 
'<= 20' = 0:20, '> 20' = 21:70
# [1] <= 20 <= 20 <= 20 > 20  > 20  > 20 
# Levels: <= 20 > 20

Because it uses match, your user can enter overlapping values (like the OP did, writing 0:20 and 20:70) and it will simply take the first match.
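
A small non-interactive sketch of that first-match behaviour, with the mapping written out as a list instead of read from the prompt (the overlapping ranges are the ones from the question):

```r
n <- c(0, 10, 20, 21, 60, 70)

# 20 appears in both ranges; match() returns the first position it finds
m <- list("<= 20" = 0:20, "> 20" = 20:70)
s <- stack(m)
res <- s$ind[match(n, s$values)]

print(as.character(res))
```

Here 20 is assigned to '<= 20' because that range comes first in the list.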


Similarly, the user might pass the mapping directly in the function call:

map_em2 = function(n, ...){
    m = list(...)
    s = stack(m)
    s$ind[ match(n, s$values) ]  # stack() names its columns "values" and "ind"
}

# usage example    
map_em2(n, '<= 20' = 0:20, '> 20' = 21:70)
# [1] <= 20 <= 20 <= 20 > 20  > 20  > 20 
# Levels: <= 20 > 20
Frank
1

I had the same issue and didn't find a direct solution. Here is my clumsy workaround: recode with a label that has no equals sign, then fix the label afterwards with gsub:

r <- recode(n, " 0:20 = '< 20' ; 20:70 = '> 20' ")
r <- gsub("< 20", "<= 20", r)
Ben Bolker
xjf