6

What is the most efficient algorithm to evaluate a RPN notation in R?

Here is the question: suppose we have

c("4", "13", "5", "/", "+") -> (4 + (13 / 5)) -> 6

How to write a general purpose function that evaluates any input RPN ?

Does R have stack data structure ?

Thanks for your hints

Tripp Kinetics
  • 5,178
  • 2
  • 23
  • 37
Sam
  • 4,357
  • 6
  • 36
  • 60
  • 1
    Coincidentally, [I was playing around with this idea](https://gist.github.com/leeper/7773ed8a8b4bd9b9fed6) a few days ago. Maybe something in that gist will be useful to you. – Thomas Mar 03 '15 at 19:29
  • You could almost certainly cheat and hack this together by constructing `call` objects. Hadley Wickham's book has a chapter on that process – shadowtalker Mar 04 '15 at 00:23
  • @ssdecontrol, why is that cheating? – BrodieG Mar 04 '15 at 01:12

1 Answers1

11

To my knowledge there isn't a dedicated stack structure that you can push/pop etc., but you can easily use lists to achieve the same effect. Here we use the same list to hold the input RPN string and to act as the stack:

rpn <- function(v) {
  l <- lapply(v, function(x) if(grepl("^\\d+$", x)) as.numeric(x) else as.name(x))
  i <- 1
  while(length(l) >= i) {
    if(!is.numeric(l[[i]])) {
      l[[i - 2]] <- as.call(l[c(i, i - 2, i - 1)])
      l[i:(i - 1)] <- NULL
      i <- i - 1
    } else i <- i + 1
  }
  l[[1]]      
}

Let's test:

v <- c("4", "13", "5", "/", "+")
rpn(v)              # un-evaluated reparsed expression
# 4 + 13/5
eval(rpn(v))        # which you can evaluate
# [1] 6.6

Something more challenging:

v <- c("4", "13", "+", "5", "/", "8", "-", "10", "3", "+", "*")
rpn(v)
# ((4 + 13)/5 - 8) * (10 + 3)
eval(rpn(v))
# [1] -59.8

A breakdown of the logic:

  • make a list with numbers and symbols representing the operators
  • walk down list left to right
    • if hit a number, just move pointer to next value (the stuff to the left of the pointer is our stack, so we are using part of our input list as the stack!)
    • if hit a function, combine the two items to the left of the pointer (i.e. rightmost items in stack) into one call on the stack, and reset pointer

That's it. We take advantage that R stores calls as nested lists and that +, -, etc. are just functions in R so this works very naturally.

Assumptions:

  • functions are assumed binary (i.e. no unary - or +, among other things)
  • the user is only inputing valid functions
  • the numbers are integers (this will extend to numerics if you tweak the regular expression)
  • the RPN string needs to make sense (i.e. c("3", "+", "3")) will fail.
BrodieG
  • 51,669
  • 9
  • 93
  • 146
  • Would `match.fun` make sense here? I've never used it explicitly, but it seems to be the machinery behind `*apply` – shadowtalker Mar 04 '15 at 12:16
  • @ssdecontrol, ironically no, you specifically don't want to use `match.fun`. Compare `as.call(list(match.fun("+"), 1, 2))` to `as.call(list(as.name("+"), 1, 2))`. In the latter case R recognizes the `+` as an operator, whereas in the former it doesn't. Both cases work, but using `as.name` also produces the arithmetic expression in a recognizable form. – BrodieG Mar 04 '15 at 13:41
  • @BrodieG The `rstack` pkg has a stack structure, as does `rstackdeque`. – Richie Cotton Sep 15 '16 at 17:49