Using "maybe ARefs" in higher-order functions

Question

I want to know how to wrap a function (or function definition) such that it becomes seemingly agnostic to whether the parameters passed to it are mutable or immutable -- but if any parameter it is given is mutable, it should dereference that parameter every time it is called, to get the current value.

I could write a function that requires each parameter to be mutable storage that it then dereferences each time it is called. But there's a performance hit (very, very small, I know!) to dereferencing mutable storage in Clojure. In my specific use case these actually are bottleneck operations, small enough for dereferencing to make a difference, and repeated hundreds of thousands to millions of times (more on my use case below, but for now let's just assume this is significant). So I don't want to use mutable data in the cases where I don't need it to be mutable. It would be nice if, from the outside, the code appeared not to care whether the initial parameters were mutable or immutable. Let's say, for simplicity's sake, that the function is the following:

(defn foo [a b]
    (fn [x] (* a b x)))
(def instance (foo 3 4))
(instance 5) ; <- 60
(instance 8) ; <- 96

I would like a foo that is smart enough to do this:

(def a (agent 3))
(def b (agent 4))
(foo a b) ; <- (fn [x] (* (deref a) (deref b) x))
(foo a 4) ; <- (fn [x] (* (deref a) 4 x))
(foo 3 4) ; <- (fn [x] (* 3 4 x))

However, my first attempt to do something used quoting and unquoting (natural, right? It's what macros use!), and it gave me a nasty error about embedding objects in code (a very similar issue, different use-case, is discussed here). My next attempt gave me a weird (and massive) slowdown in runtimes.

Does anyone know of a good way to do this?

Background

I am working some machine learning algorithms. In a typical scenario, the user would initialize an algorithm with a certain set of parameters, and then run it on a set of data. But sometimes a user/user-defined code might want to modify the parameters as the algorithm is running, either based on time (e.g., simulated annealing), or based on some other criteria determined while watching the algorithm's ongoing performance. My algorithms are parallelized, and each thread would need to see the change. Restarting the algorithm when I'm changing the parameters would defeat the purpose.

score 2 · Answer 1 · edited May 23 '17 at 12:06

With Eval

To get a foo smart enough to do what you want, you can use run-time expression modification:

(defn maybe-deref-expr 
  [vals params body] 
  (let [smap (zipmap params 
                     (map (fn [val sym] 
                            (if (instance? clojure.lang.IDeref val) 
                              (list 'deref sym) 
                              sym)) 
                          vals 
                          params))
        body* (clojure.walk/postwalk-replace smap body)
        gen (eval (list 'fn params body*))] 
    (apply gen vals)))

(defmacro defn-ref-agnostic
  [name params body]
  `(defn ~name
    ~params
    (maybe-deref-expr ~params '~params '~body)))

(defn-ref-agnostic foo
  [a b]
    (fn [x] (* a b x)))

(defn foo-baseline
  [a b]
    (fn [x] (* a b x)))

(def f (foo 3 4))
(def g (foo 3 4))

As far as I can tell on my machine, f and g have identical performance characteristics.

Without Eval

This appears to be working fairly efficiently:

(defn aref? [x] (instance? clojure.lang.ARef x))
(defn foo-wraps [& args]
    (map (fn [i] (if (aref? i)
                     #(deref i)
                     #(identity i)))
         args))
(defn foo [a b]
    (let [[a b] (foo-wraps a b)]
        (fn [x] (* (a) (b) x))))

I guess this might be an example of HotSpot coming to the rescue? If I don't pass any ARefs, then the performance is very close to the original formulation after only a handful of runs:

(def a (ref 3))
(def b (ref 4))
(def f (foo 3 4))
(def g (foo a b))
(defn h [x] (* 3 4 x))

user=> (time (dotimes [n 10000] (f n)))
"Elapsed time: 7.38648 msecs"
"Elapsed time: 3.45071 msecs"
"Elapsed time: 3.087424 msecs"
"Elapsed time: 2.836596 msecs"

user=> (time (dotimes [n 10000] (g n)))
"Elapsed time: 13.076024 msecs"  
"Elapsed time: 4.235882 msecs"
"Elapsed time: 4.517663 msecs"
"Elapsed time: 3.940946 msecs"

user=> (time (dotimes [n 10000] (h n)))
"Elapsed time: 4.056389 msecs"
"Elapsed time: 2.499129 msecs"
"Elapsed time: 3.064487 msecs"
"Elapsed time: 2.631167 msecs"

At your request, I posted the code from my answer to your related question. I actually prefer your original answer here (as commented on in my answer here). I don't entirely understand your use case though; was there a reason that was not sufficient? — A. Webb, Apr 13 '14 at 17:39
There were two things I didn't like about it: first, that I had to write `(* (a) (b) x)` instead of `(* a b x)`. That was fairly minor. Second, it was mostly just the principle of the thing -- one of the reasons I like functional languages is that I can write code that writes code for me. My productivity goes up and my bugs go down. But it seemed like I had hit a limitation of the functional power of Clojure, and I was frustrated. :-) I spent a long day on it -- I can't really justify that. But I learned a ton along the way! — galdre, Apr 13 '14 at 17:58
Well, I guess there is one more thing. As I thought about the problem and realized what I wanted -- a modified function according to certain rules -- I found a solution that would give me something very similar to that. But not exactly that. I don't like settling for "work-arounds" and "approximations." I like your code because it gives me the function I would write by hand if I were writing it by hand. — galdre, Apr 13 '14 at 18:09
Cool. Back one comment, for the sake of semantics and any future readers (I'm repeating what I'm almost sure you already know), your original solution was the "functional" one - composition of functions, closures, first-class and higher-order functions. The one I provided uses the "homoiconicity" and meta-programming capability of lisps, which is not a trait of all functional languages. — A. Webb, Apr 13 '14 at 18:14
I wrote Lisp first, then changed it to "functional" without much thought. But you're right. I just had a conversation with a Haskell programmer in which he argued that homoiconicity was an unnecessary property for a language. — galdre, Apr 13 '14 at 18:27

score 2 · Accepted Answer · edited May 23 '17 at 10:26

It seems I answered this question in the related question with the last maybe-deref-expr example there. That code is repeated in Timothy Dean's own answer here, along with some nice macro sugar he wrote for it, so definitely check out his answer too. Here's a slightly modified version of maybe-deref-expr, perhaps a bit easier to read.

(defn maybe-deref-expr 
  [values params body] 
  (let [valmap (zipmap params values)
        deref? #(instance? clojure.lang.IDeref %)
        body* (clojure.walk/postwalk 
                #(if (deref? (valmap %)) `(deref ~%) %) 
                body)
        gen (eval `(fn ~params ~body*))] 
    (apply gen values)))

With Timothy Dean's macro sugar

(defmacro defn-ref-agnostic 
  [name params body]
  `(defn ~name 
     ~params
     (maybe-deref-expr ~params '~params '~body)))

if we do

(defn-ref-agnostic add
  [a b]
  (+ a b))

Then we get a slow (eval hit) add that is automatically dereferences when needed

(add 40 2) ;=> 42
(add (ref 40) (atom 2)) ;=> 42

But, the use case is not to define functions themselves, but function generators that close over other parameters.

(defn-ref-agnostic add-to
  [a b]
  (fn [x] (+ a b x)))

Now if we do

(def baz1 (add-to 40 2))

(def my-ref (ref 40))
(def my-atom (atom 2))

(def baz2 (add-to my-ref my-atom))

Then we take the eval hit when baz1 and baz2 are defined, and not when they are subsequently used. The code produced for the definition of baz1 and baz2, and thus the performance of those when used, is exactly as if we had done

(def baz1 (fn [x] (+ 40 2 x)))
(def baz2 (fn [x] (+ @my-ref @my-atom x)))

That having been said...

The original "Without Eval" solution, if it fits your use case, is what I would prefer:

(defn foo [a b]
  (let [[fa fb] (map #(if (instance? clojure.lang.IDeref %) 
                        deref 
                        identity) 
                     [a b])]
      (fn [x] (+ (fa a) (fb b) x))))

This introduces an extra level of indirection only at the low, low cost of at most two extra identity function calls. It is a lot simpler than the above and can be very flexible. The main difference between this and the answer to the other related question is that the test/branching has been moved outside the returned function, which now closes over the results.

Using "maybe ARefs" in higher-order functions

2 Answers2

Linked