10

I am interested in the scenario where we have some function f which is recursive and which we are not provided the source code to.

I would like a function memoizer: Function -> Function which takes in say f and returns a function g such that g = f (in the sense they return the same value given the same arguments) which when called first checks if the called arguments are in its 'cache' (memory of results it has calculated before) and if so returns the result from this, otherwise it should compute f, should f call itself with some arguments, this is tantamount to calling g with those arguments and I would like that f first check if the cache of g contains those arguments and if so return the result from this, otherwise ...

This is easy (in Javascript) to do given the source code of f, I simply define memoize in the obvious way and do something like

let f = memoize((...args) => {/* source code of f */});

But this doesn't appeal to me at all (mainly because I might want a memoized and non memoized version of the same function and then I'd have to write the same function twice) and won't work if I don't know how to implement f.

In case it's not clear what I'm asking,

I would like a function memoize which takes a function such as

fact = n => n === 0 ? 1 : n * fact(n - 1);

And returns some new function g such that fact(n) = g(n) for all n and which for example when g(10) is computed stores the values of fact(0), ..., fact(10) which are computed while computing g(10) and then if I ask for say g(7) it finds the result in the cache and returns it to me.

I've thought that conceptually it's possible to detect when f is called since I have it's address and maybe I could replace all calls to f with a new function where I compute f and store the result and then pass the value on to where it would normally go. But I don't know how to do this (and it sounds unpleasant).

Countingstuff
  • 733
  • 5
  • 14
  • In short you need to look at the Y Combinator concept. I explained it step by step in JS under question [How does Y-combinator compute the fixed point programmatically?](https://stackoverflow.com/a/41412648/4543207) – Redu Oct 19 '18 at 17:05

3 Answers3

5

I might want a memoized and non memoized version of the same function and then I'd have to write the same function twice

Yes, you need to. The recursive call to fact(n - 1) inside the function can only refer to one fact function - either a memoized or an unmemoized one.

So what you need to do to avoid code duplication is define fact with the Y combinator:

const makeFact = rec => n => n === 0 ? 1 : n * rec(n - 1);
//               ^^^                           ^^^

const factA = Y(makeFact);
const factB = memoizingY(makeFact);

function Y(make) {
    const f = make((...args) => f(...args)); // const f = make(f) is easier to understand
    return f;                                // but doesn't work with eager evaluation
}

I'll leave the definition of memoizingY as an exercise to the reader :-)


Possibly simpler approach:

const makeFact = annotate => {
    const f = annotate(n => n === 0 ? 1 : n * f(n - 1));
    return f;
}

const factA = makeFact(identity);
const factB = makeFact(memoize);
Bergi
  • 630,263
  • 148
  • 957
  • 1,375
  • Very clever. This is cool thank you, this Y combinator thing is extremely cute. This resolves the issue of duplication for sure (which was secretly my main concern). This is a good lesson in writing suitably high level functions I feel. – Countingstuff Oct 18 '18 at 20:22
  • @rici You *can* use a memoized `f` - that's exactly what `memoizingY` should do. We need the Y combinator functionality here so that `makeFact` can create multiple `fact` functions that use different functions for their `rec`ursive calls. – Bergi Oct 18 '18 at 21:08
  • Can this be used to memoize a given function that was not programmed as you suggest in `makeFact`, such as the example the OP provided or `fib` in my answer? – גלעד ברקן Oct 19 '18 at 22:24
  • @גלעדברקן No, to optimise the recursive calls in an arbitrary function you must [overwrite the original variable](https://stackoverflow.com/a/24489141/1048572) with the memoised function. If that's not possible, the recursive calls cannot be affected. – Bergi Oct 20 '18 at 09:49
5

maybe I could replace all calls to f with a new function where I compute f and store the result and then pass the value on to where it would normally go.

This is actually very easy to do, as Bergi referred to in a comment.

// https://stackoverflow.com/questions/24488862/implementing-automatic-memoization-returns-a-closured-function-in-javascript/ 
function memoize(func) {
  var memo = {};
  var slice = Array.prototype.slice;

  return function() {
    var args = slice.call(arguments);

    if (args in memo)
      return memo[args];
    else
      return (memo[args] = func.apply(this, args));

  }
}

function fib(n) {
  if (n <= 1) return 1;
  return fib(n - 1) + fib(n - 2);
}

fib = memoize(fib);

console.log(fib(100));
גלעד ברקן
  • 23,602
  • 3
  • 25
  • 61
1

In my limited experience, we do have access to JavaScript source code. We could thus attempt to generate new source code for the memoized function.

// Redefine Function.prototype.bind
// to provide access to bound objects. 
// https://stackoverflow.com/questions/7616461/generate-a-hash-from-string-in-javascript
var _bind = Function.prototype.apply.bind(Function.prototype.bind);
Object.defineProperty(Function.prototype, 'bind', {
    value: function(obj) {
        var boundFunction = _bind(this, arguments);
        boundFunction.boundObject = obj;
        return boundFunction;
    }
});

// Assumes the parameters for the function,
// f, can be consistently mapped.
function memo(f){
  if (!(f instanceof Function))
    throw TypeError('Argument is not an instance of Function.');
    
  // Generate random variable names
  // to avoid conflicts with unknown
  // source code
  function randomKey(numBytes=8){        
      let ranges = [[48, 10], [65, 26], [97, 26]]; 
      let key = '_';

      for (let i=0; i<numBytes; i++){     
          let idx = Math.floor(Math.random() * ranges.length);
          key += String.fromCharCode(ranges[idx][0] + Math.random() * ranges[idx][1]);
      }   

      return key;
  }

  let fName = f.name;
  let boundObject;
  let fCode;
  
  const nativeCodeStr = '(){[nativecode]}';
  
  // Possible Proxy
  try {
    fCode = f.toString();

  } catch(error){
    if (error.constructor == TypeError){
      if (Function(`return ${ fName }.toString()`)() != nativeCodeStr){
        throw TypeError(`Possible Proxy detected: function has a name but no accessible source code. Consider memoizing the target function, ${ fName }.`);
      
      } else {
        throw TypeError(`Function has a name but no accessible source code. Applying toString() to its name, ${ fName }, returns '[native code]'.`);
      }
    
    } else {
      throw Error('Unexpected error calling toString on the argument.');
    }
  }

  if (!fName){
    throw Error('Function name is falsy.');

  // Bound functions
  // Assumes we've monkey-patched
  // Function.prototype.bind previously
  } else if (fCode.replace(/^[^(]+|\s+/g, '') == nativeCodeStr){
    if (/^bound /.test(fName)){
      fName = fName.substr(6);
      boundObject = f.boundObject;
      // Bound functions return '[native code]' for
      // their toString method call so get the code
      // from the original function.
      fCode = Function(`return ${ fName }.toString()`)();
    
    } else {
      throw Error("Cannot access source code, '[native code]' provided.");
    }
  }

  const fNameRegex = new RegExp('(\\W)' + fName + '(\\W)', 'g');
  const cacheName = randomKey();
  const recursionName = randomKey();
  const keyName = randomKey();

  fCode = fCode.replace(/[^\(]+/,'')
    .replace(fNameRegex, '$1' + recursionName + '$2')
    .replace(/return/g, `return ${ cacheName }[${ keyName }] =`)
    .replace(/{/, `{\n  const ${ keyName } = Array.from(arguments);\n\n  if (${ cacheName }[${ keyName }])\n    return ${ cacheName }[${ keyName }];\n`);
  
  const code = `function(){\nconst ${ cacheName } = {};\n\nfunction ${ recursionName + fCode }\n\nreturn ${ recursionName }.apply(${ recursionName }, arguments);}`;

  let g = Function('"use strict";return ' + code)();

  if (boundObject){
    let h = (g).bind(boundObject);
    h.toString = () => code;
    return h;

  } else {
    return g;
  }
} // End memo function


function fib(n) {
  if (n <= 1) return 1;
  return fib(n - 1) + fib(n - 2);
}

const h = fib.bind({a: 37});
const g = memo(h);
   
console.log(`g(100): ${ g(100) }`);
console.log(`g.boundObject:`, g.boundObject);
console.log(`g.toString():`, g.toString());

try{ 
  memo(function(){});

} catch(e){
  console.log('Caught error memoizing anonymous function.', e)
}

const p = new Proxy(fib, {
  apply: function(target, that, args){
    console.log('Proxied fib called.');
    return target.apply(target, args);
  }
});

console.log('Calling proxied fib.');
console.log(`p(2):`, p(2));

let memoP;

try {
  memoP = memo(p);
  
} catch (e){
  console.log('Caught error memoizing proxied function.', e)
}
גלעד ברקן
  • 23,602
  • 3
  • 25
  • 61
  • Nice, I definitely like this, it's resolvable but at present there will be some issue if say the body of f has a variable called g, cache or key but other than that it looks like it works to me provided String(f) provides you the code you want. For a reasonable (? in my head, I'm not a JS programmer) instance where String(f) does not provide what you want, if I bind a function to some object (e.g. h = fib.bind({}) then all I get back is "function () { [native code] }" – Countingstuff Oct 19 '18 at 08:49
  • @Countingstuff good point about bind. Let me see about that. Regarding variable name conflicts, since we're generating code as a string, it would be easy to just assign a random enough string for each of those three variables. – גלעד ברקן Oct 19 '18 at 09:44
  • @Countingstuff for bound functions, we can extract the original function name (and thus get that source code if it was not native) from the result of `[bound_function].name`. – גלעד ברקן Oct 19 '18 at 10:37
  • @Countingstuff and if we wanted to get the bound object to fully recreate the bound, now cached, function, there's at least one method described here that could pave the way: https://stackoverflow.com/a/14309359/2034787 – גלעד ברקן Oct 19 '18 at 10:43
  • @Countingstuff now let me see about functions created by Proxy, although I think we've covered the most common use cases. This is pretty cool. I might implement it for a utilisation library. – גלעד ברקן Oct 19 '18 at 10:45
  • Very good, I am satisfied with this answer and will mark it as so. It would still be nice to see something that works with Proxy (and indeed it would be nice to see a solution not in the vein of ‘rewriting the old function’ if you get my meaning). Thank you. – Countingstuff Oct 19 '18 at 10:50
  • @Countingstuff if you want to see more solutions, I would leave the question open at least for a while rather than mark it as answered. More people tend to look at those (I know I do). – גלעד ברקן Oct 19 '18 at 10:54
  • @Countingstuff with regards to functions used by Proxy, it seems to me that while there is a way to at least suspect it (the function name is the same as the Proxy's target yet the code string is unavailable), there is no reliable way to get at or monkey-patch its handler. The solution in that case is reassign the original function to its memoized version. The proxy will automatically point to the modified version without any modification. Anyway, Proxy's do not access function internals, only intercept the first (not any recursive) call. – גלעד ברקן Oct 19 '18 at 22:31
  • @Countingstuff I think Proxies' mystery may be a security measure since they can be effectively used as revocable handlers, without exposing their internals to outside parties. I will try to work on a slightly more robust version of the code suggestion a little later. – גלעד ברקן Oct 19 '18 at 22:33
  • @Countingstuff kk, added a basic attempt at slightly more robust handling, including the variable conflict issue and bound functions. Thanks for an interesting challenge! – גלעד ברקן Oct 20 '18 at 04:25