3

I'm implementing a data structure in memory that shadows part of a large data structure stored somewhere out on the web. Let's say the data structures in question are binary trees. I want the in-memory tree to initially consist of just the root node, and it should grow lazily by fetching nodes from the web on demand, as the user (or algorithm) explores.

One natural way to do this is for the tree node data type to provide methods getLeftChild(), getRightChild(), each of which immediately returns a promise for the respective child node. When getLeftChild() is called on a tree node whose left child is already in memory, it returns a promise that is already-resolved with the cached child; otherwise it initiates a fetch of the child (if not already initiated by an earlier call) and returns a promise for it, and when the fetched child eventually comes back from the web, the fetched child will be saved in memory for the future and used to resolve the promise.

So, to print the node 5 levels down the left branch, I'd say:

root.getLeftChild()
    .then(child0 => child0.getLeftChild())
    .then(child00 => child00.getLeftChild())
    .then(child000 => child000.getLeftChild())
    .then(child0000 => child0000.getLeftChild())
    .then(child00000 => {
  console.log("child00000 = ", child00000);
});

or (thanks @Thomas):

const lc = node => node.getLeftChild();
Promise.resolve(root)
    .then(lc).then(lc).then(lc).then(lc).then(lc)
    .then(child00000 => {
  console.log("child00000 = ", child00000);
});

Or, the same thing using async/await:

(async()=>{
  let child0 = await root.getLeftChild();
  let child00 = await child0.getLeftChild();
  let child000 = await child00.getLeftChild();
  let child0000 = await child000.getLeftChild();
  let child00000 = await child0000.getLeftChild();
  console.log("child00000 = ",child00000);
})();

This all works fine, and the calling code doesn't look too horrible in either case.

My only misgiving is that, when exploring within parts of the binary tree (or any similar linked data structure) that are already in memory, I don't want to suffer the overhead of initiating a new microtask every time I want to go from one node to a neighbor in the in-memory data structure. Think of an algorithm whose core computation does millions of such link-following operations.

Promises/A+ does indeed require a new microtask (at least) for each then callback execution:

2.2.4 onFulfilled or onRejected must not be called until the execution context stack contains only platform code. [3.1].

I believe async/await has a similar requirement.

What I'd like to know is: what's the easiest/cleanest way to make a Promise-like object that behaves exactly like a Promises/A+ promise, except for clause 2.2.4? I.e. I want it to have a then (or then-like) method that is "synchronous-when-available", so that the first code snippet above will execute in one shot without yielding the execution context.

To avoid naming issues/confusion, I'm happy to refrain from calling my synchronous-when-available accessor then (which is effectively a reserved word nowadays thanks to Promises/A+); instead, I'll call it thenOrNow. And I'll call my hypothetical type/implementation PromiseOrNow.

Would I have to write PromiseOrNow from scratch, or is there a neat and reliable way to leverage an existing Promises/A+ implementation such as native Promise?

Notice that, since I'm not planning to mess with anything named "then", PromiseOrNow could incidentally be Promises/A+ compliant, if that turns out to be a good way of doing it. Perhaps it would be a prototype interited from the native Promise.prototype. These properties would be nice in some ways, but they are not requirements.

Don Hatch
  • 5,041
  • 3
  • 31
  • 48
  • just one thing about your snippet, avoid callback hell: `root.getLeftChild().then(child0 => child0.getLeftChild()).then(child00 => child00.getLeftChild()).then(child000 => child000.getLeftChild()).then(child0000 => child0000.getLeftChild()).then(child00000 => { console.log("child00000 = ",child00000); });` or even better, DRY: `const lc = node => node.getLeftChild();` and then `Promise.resolve(root).then(lc).then(lc).then(lc).then(lc).then(lc).then(child00000 => { console.log("child00000 = ",child00000); });` – Thomas May 20 '17 at 11:51
  • @Thomas, thank you! Your rewrites are great. Obviously I'm still a novice with promises, or I would have noticed the callback hell. I'll edit these into the question and attribute to you. Any similar insights on the async/await version? – Don Hatch May 22 '17 at 10:18
  • Okay, I've edited in @Thomas's improvements. I don't see how to rewrite the await/async version without it turning into some kind of kind of nesting hell, so I'll leave that as is. – Don Hatch May 22 '17 at 20:56

3 Answers3

1

You could extend a standard promise with the thenOrNow method with the following wrapper function:

function addThenOrNow(p) {
    let value, resolved;
    p.then( response => (value = response, resolved = 1) )
     .catch( err => (value = err, resolved = -1) );
    p.thenOrNow = (fulfilled, rejected) => 
        resolved > 0 ? Promise.resolve(fulfilled ? fulfilled(value) : value)
        : resolved   ? Promise.reject (rejected  ? rejected (value) : value)
                     : p.then(fulfilled, rejected); // default then-behaviour
    return p;
}

// Demo 
const wait = ms => new Promise( resolve => setTimeout(resolve, ms) );
const addSlow = (a, b) => wait(100).then(_ => a + b);
const prom = addThenOrNow(addSlow(2, 3));

prom.then(value => console.log('promise for adding 2 and 3 resolved with', value));
setTimeout(_ => {
    // At this time the promise has been resolved.
    let sum;
    prom.then(response => sum = response);
    // above callback was executed asynchronously
    console.log('sum after calling .then is', sum); 
    prom.thenOrNow(response => sum = response);
    // above callback was executed synchronously
    console.log('sum after calling .thenOrNow is', sum); 
}, 200);

Instead of using a wrapper, you could create your own myPromise constructor, but the main logic is the same.

Concerning immediately resolved promises

The above implementation of thenOrNow will only be able to perform the callback synchronously if the promise resolves asynchronously (i.e. after you called addThenOrNow on the original promise), like would be your case (assuming your http requests are performed asynchronously). If however, the promise resolves immediately (synchronously), thenOrNow will not be able to synchronously get the value via the native Promise interface.

Other libraries, like bluebird provide methods for synchronous inspection, so if you include bluebird, you can provide a solution that also works for immediately resolving promises:

function addThenOrNow(p) {
    p.thenOrNow = (fulfilled, rejected) => 
        p.isFulfilled() ? Promise.resolve(fulfilled ? fulfilled(p.value()) : p.value())
        : p.isRejected()? Promise.reject (rejected  ? rejected (p.reason()) : p.reason())
                        : p.then(fulfilled, rejected); // default then-behaviour
    return p;
}

// Demo 
const prom = addThenOrNow(Promise.resolve(2+3));

let sum;
prom.then(response => sum = response);
console.log('sum after calling then is', sum);
prom.thenOrNow(response => sum = response);
console.log('sum after calling thenOrNow is', sum);
<script src="https://cdnjs.cloudflare.com/ajax/libs/bluebird/3.5.0/bluebird.min.js"></script>

But again, as your scenario is asynchronous in nature (fetching responses from HTTP requests) you could use either solution.

trincot
  • 317,000
  • 35
  • 244
  • 286
  • The problem is that when using `then` to set `resolved` (which should be called "`fulfilled`" btw), it still happens asynchronously only; so `addThenOrNow(Promise.resolve(…)).thenOrNow(…)` will not run the callback synchronously. – Bergi May 20 '17 at 12:24
  • Indeed, for promises that resolve immediately, this is not suitable. – trincot May 20 '17 at 12:29
  • @Bergi That's a good point, but I don't consider it a problem if the first thenOrNow() is delayed to the next microtask. I'm more concerned with the following 999,999 calls. Maybe I could make that clearer in the question. – Don Hatch May 22 '17 at 09:58
0

I don't want to suffer the overhead of initiating a new microtask every time

They're called "microtask" because they have microscopic overhead. The microtask queue is really fast, you should not worry. Better keep consistency instead of falling for Zalgo.

What's the easiest/cleanest way to make a Promise-like object that behaves exactly like a Promises/A+ promise, except for clause 2.2.4?

Use an existing implementation that features this. For example, Q v0.6 contained an asap method.

Would I have to write PromiseOrNow from scratch

No, you can start out with a Promise library that suits you.

Is there a neat and reliable way to leverage an existing Promises/A+ implementation such as native Promise?

No, or at least not from its public API when it doesn't feature synchronous inspection.

Community
  • 1
  • 1
Bergi
  • 630,263
  • 148
  • 957
  • 1,375
  • Regarding micro-ness of microtasks: I ran some simple timings of link following in a simple linked list, in chrome 59 on a macbook. it looks like I can do 90,000 `node = node.next`s per second, vs. 80 `node = await node.next`s per second. That's a slowdown of more than 1000x, which isn't surprising to me, given my rough idea of what's involved in a microtask creation and context switch. Is it surprising to you? – Don Hatch May 22 '17 at 20:03
  • Yes, that does surprise me. I would have expected a slowdown of maybe 2 to 10 times. However, I guess that the largest overhead is the promise creation itself - the implicit `Promise.resolve` call in the `await` - which isn't optimised away in current implementations. You would need to compare against a test case that creates promise objects but uses synchronous inspection, like your `thenOrNow` method would. – Bergi May 22 '17 at 23:21
0

Sorry about the delay, but I was busy. How about a different approach to tackle the actual Problem? Instead of trying to resolve the Promises in a sync manner, to scrape off a few ms, how about seperating the sync and async parts of the tasks.

To be precise: The async part here is to load the data for a particular node in the binary tree. Even if the Tree is built lazily, the traversion does not have to be async.

So we can decouple the traversion and the lazy generations of the tree from the async data loading.

//sync traversion:
var node = root.getOrCreate('left', 'right', 'right', 'left', 'right');

//wich is a shorthand for the more verbose:
var child0 = root.getOrCreateLeft();
var child01 = child0.getOrCreateRight();
var child011 = child01.getOrCreateRight();
var child0110 = child011.getOrCreateLeft();
var node = child0110.getOrCreateRight();

To this point everything is (although lazy) good old fashioned sync code. Now the async part, accessing the data of the node.

node.then(nodeData => console.log("data:", nodeData));
//or even
var nodeData = await node;
console.log(nodeData);
//or
var data = await root.getOrCreate('left', 'right', 'right', 'left', 'right');

To the implementaion:

class AsyncLazyBinaryTree {
    constructor(config, parent=null){
        if(typeof config === "function")
            config = {load: config};

        //tree strucute
        this.parent = parent;
        this.left = null;
        this.right = null;

        //data-model & payload
        this.config = config;
        this._promise = null;

        //start loading the data
        if(config.lazy || config.lazy === undefined) 
            this.then();
    }

    get root(){
        for(var node = this, parent; parent=node.parent; node = parent);
        return node;
    }       

///// These methods are responsible for the LAZY nature of this tree /////

    getOrCreateLeft(){ return _goc(this, "left") }

    getOrCreateRight(){ return _goc(this, "right") }

    getOrCreate(...args){
        if(args.length === 1 && Array.isArray(args[0]))
            args = args[0];

        var invalid = args.find(arg => arg !== "left" && arg !== "right");
        if(invalid)
            throw new Error("invalid argument "+ invalid);

        for(var node = this, i=0; i<args.length; ++i)
            node = _goc(node, args[i]);

        return node;
    }

///// These methods are responsible for the ASYNC nature of this tree /////

    //If this node looks like a promise, quacks like a promise, walks like a promise, ... 
    //you can use it as a Promise of the data they represent
    then(a,b){ 
        if(!this._promise){
            this._promise = Promise.resolve().then(() => this.config.load(this));
        }

        return this._promise.then(a,b);
    }

    catch(fn){ return this.then(null, fn); }    

    //to force the node to reload the data
    //can be used as `node.invalidate().then(...)`
    //or `await node.invalidate()`
    invalidate(){
        this._promise = null;
        return this;
    }

}

//private
function _goc(node, leftOrRight){
    if(!node[leftOrRight])
        node[leftOrRight] = new AsyncLazyBinaryTree(node.config, node);
    return node[leftOrRight];
}

And a basic example

//A utility to delay promise chains.
/* use it as:   somePromise.then(wait(500)).then(...)
    or          wait(500).then(...);
    or          wait(500).resolve(value).then(...)
    or          wait(500).reject(error).then(...);
*/
var wait = ((proto) => delay => Object.assign(value => new Promise(resolve => setTimeout(resolve, delay, value)), proto))({
    then(a,b){ return this().then(a,b) },
    resolve(value){ return this(value) },
    reject(error){ return this(error).then(Promise.reject.bind(Promise)) }
});




//initializing a tree
var root = new AsyncLazyBinaryTree({
    //load the data as soon as the Node is generated
    lazy: true, 

    //this method will be called (once) for each node that needs its data.
    load(node){
        var path = this.getPath(node);

        console.log('load', path, node);

        //create an artificial delay, then return the payload
        return wait(1500).resolve({
            ts: new Date(),
            path: path
        });

        //but maybe you need some data from the parentNode, to actually load/generate the current data:

        //node.parent is `null` for the root node, 
        //that's why I wrap that into a Promise.resolve()
        //so for the rootNode, parentData is now null;
        return Promise.resolve(node.parent)
            .then(parentData => {
                //do something with the parentData
                return wait(500).resolve({
                    ts: new Date(),
                    path: path,
                    parent: parentData,
                });
            });
    },


    //an utility to be used by load(). 
    //the tree doesn't care if you add methods or data to the config
    //it's all passed through the whole tree.
    getPath(node){
        var path = "";
        for(var n = node, p; p = n.parent; n=p){
            var leftOrRight = n === p.left? "left": n === p.right? "right": "";
            if(!leftOrRight) throw new Error("someone messed up the tree");
            path = "." + leftOrRight + path;
        }
        return "root" + path;
    },
});

var node = root.getOrCreate("left", "right", "left");

//just to be perfectly clear
//the config is simply passed through the tree, and you can (ab)use it to store additional data about the tree.
console.log("same config", root.config === node.config);

node.then(nodeData => console.log("data:", nodeData));

I'm not good at making up examples. Play a bit with the class and modify/extend it as you wish

Thomas
  • 11,958
  • 1
  • 14
  • 23