10

I'll do my best to explain what I am trying to do.

I have two models: mine and an API response I am receiving. When the items API response comes in, I need to map it to my model and insert all the items. This is simple, of course. Here's the issue: I need to do so without really knowing what I am dealing with. My code will be passed two strings, one with my model's mapping path and one with the API response's mapping path.

Here are the two paths

var myPath = "outputModel.items[].uniqueName"
var apiPath = "items[].name"

Basically: FOR each item in apiPath, push an object into items in myPath and set its uniqueName.

What it comes down to is that my code has NO idea in advance which two items need to be mapped, or even whether the paths contain an array or are simple field-to-field paths. They could even contain multiple arrays, like this:

******************** EXAMPLE *************************

var items = [
    {
        name: "Hammer",
        skus:[
            {num:"12345qwert"}
        ]
    },
    {
        name: "Bike",
        skus:[
            {num:"asdfghhj"},
            {num:"zxcvbn"}
        ]
    },
    {
        name: "Fork",
        skus:[
            {num:"0987dfgh"}
        ]
    }
]

var outputModel = {
    storeName: "",
    items: [
        {
            name: "",
            sku:""
        }
    ]
};


outputModel.items[].name = items[].name;
outputModel.items[].sku = items[].skus[].num;

************************ Here is the expected result of the above

var result = {
    storeName: "",
    items: [
        {
            name: "Hammer",
            sku:"12345qwert"
        },
        {
            name: "Bike",
            sku:"asdfghhj"
        },
        {
            name: "Bike",
            sku:"zxcvbn"
        },
        {
            name: "Fork",
            sku:"0987dfgh"
        }
    ]
};

I will be given a set of paths for EACH value to be mapped. In the case above, I was handed two sets of paths because I am mapping two values. It would have to traverse both sets of arrays to create the single array in my model.

Question - How can I dynamically detect arrays and move the data around properly no matter what the two model paths look like? Possible?

Rob
  • In your last example, what are you expecting your output model to look like at the end? – S McCrohan May 07 '15 at 23:44
  • Why not have just one structure for outputModel? Even if the only property available is the name, it could be set as [{name: 'name'}]. It will probably be easier to work with and maintain your application with just one type of structure. – Diego ZoracKy May 07 '15 at 23:44
  • Are you saying that (in your first example) the `outputModel.items` should become `[{uniqueName: "Hammer"}, {uniqueName: "Bike"}, {uniqueName: "Fork"}]` for your given `items` input? – Bergi May 07 '15 at 23:44
  • @SMcCrohan I added the expected output. Note that I am OK with running a piece of code twice, since there are 2 sets of paths as in example 2. – Rob May 07 '15 at 23:47
  • @DiegoZoracKy Unfortunately that's not possible. Along with the two string paths, I am given the two models I need to work with as well, which are managed by a different part of the application. – Rob May 07 '15 at 23:49
  • @Bergi Yes, that's correct. See the expected output I added for example 2. – Rob May 07 '15 at 23:50
  • @Rob: The problem is that your notation is ambiguous once it contains unequal numbers of arrays on the left and on the right. It seems you're just taking the cartesian product? Please specify exactly how such cases should be handled. – Bergi May 08 '15 at 00:00
  • @rob, in your example 2, could you possibly post the output after each step? It is not clear how the sku's got where they landed in the second step - because no match criteria is specified (you are matching on "name" I assume, or matching on all remaining properties? E.g., if there was a property "status" that was peer to "name" then how would match go?) – Dinesh May 08 '15 at 00:12
  • @Dinesh I don't have a specific expectation of each step's output. I've been staring at this for 2 days and am pretty open to suggestions. Two-part comment: my initial thought is that it would iterate the steps, where step one passes in the first set of paths plus the original empty model and the items array, basically setting items.name, which would give 3 objects in the array because there are 3 objects in the items array. – Rob May 08 '15 at 00:30
  • It would then call the same function and pass in the modified model (which now contains some items) along with the second set of paths. It would then evaluate each skus array and either set the value of sku or clone the object if there is no object to support additional skus, e.g. items[1].skus[1].num. – Rob May 08 '15 at 00:31
  • Not quite grokking. What are the inputs and their influence on the expected output? Specifically, I'm not sure what `outputModel.items[].sku = items[].skus[].num;` and `outputModel.items[].name = items[].name;` are doing, exactly. Were those supposed to be strings, as in the earlier example? – ruffin May 10 '15 at 02:26
  • @Rob in all cases will the result be an array of objects ? – ProllyGeek May 10 '15 at 02:29
  • @ruffin Those are the 2 path strings. See example 2. It has those paths, the starting data as well as expected output. – Rob May 10 '15 at 02:33
  • @ProllyGeek No, it can be a simple single value mapping as well. i.e. 'output.name' = 'source.fullname' – Rob May 10 '15 at 02:36
  • Your expected output from the 2nd example would require some sort of relationship to be created between the first set of paths and the second set. Otherwise, if one were to consider the 2 sets of paths in isolation, the output may become unrelated. That is, there are 3 names (Hammer, Bike, Fork) but 4 skus[].num values (12345qwert, asdfghhj, zxcvbn, 0987dfgh) in the apiPath object. – sujit May 12 '15 at 18:40
  • Unrelated logic will probably distribute the values and create an object such as: `items: [ { name: "Hammer", sku:"12345qwert" }, { name: "Bike", sku:"asdfghhj" }, { name: "Fork", sku:"zxcvbn" /*Sequentially 3rd*/ }, { sku:"0987dfgh" /*No name here*/ } ]`. Is this acceptable output to you? – sujit May 12 '15 at 18:41
  • I think in the 2nd example it should be `outputModel.items[].sku = items[].skus[0].num;` because otherwise the result would be something else: it could be interpreted as an array with the contents of num is assigned to sku – maraca May 12 '15 at 21:52
  • @sujit What you're saying makes sense and is actually the problem I am facing this very minute. I have been able to look up all the values needed in the incoming JSON and I know where they are supposed to go; the problem is which record each piece of data goes in. – Rob May 12 '15 at 22:13
  • @Rob: Sorry, while this is an interesting question and I'd like to answer it, it still lacks detail on how the accessor path strings should be interpreted *generically*. You've only provided one example, but I cannot tailor a generic algorithm to it. What would e.g. `outputModel.items.sku = items[].skus[].num;` do? – Bergi May 13 '15 at 23:01
  • @Bergi Your example of `outputModel.items.sku = items[].skus[].num;` would make an `items` object for every `num` in `skus`. I actually have this example in my original post. This post is getting quite complicated with all the different approaches and examples, none of which has hit the mark yet. I have altered my original post to mark the **********EXAMPLE****** of what this output should render. – Rob May 13 '15 at 23:19
  • @Rob There is a small problem with `outputModel.items[].sku = items[].skus[].num;` because in the example you gave for the name Bike, the value of sku should be an array of nums, not only the first one as it is in your example. Basically, by running a bit of regexp ( bleah ) on the paths you could define a proper semantic; otherwise you will run into a lot of inconsistencies. It is interesting but not doable unless you have a proper semantic for the attribution and routing system. I will try tomorrow to come up with an actual answer. ( BTW hello @Bergi, long time no see :D ) – helly0d May 13 '15 at 23:25
  • @Rob: No, you have the `outputModel.items[].sku = items[].skus[].num` example in your question. Mine did not have any array brackets on the left side. Also, what should happen with things like `outputModel.items[].subitems[].sku = items[].subitems[].skus[].num` or `outputModel.items[].sku = items.skus.num`? (All of them on different input structures than yours, of course.) As helly0d says, you have given no proper expected semantics for arbitrary paths. And without such, your best bet would be to write a dedicated function that simply does exactly what you need for your specific data structure. – Bergi May 13 '15 at 23:29
  • Your specification is not sufficient. The fact that there are multiple levels to get the sku complicates things since the results are interdependent on the paths. Does this _have_ to be done in javascript? Tools like [jq](http://stedolan.github.io/jq/) will do this easily in a well-behaved manner and easy to specify and have all these corner cases handled. `{ storeName: "", items: map({name, sku: .skus[].num}) }` – Jeff Mercado May 13 '15 at 23:50
  • Sorry @Bergi, my eyes got me. In your example, I suppose it would create an array of outputModel, one for each `num`. In your most recent 2 examples, the first, `outputModel.items[].subitems[].sku = items[].subitems[].skus[].num`, would increase the number of `items` (first array) because for every recursion, you're adding to any arrays above. The second example, `outputModel.items[].sku = items.skus.num`, would make `items` contain only one item because the source data maps directly to a single value. – Rob May 13 '15 at 23:53
  • @JeffMercado Yes, JavaScript. I am not trying to establish a query language; what this all comes down to is being able to discover and handle arrays buried down in JSON, and build an output that matches a specific model. For example, this works and is something we all do every day: `data.field = obj.someField`. It's simple, and it works because it's a single value. Now take this example: `data.myArray[].field = array[].field`. At least for me, it's obvious that I want to fill `myArray` with objects from `array[].field`. The code needs to look at this and act on it. Sorry, reached the char limit. – Rob May 14 '15 at 00:05
  • @Rob Your expected result above for Fork has no sku name. is that a mistake? `{ name: "Fork", 0987dfgh } ` – Tom Marulak May 14 '15 at 14:27
  • @TomMarulak Yes, sorry. Fixed – Rob May 14 '15 at 14:43

5 Answers

2

So you have defined a little language to define some data addressing and manipulation rules. Let's think about an approach which will allow you to say

access(apiPath, function(value) { insert(myPath, value); });

The access function finds all the required items in apiPath, then calls back to insert, which inserts them into myPath. Our job is to write functions which create the access and insert functions; or, you could say, "compile" your little language into functions we can execute.

We will write "compilers" called make_accessor and make_inserter, as follows:

function make_accessor(program) {

  return function(obj, callback) {

    return function do_segment(obj, segments) {
      var start    = segments.shift()             // Get first segment
      var pieces   = start.match(/(\w+)(\[\])?/); // Get name and [] pieces
      var property = pieces[1];
      var isArray  = pieces[2];                   // [] on end
      obj          = obj[property];               // drill down

      if (!segments.length) {                     // last segment; callback
        if (isArray) {
          return obj.forEach(callback);
        } else {
          return callback(obj);
        }
      } else {                                    // more segments; recurse
        if (isArray) {                            // array--loop over elts
          obj.forEach(function(elt) { do_segment(elt, segments.slice()); });
        } else {
          do_segment(obj, segments.slice());      // scalar--continue
        }
      }
    }(obj, program.split('.'));
  };
}

We can now make an accessor by calling make_accessor('items[].name').
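As a quick check of the idea, here is a minimal self-contained sketch of the same traversal. The helper name `collect` and the sample `api` object are assumptions for illustration only (the sample is shaped like the question's data); unlike `make_accessor`, it gathers the values into an array instead of invoking a callback.

```javascript
// Hypothetical helper (not part of the answer's code): walk a path such as
// "items[].skus[].num" and return every value it reaches, in order.
function collect(obj, path) {
  var results = [];
  (function walk(current, segments) {
    if (!segments.length) {            // path exhausted: record the value
      results.push(current);
      return;
    }
    var pieces = segments[0].match(/^(\w+)(\[\])?$/);
    var next = current[pieces[1]];     // drill down by property name
    var rest = segments.slice(1);
    if (pieces[2]) {                   // "[]" suffix: visit each element
      next.forEach(function (elt) { walk(elt, rest); });
    } else {
      walk(next, rest);
    }
  })(obj, path.split('.'));
  return results;
}

// Assumed sample data, shaped like the question's API response.
var api = {
  items: [
    { name: 'Hammer', skus: [{ num: '12345qwert' }] },
    { name: 'Bike',   skus: [{ num: 'asdfghhj' }, { num: 'zxcvbn' }] }
  ]
};

console.log(collect(api, 'items[].name'));       // [ 'Hammer', 'Bike' ]
console.log(collect(api, 'items[].skus[].num')); // [ '12345qwert', 'asdfghhj', 'zxcvbn' ]
```

Note that the second call yields three values for two items, which is exactly where the question of "which record does each value belong to" comes from.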

Next, let's write the inserter:

function make_inserter(program) {

  return function(obj, value) {

    return function do_segment(obj, segments) {
      var start    = segments.shift()             // Get first segment
      var pieces   = start.match(/(\w+)(\[\])?/); // Get name and [] pieces
      var property = pieces[1];
      var isArray  = pieces[2];                   // [] on end

      if (segments.length) {                      // more segments
        if (!obj[property]) {
          obj[property] = isArray ? [] : {};
        }
        do_segment(obj[property], segments.slice()); // descend into the child
      } else {                                    // last segment
        obj[property] = value;
      }
    }(obj, program.split('.'));
  };
}

Now, you can express your whole logic as

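var access = make_accessor('items[].name');
var insert = make_inserter('outputModel.items[].uniqueName');

// apiObj (the API response) and myObj (your model) are placeholder names
// for the data objects; pass those here, not the path strings.
access(apiObj, function(val) { insert(myObj, val); });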
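One possible way to make the inserter create an object for each array element is sketched below. This is only an illustration under the assumption that every value reaching a `[]` segment should get its own new element; the name `insertValue` is hypothetical and the behaviour is not a full answer to the multi-rule case.

```javascript
// Hypothetical variant of the inserter: every "[]" segment appends a new
// element to the array, so each inserted value ends up in its own object.
function insertValue(target, path, value) {
  var segments = path.split('.');
  var current = target;
  segments.forEach(function (segment, index) {
    var pieces = segment.match(/^(\w+)(\[\])?$/);
    var property = pieces[1];
    var isArray = Boolean(pieces[2]);
    var isLast = index === segments.length - 1;

    if (isArray) {
      if (!Array.isArray(current[property])) current[property] = [];
      if (isLast) {
        current[property].push(value);   // "a.b[]": append the value itself
      } else {
        var element = {};                // "a.b[].c": append a fresh object
        current[property].push(element);
        current = element;
      }
    } else if (isLast) {
      current[property] = value;         // plain field: assign
    } else {
      if (!current[property]) current[property] = {};
      current = current[property];       // plain object: descend
    }
  });
  return target;
}

var model = {};
['Hammer', 'Bike', 'Fork'].forEach(function (name) {
  insertValue(model, 'outputModel.items[].uniqueName', name);
});
console.log(JSON.stringify(model));
// {"outputModel":{"items":[{"uniqueName":"Hammer"},{"uniqueName":"Bike"},{"uniqueName":"Fork"}]}}
```

Because this sketch always appends, running a second rule against the same model would add further elements instead of filling in the existing ones; correlating several rules needs more than this.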
  • I really like where you were going with this, however I could not get it to work. I posted the exact code I tried to test with in the *****Testing Solution****** – Rob May 08 '15 at 18:51
  • After ripping this answer apart, I got it working with some minor modifications; however, it is not producing proper results. Additionally, it breaks with a path containing two arrays in it, like example 2. Maybe the modifications I made veered off your target path? It seems like we're close. I updated my question with the somewhat working version. – Rob May 09 '15 at 21:23
  • `make_inserter` is buggy; it needs to know to create objects as elements of newly inserted arrays. Sorry for the trouble. I will fix this when I get the chance. –  May 10 '15 at 02:03
0

I have borrowed the earlier answer and made improvements so that it solves both of your examples; it should be generic. However, if you plan to run this sequentially with 2 sets of inputs, then the behavior will be as I have outlined in my comments on your original question.

var apiObj = {
    items: [{
        name: "Hammer",
        skus: [{
            num: "12345qwert"
        }]
    }, {
        name: "Bike",
        skus: [{
            num: "asdfghhj"
        }, {
            num: "zxcvbn"
        }]
    }, {
        name: "Fork",
        skus: [{
            num: "0987dfgh"
        }]
    }]
};

var myObj = { //Previously has values
    storeName: "",
    items: [{
        uniqueName: ""
    }],
    outputModel: {
        items: [{
            name: "Hammer"
        }]
    }
};

/** Also works with this **
var myPath = "outputModel.items[].uniqueName";
var apiPath = "items[].name";
*/
var myPath = "outputModel.items[].sku";
var apiPath = "items[].skus[].num";

function make_accessor(program) {

    return function (obj, callback) {
        (function do_segment(obj, segments) {
            var start = segments.shift() // Get first segment
            var pieces = start.match(/(\w+)(\[\])?/); // Get name and [] pieces
            var property = pieces[1];
            var isArray = pieces[2]; // [] on end
            obj = obj[property]; // drill down

            if (!segments.length) { // last segment; callback
                if (isArray) {
                    return obj.forEach(callback);
                } else {
                    return callback(obj);
                }
            } else { // more segments; recurse
                if (isArray) { // array--loop over elts
                    obj.forEach(function (elt) {
                        do_segment(elt, segments.slice());
                    });
                } else {
                    do_segment(obj, segments.slice()); // scalar--continue
                }
            }
        })(obj, program.split('.'));
    };
}

function make_inserter(program) {

    return function (obj, value) {
        (function do_segment(obj, segments) {
            var start = segments.shift() // Get first segment
            var pieces = start.match(/(\w+)(\[\])?/); // Get name and [] pieces
            var property = pieces[1];
            var isArray = pieces[2]; // [] on end
            if (segments.length) { // more segments
                if (!obj[property]) {
                    obj[property] = isArray ? [] : {};
                }
                do_segment(obj[property], segments.slice());
            } else { // last segment
                if (Array.isArray(obj)) {
                    var addedInFor = false;
                    for (var i = 0; i < obj.length; i++) {
                        if (!(property in obj[i])) {
                            obj[i][property] = value;
                            addedInFor = true;
                            break;
                        }
                    }
                    if (!addedInFor) {
                        var entry = {};
                        entry[property] = value;
                        obj.push(entry);
                    }
                } else obj[property] = value;
            }
        })(obj, program.split('.'));
    };
}

var access = make_accessor(apiPath);
var insert = make_inserter(myPath);

access(apiObj, function (val) {
    insert(myObj, val);
});

console.log(myObj);
sujit
  • I have not gone through the code, but the output of this still doesnt quite hit it. Notice how items array 1,2,3 are missing name property. "{ "storeName": "", "items": [ { "uniqueName": "" } ], "outputModel": { "items": [ { "name": "Hammer", "sku": "12345qwert" }, { "sku": "asdfghhj" }, { "sku": "zxcvbn" }, { "sku": "0987dfgh" } ] } }" – Rob May 12 '15 at 22:19
  • @Rob, i think you missed something. The input i have taken in this example doesn't have name in the mapping set. In one of your earlier comments, you have mentioned that if there are multiple sets, you are willing to run the program multiple times with the given {myPath, apiPath} set. So, if you run this program again with `var myPath = "outputModel.items[].uniqueName"; var apiPath = "items[].name";` set, and myObj from the 1st run, then as per the mapping set, the myObj object will have name property in items array 1,2,3. – sujit May 13 '15 at 05:01
  • IF we had to run it PER set of paths, we would have to pass the result of the prior back into it so that all results end up in the same obj. – Rob May 13 '15 at 21:42
0

(old solution: https://jsfiddle.net/d7by0ywy/):

Here is my new generalized solution when you know the two objects to process in advance (called inp and out here). If you don't know them in advance you can use the trick in the old solution to assign the objects on both sides of = to inp and out (https://jsfiddle.net/uxdney3L/3/).

Restrictions: There has to be the same number of arrays on both sides, and an array has to contain objects. Otherwise it would be ambiguous; you would have to come up with a better grammar to express the rules (or why don't you have functions instead of rules?) if you want it to be more sophisticated.

Example of ambiguity: out.items[].sku=inp[].skus[].num Do you assign an array of the values of num to sku or do you assign an array of objects with the num property?
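To make the two readings concrete, here is a small sketch with the mappings written out by hand. The variable names `sample`, `asValues` and `asObjects` are only for illustration, and the one-element input is an assumed excerpt of the data used below.

```javascript
// Assumed excerpt of the input: a top-level array of items.
var sample = [
  { name: 'Bike', skus: [{ num: 'asdfghhj' }, { num: 'zxcvbn' }] }
];

// Reading 1: sku becomes an array of the num values.
var asValues = sample.map(function (item) {
  return { sku: item.skus.map(function (s) { return s.num; }) };
});

// Reading 2: sku becomes an array of objects that each carry num.
var asObjects = sample.map(function (item) {
  return { sku: item.skus.map(function (s) { return { num: s.num }; }) };
});

console.log(JSON.stringify(asValues));  // [{"sku":["asdfghhj","zxcvbn"]}]
console.log(JSON.stringify(asObjects)); // [{"sku":[{"num":"asdfghhj"},{"num":"zxcvbn"}]}]
```

Both results are defensible for the same rule string, which is why the rule has to spell out the array on the left side as well.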

Data:

rules = [
  'out.items[].name=inp[].name',
  'out.items[].sku[].num=inp[].skus[].num'
];

inp = [{
    'name': 'Hammer',
    'skus':[{'num':'12345qwert','test':'ignore'}]
  },{
    'name': 'Bike',
    'skus':[{'num':'asdfghhj'},{'num':'zxcvbn'}]
  },{
    'name': 'Fork',
    'skus':[{'num':'0987dfgh'}]
}];

Program:

function process() {
  if (typeof out == 'undefined') {
    out = {};
  }
  var j, r;
  for (j = 0; j < rules.length; j++) {
    r = rules[j].split('=');
    if (r.length != 2) {
      console.log('invalid rule: symbol "=" is expected exactly once');
    } else if (r[0].substr(0, 3) != 'out' || r[1].substr(0, 3) != 'inp') {
      console.log('invalid rule: expected "out...=inp..."');
    } else {
      processRule(r[0].substr(3).split('[]'), r[1].substr(3).split('[]'), 0, inp, out);
    }
  }
}

function processRule(l, r, n, i, o) { // left, right, index, in, out
  var t = r[n].split('.');
  for (var j = 0; j < t.length; j++) {
    if (t[j] != '') {
      i = i[t[j]];
    }
  }
  t = l[n].split('.');
  if (n < l.length - 1) {
    for (j = 0; j < t.length - 1; j++) {
      if (t[j] != '') {
        if (typeof o[t[j]] == 'undefined') {
          o[t[j]] = {};
        }
        o = o[t[j]];
      }
    }
    if (typeof o[t[j]] == 'undefined') {
      o[t[j]] = [];
    }
    o = o[t[j]];
    for (j = 0; j < i.length; j++) {
      if (typeof o[j] == 'undefined') {
          o[j] = {};
      }
      processRule(l, r, n + 1, i[j], o[j]);
    }
  } else {
    for (j = 0; j < t.length - 1; j++) {
      if (t[j] != '') {
        if (typeof o[t[j]] == 'undefined') {
          o[t[j]] = {};
        }
        o = o[t[j]];
      }
    }
    o[t[j]] = i;
  }
}

process();
console.log(out);
maraca
  • Hi Maraca, very close, but this right here kills it 'items[].skus[0].num'. If I remove the 0, it blows up. It has to add any items in that second array into my output. I also tried adding another array tier inside 'skus' array 'skus.data[].field' and it wouldnt handle that. – Rob May 13 '15 at 02:15
  • @Rob you can't do this, because then the expression is ambiguous. I think what you want can be simply written as `outputmodel.items[].sku=items[].skus` then it should work. What I tried to explain in the first part. Check it: https://jsfiddle.net/d7by0ywy/1/ – maraca May 13 '15 at 11:30
  • Maraca, I just modified your alert to stringify the output, and as you can see the output is malformed. It definitely looks like you're close, but it has small issues. https://jsfiddle.net/d7by0ywy/4/ – Rob May 13 '15 at 20:16
  • @Rob are you only dealing with 2 objects? like outputModel and items and you know those in advance? could present a very nice solution then – maraca May 13 '15 at 20:28
  • There is always a single source object or array and always a single output obj or array. However what those objects look like the code has no concept of until its passed in. It has to map the two objects just based on the two path strings – Rob May 13 '15 at 21:40
  • @Rob new generalized solution should cover it all, see JFiddle and new answer. This time i'm using recursion. – maraca May 13 '15 at 22:45
  • Sorry maraca, no dice... If you inspect the object, it is way off, as it has several nested arrays. When I stringify it, it is represented as all empty arrays. I also noticed that the out path was modified to `out.items[].sku[].num` where it should be `out.items[].sku`. We are transforming into a flat obj. I can promise you I am more frustrated than anyone on this. Thanks for the effort. https://jsfiddle.net/uxdney3L/2/ – Rob May 13 '15 at 23:13
  • @Rob, just some typos, see here the working version, will update answer https://jsfiddle.net/uxdney3L/3/ ... done – maraca May 13 '15 at 23:52
  • @Rob If you are just looking for a solution... why don't you create a function doing the transformation? If depending on what you receive you need another rule, then you can have an array of functions and a dispatcher who decides which rule has to be applied and calls this function. – maraca May 14 '15 at 23:31
  • I found a library called JSONPATH that looks to do just that. However it is not smart enough alter upstream arrays if it finds nested arrays like my examples. – Rob May 14 '15 at 23:55
  • @Rob, my solution is now really generalized, mentioned restrictions and added example of ambiguity... I think there is nothing more I can do except if we come up with another grammar. – maraca May 15 '15 at 00:00
0

As mentioned in the comments, there is no strict definition of the input format, so it is hard to implement this with perfect error handling and to cover all corner cases.

Here is my lengthy implementation that works on your sample, but might fail for some other cases:

function merge_objects(a, b) {
    var c = {}, attr;
    for (attr in a) { c[attr] = a[attr]; }
    for (attr in b) { c[attr] = b[attr]; }
    return c;
}


var id = {
    inner: null,
    name: "id",
    repr: "id",
    type: "map",
    exec: function (input) { return input; }
};

// set output field
function f(outp, mapper) {
    mapper = typeof mapper !== "undefined" ? mapper : id;
    var repr = "f("+outp+","+mapper.repr+")";
    var name = "f("+outp;
    return {
        inner: mapper,
        name: name,
        repr: repr,
        type: "map",
        clone: function(mapper) { return f(outp, mapper); },
        exec:
        function (input) {
            var out = {};
            out[outp] = mapper.exec(input);
            return out;
        }
    };
}

// set input field
function p(inp, mapper) {
    var repr = "p("+inp+","+mapper.repr+")";
    var name = "p("+inp;
    return {
        inner: mapper,
        name: name,
        repr: repr,
        type: mapper.type,
        clone: function(mapper) { return p(inp, mapper); },
        exec: function (input) {
            return mapper.exec(input[inp]);
        }
    };
}

// process array
function arr(mapper) {
    var repr = "arr("+mapper.repr+")";
    return {
        inner: mapper,
        name: "arr",
        repr: repr,
        type: mapper.type,
        clone: function(mapper) { return arr(mapper); },
        exec: function (input) {
            var out = [];
            for (var i=0; i<input.length; i++) {
                out.push(mapper.exec(input[i]));
            }
            return out;
        }
    };
}

function combine(m1, m2) {
    var type = (m1.type == "flatmap" || m2.type == "flatmap") ? "flatmap" : "map";
    var repr = "combine("+m1.repr+","+m2.repr+")";
    return {
        inner: null,
        repr: repr,
        type: type,
        name: "combine",
        exec:
        function (input) {
            var out1 = m1.exec(input);
            var out2 = m2.exec(input);
            var out, i, j;


            if (m1.type == "flatmap" && m2.type == "flatmap") {
                out = [];
                for (i=0; i<out1.length; i++) {
                    for (j=0; j<out2.length; j++) {
                        out.push(merge_objects(out1[i], out2[j]));
                    }
                }
                return out;
            }

            if (m1.type == "flatmap" && m2.type != "flatmap") {
                out = [];
                for (i=0; i<out1.length; i++) {
                    out.push(merge_objects(out1[i], out2));
                }
                return out;
            }

            if (m1.type != "flatmap" && m2.type == "flatmap") {
                out = [];
                for (i=0; i<out2.length; i++) {
                    out.push(merge_objects(out2[i], out1));
                }
                return out;
            }

            return merge_objects(out1, out2);
        }
    };
}

function flatmap(mapper) {
    var repr = "flatmap("+mapper.repr+")";
    return {
        inner: mapper,
        repr: repr,
        type: "flatmap",
        name: "flatmap",
        clone: function(mapper) { return flatmap(mapper); },
        exec:
        function (input) {
            var out = [];
            for (var i=0; i<input.length; i++) {
                out.push(mapper.exec(input[i]));
            }
            return out;
        }
    };
}



function split(s, t) {
    var i = s.indexOf(t);

    if (i == -1) return null;
    else {
        return [s.slice(0, i), s.slice(i + t.length)]; // skip the whole separator
    }
}

function compile_one(inr, outr) {
    inr = (inr.charAt(0) == ".") ? inr.slice(1, inr.length) : inr;
    outr = (outr.charAt(0) == ".") ? outr.slice(1, outr.length) : outr;

    var box = split(inr, "[]");
    var box2 = split(outr, "[]");
    var m, ps, fs, i, j;

    if (box == null && box2 == null) { // no array!
        m = id;

        ps = inr.split(".");
        fs = outr.split(".");

        for (i=0; i<fs.length; i++) { m = f(fs[i], m); }
        for (j=0; j<ps.length; j++) { m = p(ps[j], m); }

        return m;
    }

    if (box != null && box2 != null) { // array on both sides
        m = arr(compile_one(box[1], box2[1]));

        ps = box[0].split(".");
        fs = box2[0].split("."); // output prefix comes from the output path

        for (i=0; i<fs.length; i++) { m = f(fs[i], m); }
        for (j=0; j<ps.length; j++) { m = p(ps[j], m); }

        return m;
    }

    if (box != null && box2 == null) { // flatmap
        m = flatmap(compile_one(box[1], outr));

        ps = box[0].split(".");

        for (j=0; j<ps.length; j++) { m = p(ps[j], m); }

        return m;
    }

    return null;
}

function merge_rules(m1, m2) {
    if (m1 == null) return m2;
    if (m2 == null) return m1;

    if (m1.name == m2.name && m1.inner != null) {
        return m1.clone(merge_rules(m1.inner, m2.inner));
    } else {
        return combine(m1, m2);
    }

}

var input = {
    store: "myStore",
    items: [
        {name: "Hammer", skus:[{num:"12345qwert"}]},
        {name: "Bike", skus:[{num:"asdfghhj"}, {num:"zxcvbn"}]},
        {name: "Fork", skus:[{num:"0987dfgh"}]}
    ]
};

var m1 = compile_one("items[].name", "items[].name");
var m2 = compile_one("items[].skus[].num", "items[].sku");
var m3 = compile_one("store", "storeName");
var m4 = merge_rules(m3,merge_rules(m1, m2));
var out = m4.exec(input);


alert(JSON.stringify(out));
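Since `merge_rules` returns the other argument when one side is `null`, a list of compiled rules can in principle be folded into one mapper. The sketch below shows only that folding pattern; `merge_all` and the stand-in `describe` function are hypothetical names, and I have not verified that the folded mapper produces the desired output for every combination of rules.

```javascript
// Hypothetical helper: fold any number of compiled mappers into one by
// applying a two-argument merge function pairwise, left to right.
function merge_all(mappers, merge) {
  return mappers.reduce(function (acc, mapper) {
    return merge(acc, mapper);
  }, null);
}

// Stand-in merge used only to show the call order; with the code above
// you would pass merge_rules instead.
function describe(a, b) {
  if (a == null) return b;
  if (b == null) return a;
  return 'merge(' + a + ',' + b + ')';
}

console.log(merge_all(['m1', 'm2', 'm3'], describe)); // merge(merge(m1,m2),m3)
```

With the real functions this would be called as `merge_all([m1, m2, m3], merge_rules)`.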
Tienan Ren
  • The output is accurate and handled a few model alterations I tested. I did have trouble with merge_rules, as it only takes two rules, and when I altered one to be like this, compile_one("store", "storeName");, it wouldn't handle it. This gives me something that works that I can start working with. I am marking your answer as accepted. Thanks a ton!!! If you have any ideas on how to make merge handle any number of path sets, or why it broke when I altered it, that would be much appreciated. https://jsfiddle.net/ads4kubf/ – Rob May 15 '15 at 01:49
  • @Rob Fixed the original code to work for your case, the cause is that I should reset the mapper type to normal after putting the output into a field. Now your changes should work. I will check tomorrow to see if it could work for more than 2 rules. – Tienan Ren May 15 '15 at 04:33
0

Well, an interesting problem. Programmatically constructing nested objects from a property accessor string (or the reverse) isn't much of a problem, even doing so with multiple descriptors in parallel. Where it does get complicated is arrays, which require iteration; and that isn't as much fun any more once there are different levels on the setter and getter sides and multiple descriptor strings in parallel.

So first we need to distinguish the array levels of each accessor description in the script, and parse the text:

function parse(script) {
    return script.split(/\s*[;\r\n]+\s*/g).map(function(line) {
        var assignment = line.split(/\s*=\s*/);
        return assignment.length == 2 ? assignment : null; // console.warn ???
    }).filter(Boolean).map(function(as) {
        as = as.map(function(accessor) {
            var parts = accessor.split("[]").map(function(part) {
                return part.split(".");
            });
            for (var i=1; i<parts.length; i++) {
                // assert(parts[i][0] == "")
                var prev = parts[i-1][parts[i-1].length-1];
                parts[i][0] = prev.replace(/s$/, ""); // singular :-)
            }
            return parts;
        });
        if (as[0].length == 1 && as[1].length > 1) // getter contains array but setter does not
            as[0].unshift(["output"]); // implicitly return array (but better throw an error)
        return {setter:as[0], getter:as[1]};
    });
}

With that, the textual input can be made into a usable data structure, and now looks like this:

[{"setter":[["outputModel","items"],["item","name"]],
  "getter":[["items"],["item","name"]]},
 {"setter":[["outputModel","items"],["item","sku"]],
  "getter":[["items"],["item","skus"],["sku","num"]]}]

The getters already transform nicely into nested loops like

for (item of items)
    for (sku of item.skus)
        … sku.num …;

and that's exactly where we are headed. Each of those rules is relatively easy to process on its own, copying properties on objects and iterating array by array, but here comes our most crucial issue: we have multiple rules. The basic solution for iterating multiple arrays together is to create their cartesian product, and this is indeed what we will need. However, we want to restrict it a lot: instead of creating every combination of all names and all nums in the input, we want to group them by the item they come from.
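
The difference between the full product and the grouped one can be sketched with hand-written loops. Both function names are made up for this illustration, and the data is a shortened copy of the question's input:

```javascript
// Sample data shaped like the question's input (shortened).
var items = [
    { name: "Hammer", skus: [{ num: "12345qwert" }] },
    { name: "Bike",   skus: [{ num: "asdfghhj" }, { num: "zxcvbn" }] }
];

// Full cartesian product: every name paired with every sku of every item.
function fullProduct(items) {
    var rows = [];
    items.forEach(function(a) {
        items.forEach(function(b) {
            b.skus.forEach(function(sku) {
                rows.push({ name: a.name, sku: sku.num });
            });
        });
    });
    return rows;
}

// Grouped product: each sku is paired only with the name of its own item.
function groupedProduct(items) {
    var rows = [];
    items.forEach(function(item) {
        item.skus.forEach(function(sku) {
            rows.push({ name: item.name, sku: sku.num });
        });
    });
    return rows;
}

console.log(fullProduct(items).length);    // 6
console.log(groupedProduct(items).length); // 3
```

The grouped variant is what the rules are meant to express, because both getters start from the same items[] level.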

To do so, we'll build a kind of prefix tree for our output structure that contains generators of objects, each of those recursively being a tree for the respective output substructure.

function multiGroupBy(arr, by) {
    return arr.reduce(function(res, x) {
        var p = by(x);
        (res[p] || (res[p] = [])).push(x);
        return res;
    }, {});
}
function group(rules) {
    var paths = multiGroupBy(rules, function(rule) {
        return rule.setter[0].slice(1).join(".");
    });
    var res = [];
    for (var path in paths) {
        var pathrules = paths[path],
            array = [];
        for (var i=0; i<pathrules.length; i++) {
            var rule = pathrules[i];
            var comb = 1 + rule.getter.length - rule.setter.length;
            if (rule.setter.length > 1) // it's an array
                array.push({
                    generator: rule.getter.slice(0, comb),
                    next: {
                        setter: rule.setter.slice(1),
                        getter: rule.getter.slice(comb)
                    }
                })
            else if (rule.getter.length == 1 && i==0)
                res.push({
                    set: rule.setter[0],
                    get: rule.getter[0]
                });
            else
                console.error("invalid:", rule);
        }
        if (array.length)
            res.push({
                set: pathrules[0].setter[0],
                cross: product(array)
            });
    }
    return res;
}
function product(pathsetters) {
    var groups = multiGroupBy(pathsetters, function(pathsetter) {
        return pathsetter.generator[0].slice(1).join(".");
    });
    var res = [];
    for (var genstart in groups) {
        var creators = groups[genstart],
            nexts = [],
            nests = [];
        for (var i=0; i<creators.length; i++) {
            if (creators[i].generator.length == 1)
                nexts.push(creators[i].next);
            else
                nests.push({generator: creators[i].generator.slice(1), next: creators[i].next});
        }
        res.push({
            get: creators[0].generator[0],
            cross: group(nexts).concat(product(nests))
        });
    }
    return res;
}
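
As a small illustration of the grouping step, this sketch runs multiGroupBy (copied from above so the snippet is self-contained) on the two parsed rules, with the same key function that group uses. The variable names rules and byTarget are only for this example:

```javascript
// Copied from the answer so that the snippet runs on its own.
function multiGroupBy(arr, by) {
    return arr.reduce(function(res, x) {
        var p = by(x);
        (res[p] || (res[p] = [])).push(x);
        return res;
    }, {});
}

// The two parsed rules shown earlier.
var rules = [
    { setter: [["outputModel","items"],["item","name"]],
      getter: [["items"],["item","name"]] },
    { setter: [["outputModel","items"],["item","sku"]],
      getter: [["items"],["item","skus"],["sku","num"]] }
];

// Same key as in group(): the first setter level without the root name.
var byTarget = multiGroupBy(rules, function(rule) {
    return rule.setter[0].slice(1).join(".");
});

console.log(JSON.stringify(Object.keys(byTarget))); // ["items"]
console.log(byTarget.items.length);                 // 2
```

Both rules land in the same bucket because they write into the same output array, which is what lets them be merged into one loop nest.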

Now, our ruleset group(parse(script)) looks like this:

[{
    "set": ["outputModel","items"],
    "cross": [{
        "get": ["items"],
        "cross": [{
            "set": ["item","name"],
            "get": ["item","name"]
        }, {
            "get": ["item","skus"],
            "cross": [{
                "set": ["item","sku"],
                "get": ["sku","num"]
            }]
        }]
    }]
}]

and that is a structure we can actually work with, as it now clearly conveys how to match up all those nested arrays and the objects within them. Let's dynamically interpret this, building an output for a given input:

function transform(structure, input, output) {
    for (var i=0; i<structure.length; i++) {
        output = assign(output, structure[i].set.slice(1), getValue(structure[i], input));
    }
    return output;
}
function retrieve(val, props) {
    return props.reduce(function(o, p) { return o[p]; }, val);
}
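function assign(obj, props, val) {
    if (!obj) {
        if (!props.length) return val;
        obj = {};
    }
    // walk down as long as the intermediate objects already exist
    for (var j=0, o=obj; j<props.length-1 && o!=null && o[props[j]]; o=o[props[j++]]);
    // wrap the value in the levels that are still missing and attach it
    // to the deepest existing object (o), not to the root
    o[props[j]] = props.slice(j+1).reduceRight(function(val, p) {
        var o = {};
        o[p] = val;
        return o;
    }, val);
    return obj;
}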
function getValue(descriptor, input) {
    if (descriptor.get) // && !cross
        return retrieve(input, descriptor.get.slice(1));
    var arr = [];
    descriptor.cross.reduce(function horror(next, d) {
        if (d.set) // a setter: extend the item built by the previous steps
            return function (inp, cb) {
                next(inp, function(res){
                    cb(assign(res, d.set.slice(1), getValue(d, inp)));
                });
            };
        else // it's a crosser: loop over an array
            return function(inp, cb) {
                var g = retrieve(inp, d.get.slice(1)),
                    // steps outside this loop must keep reading from the outer input
                    outer = function(_, cb) { next(inp, cb); },
                    e = d.cross.reduce(horror, outer);
                for (var i=0; i<g.length; i++)
                    e(g[i], cb);
            };
    }, function innermost(inp, cb) {
        cb(); // start to create an item
    })(input, function(res) {
        arr.push(res); // store the item
    });
    return arr;
}

And this does indeed work with

var result = transform(group(parse(script)), items); // your expected result
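
The two path helpers do most of the work in the interpreter, so it may help to try the idea in isolation. The following is a simplified sketch with hypothetical names (getPath and setPath), not the retrieve and assign functions from above:

```javascript
// Hypothetical helpers mirroring what retrieve() and assign() are for.

// Read a nested value by following a list of property names.
function getPath(obj, props) {
    return props.reduce(function(o, p) { return o[p]; }, obj);
}

// Write a nested value, creating intermediate objects where needed.
function setPath(obj, props, val) {
    var o = obj;
    for (var i = 0; i < props.length - 1; i++) {
        if (o[props[i]] == null) o[props[i]] = {}; // create missing level
        o = o[props[i]];
    }
    o[props[props.length - 1]] = val;
    return obj;
}

var model = setPath({ storeName: "" }, ["meta", "owner", "name"], "Rob");
console.log(JSON.stringify(model));
// {"storeName":"","meta":{"owner":{"name":"Rob"}}}
console.log(getPath(model, ["meta", "owner", "name"])); // Rob
```

Unlike assign, this sketch expects an existing target object and at least one property name.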

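But we can do better and get much better performance by compiling the structure into a plain function once, instead of interpreting it on every call: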

function compile(structure) {
    function make(descriptor) {
        if (descriptor.get)
            return {inputName: descriptor.get[0], output: descriptor.get.join(".") };

        var outputName = descriptor.set[descriptor.set.length-1];
        var loops = descriptor.cross.reduce(function horror(next, descriptor) {
            if (descriptor.set)
                return function(it, cb) {
                    return next(it, function(res){
                        res.push(descriptor)
                        return cb(res);
                    });
                };
            else // it's a crosser
                return function(it, cb) {
                    var arrName = descriptor.get[descriptor.get.length-1],
                        itName = String.fromCharCode(it);
                    var inner = descriptor.cross.reduce(horror, next)(it+1, cb);
                    return {
                        inputName: descriptor.get[0],
                        statement:  (descriptor.get.length>1 ? "var "+arrName+" = "+descriptor.get.join(".")+";\n" : "")+
                                    "for (var "+itName+" = 0; "+itName+" < "+arrName+".length; "+itName+"++) {\n"+
                                    "var "+inner.inputName+" = "+arrName+"["+itName+"];\n"+
                                    inner.statement+
                                    "}\n"
                    };
                };
        }, function(_, cb) {
            return cb([]);
        })(105, function(res) {
            var item = joinSetters(res);
            return {
                inputName: item.inputName,
                statement: (item.statement||"")+outputName+".push("+item.output+");\n"
            };
        });
        return {
            statement: "var "+outputName+" = [];\n"+loops.statement,
            output: outputName,
            inputName: loops.inputName
        };
    }
    function joinSetters(descriptors) {
        if (descriptors.length == 1 && descriptors[0].set.length == 1)
            return make(descriptors[0]);
        var paths = multiGroupBy(descriptors, function(d){ return d.set[1] || console.error("multiple assignments on "+d.set[0], d); });
        var statements = [],
            inputName;
        var props = Object.keys(paths).map(function(p) {
            var d = joinSetters(paths[p].map(function(d) {
                var names = d.set.slice(1);
                names[0] = d.set[0]+"_"+names[0];
                return {set:names, get:d.get, cross:d.cross};
            }));
            inputName = d.inputName;
            if (d.statement)
                statements.push(d.statement)
            return JSON.stringify(p) + ": " + d.output;
        });
        return {
            inputName: inputName,
            statement: statements.join(""),
            output: "{"+props.join(",")+"}"
        };
    }
    var code = joinSetters(structure);
    return new Function(code.inputName, code.statement+"return "+code.output+";");
}
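
The core trick in compile is building source text and handing it to new Function. A much smaller sketch of the same idea, with the hypothetical name compileGetter and assuming every path segment is a valid identifier from a trusted source:

```javascript
// Hypothetical mini-compiler: turns ["item", "skus"] into the equivalent of
//   function(item) { return item.skus; }
// Assumes all path segments are valid identifiers and come from trusted
// input, since the text is evaluated as code.
function compileGetter(path) {
    var body = "return " + path.join(".") + ";";
    return new Function(path[0], body);
}

var getSkus = compileGetter(["item", "skus"]);
console.log(getSkus({ skus: [1, 2] }).length); // 2
```

The string is parsed once when the function is created; after that, calling it costs the same as calling a hand-written function, which is where the speedup over the interpreting version would come from.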

So here is what you will get in the end:

> var example = compile(group(parse("outputModel.items[].name = items[].name;outputModel.items[].sku = items[].skus[].num;")))
function(items) {
    var outputModel_items = []; 
    for (var i = 0; i < items.length; i++) {
        var item = items[i];
        var skus = item.skus;
        for (var j = 0; j < skus.length; j++) {
            var sku = skus[j];
            outputModel_items.push({"name": item.name,"sku": sku.num});
        }
    }
    return {"items": outputModel_items};
}
> var flatten = compile(group(parse("as[]=bss[][]")))
function(bss) {
    var as = []; 
    for (var i = 0; i < bss.length; i++) {
        var bs = bss[i];
        for (var j = 0; j < bs.length; j++) {
            var b = bs[j];
            as.push(b);
        }
    }
    return as;
}
> var parallelRecords = compile(group(parse("x.as[]=y[].a; x.bs[]=y[].b")))
function(y) {
    var x_as = []; 
    for (var i = 0; i < y.length; i++) {
        var y = y[i];
        x_as.push(y.a);
    }
    var x_bs = []; 
    for (var i = 0; i < y.length; i++) {
        var y = y[i];
        x_bs.push(y.b);
    }
    return {"as": x_as,"bs": x_bs};
}

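And now you can easily pass your input data to that dynamically created function and it will be transformed quite fast :-) One caveat: the loop variable names are derived by stripping a trailing "s" from the array name, so an array called y yields var y = y[i] as in the last example, which overwrites the parameter and ends the loops early. Use plural array names to avoid that.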
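
As a quick check, here is the first generated function from the listing above, written out by hand and applied to the data from the question:

```javascript
// The body is the generated code shown above for the two rules
// outputModel.items[].name = items[].name and
// outputModel.items[].sku = items[].skus[].num
function example(items) {
    var outputModel_items = [];
    for (var i = 0; i < items.length; i++) {
        var item = items[i];
        var skus = item.skus;
        for (var j = 0; j < skus.length; j++) {
            var sku = skus[j];
            outputModel_items.push({"name": item.name, "sku": sku.num});
        }
    }
    return {"items": outputModel_items};
}

// Input data from the question.
var items = [
    { name: "Hammer", skus: [{ num: "12345qwert" }] },
    { name: "Bike",   skus: [{ num: "asdfghhj" }, { num: "zxcvbn" }] },
    { name: "Fork",   skus: [{ num: "0987dfgh" }] }
];

var result = example(items);
console.log(result.items.length);           // 4
console.log(JSON.stringify(result.items[2])); // {"name":"Bike","sku":"zxcvbn"}
```

The result contains only the items property, since no rule maps storeName; copy it over separately if you need it.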

Bergi