How to loop trough a large object that has child objects and arrays?

Question

I am developing an application where I need to implement simple search functionality, so I have this large object with child objects and arrays. Usually I access data in that object like this:

list[id][day][show].title

but now I need to check if that title is equal to some input value, so I created this function:

 getSimilarShows = (allShows, expectedShow) => {
      const titles = []
      Object.values(Object.values(allShows)).map((days) =>
        Object.values(days).map((items) =>
          Object.values(items).map((show) => {
              if (show.title === expectedShow) {
                titles.push(show.title)
              }
          })
        )
      )
    }

This gives me an array of titles but I also need the id, the day and the show saved in that array.

Here is the data example:

{
1: {29: [{0: {id: 0000, title: 'some title'}, 
         {1: {id: 0000, title: 'some title'},
         ...], 
    30: [{0: {id: 0000, title: 'some title'}, 
         {1: {id: 0000, title: 'some title'},
         ...],
   ...}, 
6: {29: [{0: {id: 0000, title: 'some title'}, 
         {1: {id: 0000, title: 'some title'},
         ...], 
    30: [{0: {id: 0000, title: 'some title'}, 
         {1: {id: 0000, title: 'some title'},
         ...],
   ...},  
...}

How to properly save them?

show.id is not what I need. I need to push object keys (1 or 6, 29 or 30, 0 or 1) to that array, so basically if the title is equal to some value it has to trace the whole path to that title. — Gido, Sep 30 '20 at 10:59

Scott Sauyet · Accepted Answer · 2020-09-30T18:40:57.290

Your data structure is not really recursive. Not only does each level represent a different sort of value (some sort of group, a day, perhaps an event) but your structure is not consistent at different levels. (Why the arrays in the middle of the hierarchy?)

So recursive processing won't do here. But we can traverse the structure in a fairly clear manner with something like this:

const getSimilarShows = (shows, title) => 
  Object .entries (shows)
    .flatMap (([group, days]) =>
      Object .entries (days)
        .flatMap (([day, events]) => 
          events.flatMap ((ev) => 
            Object .entries (ev) 
              .filter (([_, {title: t}]) => t === title)
              .map (([event, {title, ...rest}]) => ({group, day, event, title, ...rest}))
          )
        )
    )


const shows = {
  1: {
    29: [
      {0: {id: '0001', title: 'title a'}},
      {1: {id: '0002', title: 'title b'}},
    ],
    30: [
      {0: {id: '0003', title: 'title c'}},
      {1: {id: '0004', title: 'title a'}},
    ]
  },
  6: {
    29: [
      {0: {id: '0005', title: 'title d'}},
      {1: {id: '0006', title: 'title b'}},
    ],
    30: [
      {0: {id: '0007', title: 'title a'}},
      {1: {id: '0008', title: 'title c'}},
    ]
  }
}

console .log (
  getSimilarShows (shows, 'title a')
)

.as-console-wrapper {max-height: 100% !important; top: 0}

I rarely like code that is nested so deeply. But my first approach started with getSimilarShows calling getDays calling getEvents, and at each level I had to map the results back into an object with the level key found (group, day, event.) It was much more code and still no more clear than this version.

Speaking of those group keys, I had to make them up. I don't know what the outermost 1 and 6, which I called group, represent, nor the repeated inner 0 and 1, which I called event. I'm pretty sure that 29 and 30 were supposed to represent days. So you may need to change those properties and the corresponding variables.

There is also a level I did not name. I don't particularly understand the structure inside, say, 29 or 30. Why is there an array of single integer-keyed properties in there, rather than an object like the higher levels? I didn't include this index in the result. But if you need it, this line:

          events.flatMap ((ev) =>

could just become

          events.flatMap ((ev, index) =>

and you could add index to the returned object.

If you can, though, I would recommend looking into whether that array is even necessary.

I'm so happy you decided to tackle this one. I wanted to help out but there were so many issues with the question I thought I might never get to the bottom of what I wanted to say. This says all of that and more. — Mulan, Sep 30 '20 at 18:26
I suspected the `group` keys were actually `month` numbers... but I couldn't make sense of the inner most `{ 0: ... }` and `{ 1: ... }` objects. These misuses of data structures by OP makes it very difficult to understand the true intention, and more often than not, when you suggest an alternative structure it is met with push-back. — Mulan, Sep 30 '20 at 18:28
You answer inspired a function I've never written but is surprisingly useful :D — Mulan, Sep 30 '20 at 19:03
Thank you for explaining and an example. Your solution seems to be the fastest one. The data structure is as follows list[channelID][dayOfTheMonth][activeElement] — Gido, Oct 01 '20 at 08:43
@Gido: I would ask one more question, though. Is the answer from Thankyou *fast enough*? Because it's a much nicer solution. I'm happy just that mine served as an inspiration for that elegant solution. Of course mine is faster; custom code will nearly always be faster than generic code. But if the performance of that generic code is good enough, I would stick with it. — Scott Sauyet, Oct 01 '20 at 13:12
@ScottSauyet well the reason I need better performace is that it is for smart TV application. Older tvs are very slow. — Gido, Oct 01 '20 at 13:26
@Gido: It's obviously your call here. And there are times when performance has to trump most everything else. But you wouldn't normally choose a 300-line convoluted solution over a simple 3-line one to save one microsecond from a one-second process. So you need criteria to use in making such decisions. To me, then the question is about whether the performance meets your needs. — Scott Sauyet, Oct 01 '20 at 14:22

StepUp · Answer 2 · 2020-09-30T14:54:23.480

We can use Object.entries() method to get keys and its values and then just filter them based on yourcondition:

const getArrayFromObject = (obj) => {
        let items = [];
        Object.entries(obj)
            .forEach(([k, v])=> Object.entries(v).forEach(([k1, v1]) =>
                v1.forEach(item => item.hasOwnProperty('0') ? items.push({ id: item[0].id, day: +k1, title: item[0].title, show: 0 }) :
                    items.push({ id: item[1].id, day: +k1, title: item[1].title, show: 1 }) )));
        return items;
    }

An example:

const obj = {
    1: {29: [
            { 0: {id: 0001, title: 'some title1'}},
            { 1: {id: 0002, title: 'some title2'}},
        ],
        30: [{0: {id: 0000, title: 'some title'}},
             {1: {id: 0000, title: 'some title'}},
             ],
       },
    6: {29: [{0: {id: 0000, title: 'some title'}},
             {1: {id: 0000, title: 'some title'}},
             ],
        30: [{0: {id: 0000, title: 'some title'}},
             {1: {id: 0000, title: 'some title'}},
             ],
       },
    };


const getArrayFromObject = (obj) => {
    let items = [];
    Object.entries(obj)
        .forEach(([k, v])=> Object.entries(v).forEach(([k1, v1]) =>
            v1.forEach(item => item.hasOwnProperty('0') ? items.push({ id: item[0].id, day: +k1, title: item[0].title, show: 0 }) :
                items.push({ id: item[1].id, day: +k1, title: item[1].title, show: 1 }) )));
    return items;
}

const result = getArrayFromObject(obj).filter(f => f.id == 1 && f.title == 'some title1');
console.log(result);

Or using recursive approach it is possible to get all arrays from the object and then just filter it by desired keys:

const items = [];
const getArrayFromObject = obj => {
    for (var k in obj)
    {
        if (typeof obj[k] == "object" && obj[k] !== null)
            getArrayFromObject(obj[k]);
        else
            items.push(obj);
    }
}

getArrayFromObject(obj);
let result = items.filter(f => f.id == 1 && f.title == 'some title1');

An example:

const obj = {
    1: {29: [
            { 0: {id: 0001, title: 'some title1'}},
            { 1: {id: 0002, title: 'some title2'}},
        ],
        30: [{0: {id: 0000, title: 'some title'}},
             {1: {id: 0000, title: 'some title'}},
             ],
       },
    6: {29: [{0: {id: 0000, title: 'some title'}},
             {1: {id: 0000, title: 'some title'}},
             ],
        30: [{0: {id: 0000, title: 'some title'}},
             {1: {id: 0000, title: 'some title'}},
             ],
       },
    };

const items = [];
const getArrayFromObject = obj => {
    for (var k in obj)
    {
        if (typeof obj[k] == "object" && obj[k] !== null)
            getArrayFromObject(obj[k]);
        else
            items.push(obj);
    }
}

getArrayFromObject(obj);
let result = items.filter(f => f.id == 1 && f.title == 'some title1');

console.log(result)

If we want to stick with the above approach and we want to get their keys, than we can use the following approach:

const obj = {
1: {29: [
        { 0: {id: 0001, title: 'some title1'}},
        { 1: {id: 0002, title: 'some title2'}},
    ],
    30: [{0: {id: 0000, title: 'some title'}},
         {1: {id: 0000, title: 'some title'}},
         ],
   },
6: {29: [{0: {id: 0000, title: 'some title'}},
         {1: {id: 0000, title: 'some title'}},
         ],
    30: [{0: {id: 0000, title: 'some title'}},
         {1: {id: 0000, title: 'some title'}},
         ],
   },
};

let items = [];

const getArrayFromObject = (obj, keys) => {
    for (var k in obj)
    {
        if (typeof obj[k] == "object" && obj[k] !== null)
            getArrayFromObject(obj[k], keys ? `${keys}, ${k}` : k);
        else
            items.push({...obj, keys});
    }
}

getArrayFromObject(obj);
let uniqueItems = items.filter((f, index, self) =>
    index === self.findIndex((t) => (
        t.id === f.id && t.title === f.title
  )));

uniqueItems = uniqueItems.map(s => ({id: s.id, day: +(s.keys.split(',')[1]), show: +(s.keys.split(',')[2]), title: s.title }));
console.log(uniqueItems)

Okay, so it is a good example of recursive function but how do I save object keys together with the title? From the above example I need a new object = {id: 1, day: 29, show: 0, title: "some title1"}? — Gido, Sep 30 '20 at 12:52

Mulan · Answer 3 · 2020-11-09T14:50:42.223

@Scott has paid you a tremendous favour to explain the issues with your proposed data shape and program. He is correct that recursion is not a particular great fit for this problem. His answer did spark an idea though and I will share it below.

Here we have collapse which allows you to collapse an object of arbitrary shape using a variable-length sequence of named keys -

if name is empty, the base case has been reached. Combine intermediate result, r, and input, t
(inductive) name is not empty. Collapse input t and recur with smaller sub-problem

const collapse = ([ name, ...more ], t = {}, r = {}) =>
  name === undefined
    ? [ { ...r, ...t } ]    // 1
    : Object                // 2
        .entries(t)
        .flatMap
          ( ([ k, v ]) =>
              collapse(more, v, { ...r, [name]: k }) // <- recursion
          )

const result =
  collapse(["month", "day", "event", "_"], shows)

console.log(JSON.stringify(result, null, 2))

[ { "month": "1", "day": "29", "event": "0", "_": "0", "id": "0001", "title": "title a" }
, { "month": "1", "day": "29", "event": "1", "_": "1", "id": "0002", "title": "title b" }
, { "month": "1", "day": "30", "event": "0", "_": "0", "id": "0003", "title": "title c" }
, { "month": "1", "day": "30", "event": "1", "_": "1", "id": "0004", "title": "title a" }
, { "month": "6", "day": "29", "event": "0", "_": "0", "id": "0005", "title": "title d" }
, { "month": "6", "day": "29", "event": "1", "_": "1", "id": "0006", "title": "title b" }
, { "month": "6", "day": "30", "event": "0", "_": "0", "id": "0007", "title": "title a" }
, { "month": "6", "day": "30", "event": "1", "_": "1", "id": "0008", "title": "title c" }
]

Writing getSimilarShows is easier now thanks to collapse -

const getSimilarShows = (shows = [], query = "") =>
  collapse(["month", "day", "event", "_"], shows) // <-
    .filter(v => v.title === query)

const result =
  getSimilarShows(shows, "title b")

console.log(JSON.stringify(result, null, 2))

[ { "month": "1", "day": "29", "event": "1", "_": "1", "id": "0002", "title": "title b" }
, { "month": "6", "day": "29", "event": "1", "_": "1", "id": "0006", "title": "title b" }
]

caution

NB collapse is somewhat reckless and does not protect you from trying to collapse an object more than is possible. For example, if you provide four (4) named keys but the object is only nested two (2) levels deep, an empty result, [], will be returned. This is likely unexpected and it would be better to throw a runtime error in this case.

One obvious improvement would be the ability to "skip" a level using a known name, such as "_" above -

const collapse = ([ name, ...more ], t = {}, r = {}) =>
  name === undefined
    ? [ { ...r, ...t } ]
    : Object
        .entries(t)
        .flatMap
          ( ([ k, v ]) =>
              name === "_"  // <- skip this level?
                ? collapse(more, v, r)  // <- new behaviour
                : collapse(more, v, { ...r, [name]: k }) // <- original
          )

const result =
  collapse(["month", "day", "event", "_"], shows)

console.log(JSON.stringify(result, null, 2))

With this update "_" keys do not appear in the output below -

[ { "month": "1", "day": "29", "event": "0", "id": "0001", "title": "title a" }
, { "month": "1", "day": "29", "event": "1", "id": "0002", "title": "title b" }
, { "month": "1", "day": "30", "event": "0", "id": "0003", "title": "title c" }
, { "month": "1", "day": "30", "event": "1", "id": "0004", "title": "title a" }
, { "month": "6", "day": "29", "event": "0", "id": "0005", "title": "title d" }
, { "month": "6", "day": "29", "event": "1", "id": "0006", "title": "title b" }
, { "month": "6", "day": "30", "event": "0", "id": "0007", "title": "title a" }
, { "month": "6", "day": "30", "event": "1", "id": "0008", "title": "title c" }
]

@Scott offers an excellent suggestion to use a native Symbol instead or a string-based key. Eyes on collapse.skip below -

const collapse = (...) =>
  name === undefined
    ? //...
    : Object
        .entries(t)
        .flatMap
          ( ([ k, v ]) =>
              name === collapse.skip // <- known symbol
                ? //...
                : //...
          )

collapse.skip = // <- define symbol
  Symbol("skip")

Now instead of giving special behaviour to "_", we use collapse.skip. To keep examples consistent, we only skip one level of nesting, but we could effectively skip any number of levels we wish -

const result =
  collapse(["month", "day", "event", collapse.skip], shows) // <-

console.log(JSON.stringify(result, null, 2))
// ...

alternative implementation

I've been spending some time thinking about collapse a little bit and I wonder how modifying the call site can increase its utility -

function collapse (t = {}, ...f)
{ function loop (t, c, r)
  { if (c >= f.length)
      return [ { ...r, ...t } ]
    else
      return Object
        .entries(t)
        .flatMap(([ k, v ]) => loop(v, c + 1, { ...r, ...f[c](k) }))
  }

  return loop(t, 0, {})
}

const shows =
  {1:{29:[{0:{id:'0001',title:'title a'}},{1:{id:'0002',title:'title b'}}],30:[{0:{id:'0003',title:'title c'}},{1:{id:'0004',title:'title a'}}]},6:{29:[{0:{id:'0005',title:'title d'}},{1:{id:'0006',title:'title b'}}],30:[{0:{id:'0007',title:'title a'}},{1:{id:'0008',title:'title c'}}]}}

const result =
  collapse
    ( shows
    , v => ({ month: v })
    , v => ({ day: v })
    , v => ({ event: v })
    , v => ({})             // <- "skip"
    )

console.log(JSON.stringify(result, null, 2))

list-like array destructuring

Thinking about array indexes is painful though, I agree with @Scott's comment below. But destructuring with rest arguments can create a lot of intermediate values. Here's one technique, likeList, I've been toying around with that seems to have good ergonomics and memory footprint -

const likeList = (t = [], c = 0) =>
  ({ [Symbol.iterator]: function* () { yield t[c]; yield likeList(t, c + 1) } })

function collapse (t = {}, ...f)
{ function loop (t, [f, fs], r) // <- destructure without rest
  { if (f === undefined)        // <- base case: no f
      return [ { ...r, ...t } ]
    else
      return Object
        .entries(t)
        .flatMap(([ k, v ]) => loop(v, fs, { ...r, ...f(k) })) // <- f
  }

  return loop(t, likeList(f), {}) // <- likeList
}

Or possibly -

const likeList = (t = [], c = 0) =>
  ({ [Symbol.iterator]: _ => [ t[c], likeList(t, c + 1) ].values() })

hold onto performance

I'm a big advocate for functional style because it unlocks our ability to think about problems in a fundamentally different way. JavaScript is very friendly to functional programmers, but it comes with caveats. Use of certain features in particular ways can slow our programs down, sometimes causing us to think functional style itself is to blame.

It's my own personal hobby to explore new ways to express functional style programs that don't take big performance hits. Above likeList offers a solution. Below we will put it to the test as we compare four (4) programs that copy an array. Each program is identical except for the way it iterates through the input array.

Here is copy by destructuring with rest argument. An elegant form enabled by JavaScript's native destructuring syntax. However it is costly, as we will see later -

const copyDestructure = (arr) =>
  loop
    ( ( [ x, ...xs ] = arr    // <- rest argument
      , r = []
      ) =>
        x === undefined
          ? r
          : recur(xs, push(r, x))
    )

Here is copy using a numeric index. This trades destructuring syntax for a cheap index. But now the programmer is burdened thinking about array boundaries, intermediate state, and off-by-one errors -

const copyIndex = (arr) =>
  loop
    ( ( i = 0        // <- index
      , r = []
      ) =>
        i >= arr.length      // <- off-by-one?
          ? r
          : recur(i + 1, push(r, arr[i]))  // <- increment i
    )

Here is copy using likeList. This uses destructuring syntax but without an expensive rest argument. We removed all of the negative concerns of using an index, but can we maintain a good performance? -

const copyLikeList = (arr) =>
  loop
    ( ( [ x, xs ] = likeList(arr) // <- likeList
      , r = []
      ) =>
        x === undefined
          ? r
          : recur(xs, push(r, x)) // <- plainly use x and xs
    )

And copy by listList, using the alternative implementation -

const copyLikeList2 = (arr) =>
  loop
    ( ( [ x, xs ] = likeList2(arr)  // <- implementation 2
      , r = []
      ) =>
        x === undefined
          ? r
          : recur(xs, push(r, x))   // <- same
    )

Run time in milliseconds, lower is better -

Array size         100    1,000    10,000     100,000
-----------------------------------------------------
copyDestructure   3.30    19.23     482.3    97,233.5
copyIndex         0.47     5.92      20.9       165.1 <-
copyLikeList      1.18     9.31      55.6       479.2
copyLikeList2     0.79     7.90      33.6       172.4 <-

Memory used in KB, lower is better -

Array size                1,000               100,000
-----------------------------------------------------
copyDestructure          613.43             38,790.34
copyIndex                247.60              4,133.72 <-
copyLikeList             960.44             26,885.91
copyLikeList2            233.63              2,941.98 <-

Implementation -

// Arr.js

const likeList = (t = [], c = 0) =>
  ({ [Symbol.iterator]: function* () { yield t[c]; yield likeList(t, c + 1) } })

const likeList2 = (t = [], c = 0) =>
  ({ [Symbol.iterator]: _ => [ t[c], likeList2(t, c + 1) ].values() })

const push = (t = [], x) =>
  ( t.push(x)
  , t
  )

const range = (start = 0, end = 0) =>
  Array.from(Array(end - start), (_, n) => n + start)

export { likeList, likeList2, push, range }

// TailRec.js

function loop (f, ...init)
{ let r = f(...init)
  while (r && r.recur === recur)
    r = f(...r)
  return r
}

const recur = (...v) =>
  ({ recur, [Symbol.iterator]: _ => v.values() })

export { loop, recur }

remarks

copyLikeList2 above using the second implementation of likeList is really onto something. The performance characteristics are comparable to using an index, even on large inputs. copyDestructure is substantially slower even on arrays as small as 1,000 elements.

That is gorgeous! A possible tweak would be to make a Symbol `SKIP` as a property of the `collapse` function, to be used unambiguously where you use `'_'`. — Scott Sauyet, Sep 30 '20 at 19:36
I especially like the consistent handling of arrays and other objects, something I've picked up from other of your answers, but which I still forget to do at times. — Scott Sauyet, Sep 30 '20 at 19:41
I wonder in the alternative implementation why you chose to use an array index rather than recurring on `[f, ...fs]` parameter. My version might [look like this](https://gist.github.com/CrossEye/997085a700538a5c682265f3b2f406a6). — Scott Sauyet, Oct 05 '20 at 16:22
This, by the way, seems like my new hammer; everything looks like a nail! When I went to create my own version of the original API, I had a [slightly different take](https://gist.github.com/CrossEye/9dad62aafc46feedb328be91400de824) than yours. I think the only functional difference is in handling array index values; I force them back to numbers. But it's interesting to see the variations. — Scott Sauyet, Oct 05 '20 at 16:25
Late reply, I only used an index because it seemed a bit wasteful to pick apart what the rest argument just put together, ha! I have been working on some other ways to work with arrays that are both functional style and not wasteful. I’ll share if it’s ever good enough :D — Mulan, Oct 17 '20 at 23:33
@ScottSauyet, I updated the post with an example of what I mean. I'm curious what other ways we could design this. — Mulan, Oct 18 '20 at 00:04
`likeList` is a great idea. I can think of all sorts of times I might want to use it. I'm skeptical about it adding much performance boost. I would expect `{ ...r, ...f(k) }` to dominate any memory and time here. But the idea of treating an array as a lazy pair-based list is quite nice. — Scott Sauyet, Oct 19 '20 at 12:43
I was curious to see the results for myself so I assembled a quick comparison. For this particular problem, I agree that `likeList` doesn't offer a great benefit, but for general handling of lists, the space-time improvements are not insignificant. As for `{ ...r, ...f(k) }` this is a different syntactic form and I think the only way to improve it would be to swap out Object as the base data structure. I made an update to the post, pretty much just for you :D — Mulan, Oct 24 '20 at 19:04
This is impressive. I'm going to have to spend some time soon with `listLike2`. It feels to me more elegant than the first version, *and* it's more efficient. — Scott Sauyet, Oct 26 '20 at 12:45
Thanks for taking the time to read. I'm also working on an optimisation for `[ ...a, ...b, ..., ...z ]`. We'll see how that goes. Still scratching my head on `{ ...a, ...b, ..., ...z }` though. `Symbol.iterator` does not apply and it seems to only be a syntactic sugar around `Object.assign`. Built-in behaviours for native Object are limited :S — Mulan, Oct 26 '20 at 16:17
@ScottSauyet I haven't finished it and I've been pulled aside for other things as of late :( The basic idea behind it is `t = spread(spread(1), 2, spread(), [3,4], spread(5, [6]))` and `toArray(t)` yields `[1,2,3,4,5,6]`. The optimisation is `spread` essentially constructs an AST, rather than copying entire arrays. When `toArray` is called, a single array is created and the flattened AST is copied into it _once_. The same can be done for `toObject`. — Mulan, Nov 10 '20 at 20:06

vincent · Answer 4 · 2021-01-17T05:48:19.063

Big fan of using libraries when it improves maintainability and readability. Here is a solution using object-scan. We use it for most of our data processing related tasks. Powerful once you wrap your head around how to use it.

// const objectScan = require('object-scan');

const extract = (title, data) => objectScan(['*.*[*].*'], {
  filterFn: ({ key, value, context }) => {
    if (value.title === title) {
      const [group, day, _, event] = key;
      context.push({ group, day, event, ...value });
    }
  }
})(data, []);

const shows = { 1: { 29: [{ 0: { id: '0001', title: 'title a' } }, { 1: { id: '0002', title: 'title b' } }], 30: [{ 0: { id: '0003', title: 'title c' } }, { 1: { id: '0004', title: 'title a' } }] }, 6: { 29: [{ 0: { id: '0005', title: 'title d' } }, { 1: { id: '0006', title: 'title b' } }], 30: [{ 0: { id: '0007', title: 'title a' } }, { 1: { id: '0008', title: 'title c' } }] } };

console.log(extract('title a', shows));
/* =>
  [ { group: '6', day: '30', event: '0', id: '0007', title: 'title a' },
    { group: '1', day: '30', event: '1', id: '0004', title: 'title a' },
    { group: '1', day: '29', event: '0', id: '0001', title: 'title a' } ]
*/

.as-console-wrapper {max-height: 100% !important; top: 0}

<script src="https://bundle.run/object-scan@13.8.0"></script>

Disclaimer: I'm the author of object-scan

How to loop trough a large object that has child objects and arrays?

4 Answers4

Linked