1

I am dealing with a couple of arrays of objects which will get rendered into the UI using React. Here's the context of what I am doing: I get different datasets from different APIs, and these datasets are arrays of arrays of objects, e.g.

[
  [{age: 23, name: 'john'}],
  [{age: 24, name: 'jane'}],
  [{age: 25, name: 'sam'}],
  [{age: 26, name: 'smith'}],
]

With this format, it will be hard to render the result in the UI, so a simple fix is to flatten the array, which returns a single array of objects that I can then map through to display the data.
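
To illustrate, a minimal sketch of the render step I have in mind (the row text is just an example):

const rows = data
  .flat()                                        // one flat array of objects
  .map(p => `${p.name} is ${p.age} years old`)   // ready to display

Now, I have this kind of operation for each of the datasets returned from the API: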

  • First, I flatten each array of arrays, using array.flat()
  • Second, I filter duplicated items using array.filter()
  • Third, I sort these items using array.sort()
  • Fourth, I merge the arrays together using array.concat()
  • Fifth, I map through the final array produced from the four previous steps (a sketch of the whole chain follows below).
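
Put together, a sketch of those five steps chained imperatively (dataset1 and cleanedDataset2 stand in for API results, and "same name" is just an example duplicate rule):

const cleaned = dataset1
  .flat()                                        // 1. flatten
  .filter((o, i, a) =>                           // 2. drop duplicates
    a.findIndex(x => x.name === o.name) === i)
  .sort((a, b) => a.age - b.age)                 // 3. sort
  .concat(cleanedDataset2)                       // 4. merge with another cleaned set
// 5. finally, map over `cleaned` when rendering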

Following the style I listed above seems imperative to me; I wanted a more functional approach where I can pipe functions together, making the code more declarative rather than imperative. Currently, I am stuck on writing a function that accepts arrays as inputs and flattens them into a single final array. I know I can flatten an array using the reduce() method, and I came across a solution on StackOverflow (linked as Flatten Array) which looks like this:

const flatten = (arr) => {
    return arr.reduce((flat, toFlatten) => {
        return flat.concat(
            Array.isArray(toFlatten) ? flatten(toFlatten) : toFlatten
        )
    }, [])
}
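
For context, the declarative shape I'm aiming for is roughly this (a sketch; `pipe` is a helper I'd still have to write, and `dedupe`/`sortByAge` are hypothetical step functions):

const pipe = (...fns) => x =>
  fns.reduce((acc, fn) => fn(acc), x)

const process = pipe(
  flatten,    // the function above
  dedupe,     // e.g. remove objects sharing a `name`
  sortByAge,  // e.g. arr => [...arr].sort((a, b) => a.age - b.age)
)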

I was thinking maybe I could spread the input arr parameter, but that seems to throw this error:

RangeError: Maximum call stack size exceeded

I would appreciate being pointed in the right direction, and also being shown an efficient way to achieve all the operations I listed. Thank you

thatguy
  • "With this format, it will be hard to render the result in the UI" why? That looks incredibly easy to use in any templating context? Use `data.flat()` and then map the content: ```data.flat().map(e => `

    ${e.name} is ${e.age} years old

    `);``` etc? And if you need sorting, remember to pass a sorting function to `sort()` so that you can sort on name or age or any other object property you need.
    – Mike 'Pomax' Kamermans Apr 16 '21 at 19:02
  • i would like to help you achieve the other goals you mentioned but your post needs a bit of clarification. what constitutes a duplicate? object with the same `name` property? could you specify a concrete array of inputs and the expected output? – Mulan Apr 17 '21 at 01:24

6 Answers

2

This solution should do what you're looking for. It would help to have a few samples of different return data to merge together, but I tried to reproduce what you explained in my solution below.

First I combine all the different source arrays into one single array and then flatten that to any depth any of the source arrays have. In case this depth is variable, I used flat(Infinity) but you could use a finite depth as well (flat(2) or flat(3)). Once I've flattened the array, I briefly convert it to a Set to remove any duplicates and then convert it back to an array.

const source1 = [
  [{age: 23, name: 'john'}],
  [{age: 24, name: 'jane'}],
  [{age: 25, name: 'sam'}],
  [{age: 26, name: 'smith'}]
];

const source2 = [
  [{age: 27, name: 'mike'}],
  [{age: 28, name: 'joanne'}],
  [{age: 29, name: 'lois'}],
  [{age: 30, name: 'paul'}]
];

const allSources = [...new Set([source1,source2].flat(Infinity))];

console.log(allSources);

Important note: because arrays and objects are reference types, even if two source arrays contain objects with identical key-value pairs, those objects won't be seen as duplicates. To truly filter out duplicate objects, you'll need to loop through the keys of both objects and compare their associated values, like this:

const source1 = [
  [{age: 23, name: 'john'}],
  [{age: 24, name: 'jane'}],
  [{age: 25, name: 'sam'}],
  [{age: 26, name: 'smith'}]
];

const source2 = [
  [{age: 23, name: 'john'}],
  [{age: 24, name: 'jane'}],
  [{age: 25, name: 'sam'}],
  [{age: 26, name: 'smith'}],
  [{age: 27, name: 'mike'}],
  [{age: 28, name: 'joanne'}],
  [{age: 29, name: 'lois'}],
  [{age: 30, name: 'paul'}]
];

const allSources = [source1, source2]
  .flat(Infinity)
  .filter((obj, _, a) =>
    a.find(o =>
      Object.keys(obj).length === Object.keys(o).length
        && Object.keys(obj).every(objKey =>
          Object.keys(o).includes(objKey) && obj[objKey] === o[objKey])) === obj);

console.log(allSources);
Brandon McConnell
  • or just don't specify the depth, `[source1,source2].flat()`, which is the same as `source1.concat(source2)`. or keep style consistent with your array spread, `[...new Set([...source1, ...source2])]`. as for the second half of your answer, linear `every` inside linear `find` inside linear `filter` is very inefficient. if the input is large or the ui needs to redraw often, this could be a problem. see ori's answer to dramatically reduce the computational complexity. – Mulan Apr 17 '21 at 01:21
  • 1
    Sorry to hear you weren’t a fan of my solution, @ThankYou. It performs very well, even when benchmarked against the other solutions to this question. In fact, the other solution you commented on used very similar practices to my own here. However, the logic that solution uses to determines duplicate entries differs from my mine and actually achieves a different purpose. Thanks for the advise though. Happy coding! – Brandon McConnell Apr 17 '21 at 01:32
  • 1
    @ThankYou I just admit Ori’s solution is much more concise. As soon as I’m back at my computer tonight, I’ll give it a more thorough look. Thanks again. – Brandon McConnell Apr 17 '21 at 02:18
  • thanks for taking the feedback constructively. i wouldn't go so far to say "i'm not a fan," i only wanted to point out that even "correct" programs like yours can have hidden issues that rear their heads under different circumstances. All the best :D – Mulan Apr 17 '21 at 03:04
1

You can use Array.flat() to flatten nested arrays. The default depth is 1, but you can pass Infinity (Array.flat(Infinity)) to flatten any number of levels.

const arr = [[{"age":23,"name":"john"}],[{"age":24,"name":"jane"}],[{"age":25,"name":"sam"}],[{"age":26,"name":"smith"}]]

const result = arr.flat()

console.log(result)

To remove the duplicates, you can reduce the flattened array into a Map keyed by name, and then convert it back to an array using the Map.values() iterator and Array.from():

const arr = [[{"age":23,"name":"john"}],[{"age":24,"name":"jane"}],[{"age":25,"name":"sam"}],[{"age":26,"name":"smith"}], [{"age":33,"name":"john"}]]

const result = Array.from(
  arr.flat()
    .reduce((acc, o) => 
      acc.has(o.name)
        ? acc
        : acc.set(o.name, o)
      , new Map())
    .values()
)

console.log(result)
Ori Drori
1

This is a non-trivial problem and an opportunity to learn about designing data structures. This is a long-format answer that hopes to give you the tools to make these analyses on your own. This post uses Brandon's well-intentioned answer as the basis of contrast. First we need to examine the properties of some of JavaScript's built-in structures.

performance of array

We will start with Array. If we have an array, input, and a known index, i, we can fetch the element in constant time, input[i]. This means neither the size of input nor i affects how long it takes to get the answer. This is denoted in Big O notation as O(1). It holds because an array is effectively a pointer to a position in memory, and i is an offset: finding the element is a constant computation of position + i, which is always a simple addition.

const input =
  [ "foo", "bar", "cat", "dog", "hat", "ham" ]

const i =
  4

const result =
  input[i]

console.log(result) // "hat"

If however the index is unknown and we wish to find an element by its value, we must scan the array one element at a time. This is known as linear complexity and denoted O(n), because the amount of time/space scales linearly with the size of the input, n. With an array of n = 3 elements, we only have to check up to 3 elements to find a match. With an array of n = 1000, we might have to check up to 1,000!

const input =
  [ "foo", "bar", "cat", "dog", "hat", "ham" ]

const query =
  "hat"

const result =
  input.findIndex(v => v == query)

console.log(result) // 4

Almost every Array operation scales with the size of the input array. These linear operations include entries, every, fill, filter, find, findIndex, flat, flatMap, for..of, forEach, from, includes, indexOf, join, keys, lastIndexOf, map, reduce, reduceRight, reverse, slice, some, toString, and values. An important-to-know property of linear operations is that when they are nested, they become quadratic, or O(n²).

An easy way to see this is two nested for loops, O(n) * O(n) = O(n*n). If n = 1000, this could take 1,000,000 computations. This would also apply to other linear operations like includes nested inside filter -

const input =
  [ "foo", "bar", "cat", "dog", "hat", "ham" ]

const result =
  input.filter(v => v.includes("a")) // <- nested linear operations

console.log(result)
// [ "bar", "cat", "hat", "ham" ]

performance of object

Next we will look at JavaScript's Objects. Like Arrays, if we have an object, input, and a known key, query, we can retrieve a value in constant time, O(1) -

const input =
  { a: 1, b: 2, c: 3 }

const query =
  "b"

const result =
  input[query]

console.log(result) // 2

Now say we have two objects, foo and bar, and wish to determine whether they are equal. Our function isEqual combines Object.keys, O(n), with every, O(n), and nests an includes check, O(n), with a property lookup, O(1). This is O(n + n * (n + 1)), which simplifies to O(n²). The includes check here is unnecessary, but we follow Brandon's example for comparison -

const foo =
  { a: 1, b: 3 }

const bar =
  { b: 3, a: 1 }

const isEqual = (x,y) =>
  Object.keys(x).every(k => Object.keys(y).includes(k) && y[k] === x[k])

console.log(isEqual(foo, bar)) // true

compound performance

For our final example, let's take an array of objects, input, and a query object, and try to find the query's index. We already determined isEqual is quadratic, O(n²). When we nest this operation inside linear findIndex, we end up with O(n) * O(n²) = O(n³). If n = 1000 this would take 1,000,000,000 computations! As you can see, nested linear operations can scale in computational complexity very quickly -

const input =
  [ { a: 1, b: 1 }, { a: 1, b: 2 }, { a: 1, b: 3 } ]

const query =
  { b: 2, a: 1 }

function isEqual(x, y) {
  return Object.keys(x).every(k => Object.keys(y).includes(k) && y[k] === x[k])
}

const result =
  input.findIndex(v => isEqual(v, query))

console.log(result) // 1

Now let's analyze Brandon's answer. I don't want anyone to think I'm picking on him; his program is valid and returns the correct result. However, as the input size grows, it can grind your CPU to a halt. If this program were used in a loop or ran on every UI refresh, it could be a problem.

const allSources =
  [source1,source2]
      .flat(Infinity)
      .filter((obj,_,a) =>
         a.find(o =>
           Object.keys(obj).length === Object.keys(o).length
             && Object.keys(obj).every(objKey =>
               Object.keys(o).includes(objKey)
                 && obj[objKey] === o[objKey])) === obj)
complexity   operations
O(1)         .length, ===, &&
O(n)         flat, filter, keys, every, includes

computational complexity

O(n + n * (n * (n + 1 + 1 + n + 1 + n + n * (n + n + 1 + 1 + 1 + 1))))
= O(n + n * (n * (n + 1 + 1 + n + 1 + n + n * (n))))
= O(n + n * (n * (n + 1 + 1 + n + 1 + n + n²)))
= O(n + n * (n³))
= O(n⁴)

my first data structure

Above, the filter(...) algorithm has O(n⁴) complexity. For an input size of n = 1000, that is an estimated 1,000,000,000,000 (one trillion) operations! If we want a faster solution, we cannot rely on Array's linear operations, but JavaScript does not include a data type that does exactly what we need. Maybe you've never designed your own data structure before, or didn't know that was a possibility; that's okay!

We will now look at Map, which has very similar properties to Object. The primary difference is that an object's keys must be strings (or symbols), whereas Map can associate any key type with any value. And like Array and Object, we can nest Maps any number of levels deep, if desired. We will chain one get with another to look up a nested value in constant time. This is denoted as O(1) + O(1) = O(2) and simplified to O(1) -

const key1 =
  { a: "ay" }      // <- any type for key1

const key2 =
  [ "bee" ]        // <- any type for key2

const someValue =
  Symbol("ess")    // <- any value

const m =
  new Map([[ key1, new Map([[ key2, someValue ]]) ]])

const result =
  m.get(key1).get(key2)

console.log(result === someValue) // true

In addition to get, Map offers has which answers whether our particular map, m, has a particular key, k -

const key1 =
  { a: "ay" }

const m =
  new Map([[ key1, "foo" ]])

console.log(m.has(key1))  // true
console.log(m.has("bar")) // false

recmap

We can use the high-performance properties of Map to write our own Recursive Map type, recmap. Before we dive into the implementation, here's how we want it to work -

import { empty, get, set } from "./recmap.js"

const t = empty()                 // <- empty recmap

set(t, ["a", 1, "b", 2], "hello") // <- set a deep value

const result =
  get(t, ["a", 1, "b", 2])        // <- retrieve a deep value

console.log(result) // "hello"

We can visualize t above as a recursive sequence of nested Maps -

Map { "a" => Map { 1 => Map { "b" => Map { 2 => "hello" }}}}

The user does not need to understand Array, Object, or Map to use it effectively. This is known as data abstraction and it's an important property of writing effective data structures. We will build our data type so that the nested structure is automatically managed by our module's operations -

// recmap.js

const empty = () =>
  new Map                         // <- representation of "empty" recmap

const has = (t, [ k, ...ks ]) =>  // <- recursive wrapper around Map.has
  ks.length == 0
    ? t.has(k)
    : t.has(k)
      ? has(t.get(k), ks)
      : false

const set = (t, [ k, ...ks ], v) => // <- recursive wrapper around Map.set
  ks.length == 0
    ? t.set(k, v)
    : t.has(k)
      ? (set(t.get(k), ks, v), t)
      : t.set(k, set(empty(), ks, v))

const get = (t, [ k, ...ks ]) =>    // <- recursive wrapper around Map.get
  ks.length == 0
    ? t.get(k)
    : t.has(k)
      ? get(t.get(k), ks)
      : undefined

export { empty, has, set, get }     // <- public module interface

Our recmap is extremely fast, able to filter a million values in less than one second. This is great, but there's one small problem: the purpose of this question is to compare objects to one another.

my second data structure

Hang in there, we are almost done. We designed recmap around Map so that the user doesn't have to concern themselves with managing a nested structure. We will build unique around recmap so that the user needs to worry about neither nested Maps nor key sequences. Instead we can simply create an empty collection and freely add as many values as we want -

import { empty, has, add, values } from "./unique.js"

const t = empty()             // <- empty unique

add(t, { a:1, b:2, c:3 })     // <- any object

const result =
  has(t, { a:1, b:2, c:3 })   // <- has any object?

console.log(result) // true

It doesn't matter which order the input object's keys are in. We will always get the correct answer -

console.log(has(t, { b:2, c:3, a:1 })) // true

It doesn't matter if we try to add the same object with different key orderings. It will only be added once -

add(t, { b:2, c:3, a:1 })
add(t, { b:2, a:1, c:3 })
add(t, { c:3, a:1, b:2 })

console.log(Array.from(values(t)))
// [ { c:3, a:1, b:2 } ]   <- only one distinct object is added

unique

This module is much easier to write thanks to our existing recmap. It is a simple wrapper around recmap.set and recmap.has with the addition of a toKey function which takes care of sequencing the input object. This technique of building complex data structures out of simpler ones is one every functional programmer must master -

// unique.js

import * as recmap from "./recmap.js"

const add = (t, v) =>         // <- wrapper around recmap.set
  recmap.set(t, toKey(v), v)

const has = (t, v) =>         // <- wrapper around recmap.has
  recmap.has(t, toKey(v))

const toKey = v =>            // <- private helper, sequences an object
  Object
    .entries(v)
    .sort((a,b) => a[0].localeCompare(b[0]))
    .flat()

const empty =
  recmap.empty   // <- same empty representation as recmap

const values =
  recmap.values  // <- todo

export { add, empty, has, values } // <- public interface

Notice above that we did not export toKey. This is a private function used only by our module; an implementation detail the user does not need to worry about. Also note we called recmap.values into existence before defining it; we will implement it now -

// recmap.js (continued)

const empty = //...
const has = //...
const set = //...
const get = //...

function* values (t)
{ if (t instanceof Map)
    for (const v of t.values())
      yield *values(v)
  else
    yield t.unwrap  // <- assumes stored values are wrapped; see "enhancement 2" below
}

export
  { empty, has, set, get, values } // <- update export

benchmark

We've come a long way. It's now time to measure the difference between our unique data structure and the linear filter algorithm proposed in Brandon's answer. To compare the two, we will use randCoord, which generates random { x, y } objects -

// fixtures (count and size are benchmark parameters, set per run)
const rand = x =>
  Math.floor(Math.random() * x)

const randCoord = size =>
  ({ x: rand(size), y: rand(size) })

const input =
  Array.from(Array(count), _ => randCoord(size))

The unique algorithm iterates through the input and adds each element to our collection -

console.time("unique")
const a = empty()
for (const x of input)
  add(a, x)
console.timeEnd("unique")
console.log(`uniques: ${Array.from(values(a)).length}`)

The filter algorithm is the nested chain of filter -> find -> every -> includes described above -

console.time("filter")
const b = filter(input)
console.timeEnd("filter")
console.log(`uniques: ${b.length}`)
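
(`filter` isn't defined in the benchmark snippet; presumably it is Brandon's de-duplication wrapped as a function, something like this sketch -)

const filter = input =>
  input.filter((obj, _, a) =>
    a.find(o =>
      Object.keys(obj).length === Object.keys(o).length
        && Object.keys(obj).every(k =>
          Object.keys(o).includes(k) && obj[k] === o[k])) === obj)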

With a size of randCoord(10), we generate coordinates from (0,0) to (10,10) with a maximum possibility of 10 x 10 = 100 unique coordinates -

count      uniques   algorithm   runtime (ms)   each (ms/elem)
100        64        filter      2.26           0.023
1,000      100       filter      11.30          0.011
10,000     100       filter      96.73          0.010
100,000    100       filter      802.52         0.008

In this case, the linear filter algorithm never has to check more than 100 elements to determine whether one is unique. So even with a high count of 100,000 elements, duplicates can be detected somewhat easily. The unique data structure, however, detects duplicates faster -

count      uniques   algorithm   runtime (ms)   each (ms/elem)
100        64        unique      9.68           0.097
1,000      100       unique      3.88           0.004
10,000     100       unique      12.16          0.001
100,000    100       unique      106.31         0.001

Your real-world data uses { name, age } objects, and the number of unique combinations is probably much greater than 10 x 10. Next we'll see how, as the number of possible uniques increases, the performance of filter suffers more and more. With a size of randCoord(100), we generate coordinates from (0,0) to (100,100) with a maximum possibility of 100 x 100 = 10,000 unique coordinates -

count      uniques   algorithm   runtime (ms)   each (ms/elem)
100        98        filter      2.52           0.025
1,000      946       filter      48.63          0.049
10,000     6,340     filter      3,105.00       0.310
100,000    10,000    filter      67,460.00      0.675

Here filter starts to rear its head. At a modest 1,000 elements, the runtime is already 4x greater. At 10,000, we see a 32x increase. By contrast, unique handles the larger input space with effectively no increase in runtime -

count      uniques   algorithm   runtime (ms)   each (ms/elem)
100        98        unique      9.76           0.098
1,000      946       unique      4.11           0.004
10,000     6,340     unique      13.10          0.001
100,000    10,000    unique      121.55         0.001

With a size of randCoord(1000), we generate coordinates from (0,0) to (1000,1000) with a maximum possibility of 1,000 x 1,000 = 1,000,000 unique coordinates. This is similar to a { first, last } object where there could be 1,000 unique first names and 1,000 unique last names -

count      uniques   algorithm   runtime (ms)   each (ms/elem)
100        100       filter      0.39           0.004
1,000      999       filter      38.86          0.039
10,000     9,950     filter      3,710.00       0.371
100,000    95,186    filter      364,549.00     3.645

The noteworthy pattern we need to point out is that as the number of uniques increases, it takes filter longer to find each result. With a large count of 100,000 elements to check, it takes over 6 minutes to complete. Meanwhile unique computes the result with no increase in runtime as the number of uniques increases -

count      uniques   algorithm   runtime (ms)   each (ms/elem)
100        100       unique      0.12           0.001
1,000      999       unique      0.97           0.001
10,000     9,950     unique      11.04          0.001
100,000    95,186    unique      124.99         0.001

real-world example

Above we compared objects with just two fields. In a final benchmark, let's see how the two algorithms compare when dealing with objects with six fields: first, last, age, phone, gender, and email -

// fixtures
const rand = x =>
  Math.floor(Math.random() * x)

const randObj = () => ({
    first: rand(1000),
    last: rand(1000),
    age: rand(100),
    phone: rand(1e10),
    gender: rand(2),
    email: rand(1e20)
})

const input =
  Array.from(Array(count), randObj)

This is equal to 1000 x 1000 x 100 x 10,000,000,000 x 2 x 100,000,000,000,000,000,000 = 200,000,000,000,000,000,000,000,000,000,000,000,000, or 2e38, possible uniques -

count      uniques    algorithm   runtime (ms)   each (ms/elem)
100        100        filter      1.95           0.019
1,000      1,000      filter      53.47          0.053
10,000     10,000     filter      4,435.00       0.444
100,000    100,000    filter      471,622.00     4.716
1,000,000  1,000,000  filter      ?              ?

To no one's surprise, not a single duplicate is found, even among 100,000 objects. This demonstrates the worst-case scenario for both algorithms. I threw in a large benchmark of 1,000,000 elements just to stress-test the unique algorithm. While it would take filter over 12 hours to complete, unique determines the result in less than 5 seconds -

count      uniques    algorithm   runtime (ms)   each (ms/elem)
100        100        unique      11.49          0.115
1,000      1,000      unique      6.55           0.007
10,000     10,000     unique      47.77          0.005
100,000    100,000    unique      404.93         0.004
1,000,000  1,000,000  unique      4,956.00       0.005

enhancement 1

A big advantage of writing our own data structures is that we can change their behaviour and tune their performance as needed. If we add an object and then attempt a lookup using a subset of that object, we incorrectly get a true answer -

import { empty, add, has } from "./unique.js"

const t = empty()

add(t, {a: 1, b: 2})

console.log(has(t, {a: 1})) // true ❌

If the underlying representation of t is -

Map { "a" => Map { 1 => Map { "b" => Map { 2 => ... }}}}

That means has(t, {a: 1}) will find the nested map below, and return true -

{ "b" => Map { 2 => ... }}

You might think this is a problem with unique but the issue exists with recmap as well. This is a bug and we can fix it by introducing a simple checksum -

// recmap.js (continued)

const checksum = key =>
  [ key.length, ...key ]  // <- prepend sequence with length

const empty = // ...
const _get = // ... originally `get`, now private
const _has = // ... originally `has`, now private
const _set = // ... originally `set`, now private

// new wrappers: checksum the key and call private function
const get = (t, k) => _get(t, checksum(k))
const has = (t, k) => _has(t, checksum(k))
const set = (t, k, v) => _set(t, checksum(k), v)

export // ...

No changes are necessary for unique. We're back in business -

add(t, {a: 1, b: 2})
console.log(has(t, {a: 1})) // false ✅
console.log(has(t, {a: 1, b: 2})) // true ✅
console.log(has(t, {b: 2, a: 1})) // true ✅

enhancement 2

Our recmap module uses Map as the underlying representation, but the user doesn't (and shouldn't) know that. What happens if the user attempts to insert a Map of their own into the mix? Introducing a wrap utility, we can allow the user to insert any value type and prevent any implementation details from leaking -

// recmap.js (continued)

const checksum = // ...
const empty = // ...
const _get = // ... 
const _has = // ... 
const _set = // ...

const wrap = v =>       // <- wrap utility
  ({ unwrap: v })

const has = // ...
const get = (t, k) => _get(t, checksum(k))?.unwrap  // <- unwrap (optional chaining guards missing keys)
const set = (t, k, v) => _set(t, checksum(k), wrap(v)) // <- wrap

export // ...
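
With wrapping in place, even a user-supplied Map round-trips untouched. A quick sketch, assuming the fully assembled module from the demo below -

const t = empty()
const userMap = new Map([[ "x", 1 ]]) // <- a Map as a value, not our representation

set(t, ["a"], userMap)

console.log(get(t, ["a"]) === userMap) // true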

enhancement 3

We designed recmap around native Map which offers a few other useful methods. We can add entries and keys to offer a more complete implementation -

// recmap.js (continued)

// ...

function* entries (t, path = [])
{ if (t instanceof Map)
    for (const [k, v] of t.entries())
      yield *entries(v, [...path, k])
  else
    yield [path.slice(1), t.unwrap ]
}

function* keys (t, path = [])
{ if (t instanceof Map)
    for (const [k, v] of t.entries())
      yield *keys(v, [...path, k])
  else
    yield path.slice(1)
}

export { empty, has, set, get, values, entries, keys } // <- update export
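
A quick sketch of how they behave, assuming the assembled module; path.slice(1) strips the checksum length prefix from each key sequence -

const t = empty()

set(t, ["a", 1], "foo")
set(t, ["b", 2], "bar")

console.log(Array.from(keys(t)))
// [ [ "a", 1 ], [ "b", 2 ] ]

console.log(Array.from(entries(t)))
// [ [ [ "a", 1 ], "foo" ], [ [ "b", 2 ], "bar" ] ]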

demo

An answer without a functioning demo is incomplete, imo. I've inlined the recmap and unique modules because we are limited to just one "file" for demonstration. We will borrow the input examples from Brandon's post -

const source1 =
  [ {age: 23, name: 'john'}
  , {age: 24, name: 'jane'}
  , {age: 25, name: 'sam'}
  , {age: 26, name: 'smith'}
  ]

const source2 =
  [ {age: 23, name: 'john'}
  , {age: 24, name: 'jane'}
  , {age: 25, name: 'sam'}
  , {age: 26, name: 'smith'}
  , {age: 27, name: 'mike'}
  , {age: 28, name: 'joanne'}
  , {age: 29, name: 'lois'}
  , {age: 30, name: 'paul'}
  ]

const recmap = (_ => {     // recmap module
  const empty = () => new Map
  const checksum = key => [ key.length, ...key ]
  const wrap = v => ({ unwrap: v })
  const _has = (t, [ k, ...ks ]) =>  ks.length == 0 ? t.has(k) : t.has(k) ? _has(t.get(k), ks) : false
  const _set = (t, [ k, ...ks ], v) => ks.length == 0 ? t.set(k, v) : t.has(k) ? (_set(t.get(k), ks, v), t) : t.set(k, _set(empty(), ks, v))
  const _get = (t, [ k, ...ks ]) => ks.length == 0 ? t.get(k) : t.has(k) ? _get(t.get(k), ks) : undefined
  const has = (t, keys) => _has(t, checksum(keys))
  const set = (t, keys, v) => _set(t, checksum(keys), wrap(v))
  const get = (t, keys) => _get(t, checksum(keys))?.unwrap
  function* values (t) { if (t instanceof Map) for (const v of t.values()) yield *values(v); else yield t.unwrap }
  return { empty, get, has, set, values }
})()

const unique = (_ => {     // unique module
  const toKey = v => Object.entries(v).sort((a,b) => a[0].localeCompare(b[0])).flat()
  const add = (t, v) => recmap.set(t, toKey(v), v)
  const has = (t, v) => recmap.has(t, toKey(v))
  const empty = recmap.empty
  const values = recmap.values
  return { add, empty, has, values }
})()

const result = unique.empty()

for (const x of [...source1, ...source2])
  unique.add(result, x)

console.log(Array.from(unique.values(result)))

All duplicates are removed -

[ { age: 23, name: 'john' }
, { age: 24, name: 'jane' }
, { age: 25, name: 'sam' }
, { age: 26, name: 'smith' }
, { age: 27, name: 'mike' }
, { age: 28, name: 'joanne' }
, { age: 29, name: 'lois' }
, { age: 30, name: 'paul' }
]
Mulan
  • Thank you Brandon for hosting this learning opportunity. This answer currently exists as a draft. I will be adding a functioning code demo, some benchmark analysis, and two enhancements to `recmap` that make it even more useful. Stay tuned! – Mulan Apr 17 '21 at 20:52
  • Wow, thanks for this educative post, I knew this was a use-case for data structures, but my foundation is quite weak in DSA and I am willing to improve. Can you recommend a link/video or whatever resource that I could learn from extensively? Once again, I really do appreciate this more than you think! – thatguy Apr 18 '21 at 17:44
  • I'm happy to help :D I updated the post with some notes on the benchmarks as well as some improvements to the introduced data structures. I can recommend nothing more highly than [_Structure and Interpretation of Computer Programs_](https://sarabander.github.io/sicp/) by Harold Abelson and Gerald Jay Sussman. There's a series of video lectures to accompany the book on [MIT Open Courseware](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-001-structure-and-interpretation-of-computer-programs-spring-2005/video-lectures/). – Mulan Apr 19 '21 at 02:39
  • If you're interested in other posts I have on the topic, you can query my answers for ["data abstraction"](https://stackoverflow.com/search?tab=newest&q=user%3a633183%20data%20abstraction). Another query for ["tree"](https://stackoverflow.com/search?tab=newest&q=user%3a633183%20tree) contains other useful examples. If you have any questions, I'm here to help. – Mulan Apr 19 '21 at 02:45
0

As long as all your data is the same basic shape, I would concat it as soon as possible. You can use array destructuring in the reduce callback's parameters ([item] instead of item) to effectively flatten it, remove duplicates inside the reduce, and finally sort.

const data1 = [
 [{age: 23, name: "john"}],
 [{age: 24, name: "jane"}],
 [{age: 25, name: "sam"}],
 [{age: 26, name: "smith"}]
];

const data2 = [
 [{age: 23, name: "bob"}],
 [{age: 24, name: "billy"}],
 [{age: 25, name: "sam"}],
 [{age: 26, name: "brenda"}]
];

const result = data1.concat(data2)
  .reduce((acc, [item], i, arr) => {
    if (!acc[item.name]) acc[item.name] = item;
    if (i === arr.length - 1) return Object.values(acc);
    return acc;
  }, {})
  .sort((a, b) => a.name.localeCompare(b.name));
  
console.log(result);
James
0

I think your algorithm's complexity hinges on how you define uniqueness, as your data seems to have no primary keys (e.g. ids).

In my example, an element is unique by age and name.

Perhaps you might want a different technique.

Sorting and Mapping is then easy enough :)

const toHash = ({ age, name }) => `${age}_${name}`.toUpperCase();

const fn = R.pipe(
  R.concat,
  R.flatten,
  
  // Remove Duplicates
  R.indexBy(toHash),
  R.values,

  // sort
  R.sortBy(R.prop('age')),
);

const a = [
  [{age: 23, name: 'john'}],
  [{age: 24, name: 'jane'}],
  [{age: 25, name: 'sam'}],
  [{age: 26, name: 'smith'}],
  [{age: 30, name: 'paul'}],
];

const b = [
  [{age: 27, name: 'mike'}],
  [{age: 23, name: 'john'}],
  [{age: 28, name: 'joanne'}],
  [{age: 29, name: 'lois'}],
  [{age: 30, name: 'paul'}],
];

console.log(
  fn(a, b),
);
<script src="https://cdnjs.cloudflare.com/ajax/libs/ramda/0.27.1/ramda.js" integrity="sha512-3sdB9mAxNh2MIo6YkY05uY1qjkywAlDfCf5u1cSotv6k9CZUSyHVf4BJSpTYgla+YHLaHG8LUpqV7MHctlYzlw==" crossorigin="anonymous"></script>
Hitmands
0
const source1 = [
  [{age: 23, name: 'john'}],
  [{age: 24, name: 'jane'}],
  [{age: 25, name: 'sam'}],
  [{age: 26, name: 'smith'}]
];

const source2 = [
  [{age: 23, name: 'john'}],
  [{age: 24, name: 'jane'}],
  [{age: 25, name: 'sam'}],
  [{age: 26, name: 'smith'}],
  [{age: 27, name: 'mike'}],
  [{age: 28, name: 'joanne'}],
  [{age: 29, name: 'lois'}],
  [{age: 30, name: 'paul'}]
];

const source = [...source1, ...source2]
  .flat()
  .reduce((initialState, currentObject) => {
    const [newSource, hashMap] = initialState;
    const key = `${currentObject.age}_${currentObject.name}`; // dedupe on age + name, not age alone
    if (hashMap[key] === undefined) {
      hashMap[key] = true;
      newSource.push(currentObject);
    }
    return initialState;
  }, [[], {}])[0];

console.log({ source });

This runs in roughly O(N + M), where N and M are the sizes of the two sources: the spread operators copy N + M elements into one array, flat() makes a second linear pass, and reduce() makes a third, giving O(N + M) + O(N + M) + O(N + M) = 3 · O(N + M), which is still O(N + M). Also, when checking whether an element has already been seen, it is better to use a hash map (a plain object here) than an array: an object-key lookup is O(1), whereas array.includes has a worst case of O(N). I know this solution is coming late, but I hope it benefits everyone. Thanks all.
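
To make the lookup point concrete, a minimal sketch of the difference (the key format is borrowed from the snippet above) -

const seen = {}
seen['23_john'] = true

// object key lookup: O(1), regardless of how many keys exist
console.log(seen['23_john'] !== undefined) // true

// array scan: O(N) worst case, checks elements one by one
const seenList = ['23_john']
console.log(seenList.includes('23_john')) // true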