0

I am working on a side project where I am comparing two different databases and want to find the common elements of the data sets based on the "id" field. I want to know if there is an optimal solution instead of using two nested for loops. Is there a way to do it with a hash map? Many Thanks! Below is the sample code I am working with.

UPDATE all ids are unique with no possibility of there being a duplicate

// data set 1
const set1 = [
  {
    id: "001",
    name: "bob",
    age: "50",
    location: "texas"
  },
    {
    id: "002",
    name: "bill",
    age: "51",
    location: "texas"
  },
    {
    id: "003",
    name: "ben",
    age: "52",
    location: "texas"
  },
    {
    id: "004",
    name: "cam",
    age: "53",
    location: "texas"
  },
    {
    id: "005",
    name: "max",
    age: "54",
    location: "texas"
  }
]

// data set 2
const set2 = [
  {
    id: "001",
    name: "bob"
  },
  {
    id: "002",
    name: "bill"
  }
]

// I want to create a function where I find the the common elements of the two lists based on id and put the common element of data set 1 into a list and return that list

const findUnion(set1, set2) {
  // logic here, I know I can do a nested for loop but is there a more efficient way such as 
  // using a hashmap? ( Map() object? ) 
}

// desired output 
const output = [
  {
    id: "001",
    name: "bob",
    age: "50",
    location: "texas"
  },
    {
    id: "002",
    name: "bill",
    age: "51",
    location: "texas"
  }
]
  • Related: https://stackoverflow.com/questions/11426929/can-we-use-join-for-two-different-database-tables – nice_dev Jul 17 '22 at 19:20

3 Answers3

1

You can use Sets for efficient lookup:

const ids1 = new Set(set1.map(({id}) => id));
const ids2 = new Set(set2.map(({id}) => id));
const output = set1.filter(({id}) => ids1.has(id) && ids2.has(id));
console.log(output);
AKX
  • 152,115
  • 15
  • 115
  • 172
0

First we combine into one long array. Then group by id using reduce method. Each group contains the item and count of appearances. Finally, for each of the groups, return only those with count of appearances > 1.

Edit: fixed algorithm see code.

Edit 2: Made it more generic so order of items won't matter. This is by extending the duplicates rather then replacing them.

function findUnion(set1, set2) {

  // first remove duplicates from each set
  // bonus: collect duplicates
  var duplicates;

  function dedup(set) {
    duplicates = []
    return Object.values(set.reduce(function(agg, item) {
      var merged = item;
      if (agg[item.id]) {
        merged = { ...agg[item.id],
          ...item
        }
        duplicates.push(merged)
      }
      agg[item.id] = merged;
      return agg
    }, {}));
  }

  set1 = dedup(set1);
  set2 = dedup(set2);

  // then combine
  var combined = [...set1, ...set2]

  // then remove duplicates again, this time keep them
  dedup(combined)
  return duplicates;
}

// data set 1
const set1 = [{
    id: "001",
    name: "bob",
    age: "50",
    location: "texas"
  },
  {
    id: "002",
    name: "bill",
    age: "51",
    location: "texas"
  },
  {
    id: "003",
    name: "ben",
    age: "52",
    location: "texas"
  },
  {
    id: "004",
    name: "cam",
    age: "53",
    location: "texas"
  },
  {
    id: "005",
    name: "max",
    age: "54",
    location: "texas"
  },
  {
    id: "005",
    name: "max",
    age: "54",
    location: "texas"
  },
  {
    id: "005",
    name: "max",
    age: "54",
    location: "texas"
  }
]

// data set 2
const set2 = [{
    id: "001",
    name: "bob"
  },
  {
    id: "002",
    name: "bill"
  }
]


// desired output 
const output = [{
    id: "001",
    name: "bob",
    age: "50",
    location: "texas"
  },
  {
    id: "002",
    name: "bill",
    age: "51",
    location: "texas"
  }
]

console.log(findUnion(set1, set2))
IT goldman
  • 14,885
  • 2
  • 14
  • 28
  • 1
    Yes I overlooked the possibility of duplicates in each list. Fix was to *remove* duplicates first. Then I refactored to re-use that code to *find* duplicates. – IT goldman Jul 17 '22 at 21:49
  • Note the difference between the requested output and yours. Think it would be fixed if you set `combined` to `[...set2, ...set1]`. – Scott Sauyet Jul 18 '22 at 18:58
0

First of all, you're looking for the intersection, not the union.

As others have said, we can use a Set to track uniqueness. This gives us near O(1) lookup time, and allows us algorithm that runs in something like O(m + n) time where m and n are the sizes of your sets:

const intersection = (s1, s2, ids = new Set (s2 .map (x => x .id))) =>
  s1 .filter (({id}) => ids .has (id))

const set1 = [{id: "001", name: "bob", age: "50", location: "texas"}, {id: "002", name: "bill", age: "51", location: "texas"}, {id: "003", name: "ben", age: "52", location: "texas"}, {id: "004", name: "cam", age: "53", location: "texas"}, {id: "005", name: "max", age: "54", location: "texas"}]
const set2 = [{id: "001", name: "bob"}, {id: "002", name: "bill"}]

console .log (intersection (set1, set2))
.as-console-wrapper {max-height: 100% !important; top: 0}
Scott Sauyet
  • 49,207
  • 4
  • 49
  • 103