
I know I can implement a solution like the one in "Remove duplicate objects from an array using javascript", which concatenates each item's fields into a string and compares the strings.

However, my object (or array) representing a network flow contains 4 or more items (i.e., Source IP, Destination IP, Source Port, Destination Port), and they may appear in swapped positions. Plain concatenation does not help here, since I would have to build 4 permutation strings per flow and compare them all. So I'm trying to understand whether a more efficient solution exists.

Suppose I have the following 4 objects in JavaScript:

1. { srcip: "192.168.1.10", dstip: "192.168.1.20", srcport: 5000, dstport: 443 }
2. { srcip: "192.168.1.20", dstip: "192.168.1.10", srcport: 443, dstport: 5000 }
3. { srcip: "192.168.1.10", dstip: "192.168.1.20", srcport: 5000, dstport: 80 }
4. { srcip: "192.168.1.30", dstip: "192.168.1.20", srcport: 5000, dstport: 443 }

Only objects 1 and 2 are duplicates; in other words, two objects are duplicates when all their elements are identical, even if source and destination are swapped (Source IP with Destination IP and Source Port with Destination Port). The same data could just as well be stored in arrays:

1. ["192.168.1.10", "192.168.1.20", 5000, 443]
2. ["192.168.1.20", "192.168.1.10", 443, 5000]
3. ["192.168.1.10", "192.168.1.20", 5000, 80]
4. ["192.168.1.30", "192.168.1.20", 5000, 443]

Do you have any idea how to solve this issue?

UPDATE

Reading your comments and solutions, I just want to add a clarification. An object must be equal to another if the two "IP/port" pairs are identical, even if they are switched. So, as described above, flows 1 and 2 should be equal, but the following flow is different:

{ srcip: "192.168.1.20", dstip: "192.168.1.10", srcport: 5000, dstport: 443 }

since only its IPs are switched (but not ports) with respect to flow 1.
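
The equality rule described in this update can be written down as a direct pairwise check. This is only a sketch of the rule as stated here, not an efficient de-duplication method; `sameFlow` and the variable names are hypothetical and introduced purely for illustration:

```javascript
// Hypothetical helper: true when two flows have the same two IP/port
// endpoints, either in the same direction or with both endpoints swapped.
function sameFlow(a, b) {
  const sameDirection =
    a.srcip === b.srcip && a.srcport === b.srcport &&
    a.dstip === b.dstip && a.dstport === b.dstport;
  const reversed =
    a.srcip === b.dstip && a.srcport === b.dstport &&
    a.dstip === b.srcip && a.dstport === b.srcport;
  return sameDirection || reversed;
}

const flow1 = { srcip: "192.168.1.10", dstip: "192.168.1.20", srcport: 5000, dstport: 443 };
const flow2 = { srcip: "192.168.1.20", dstip: "192.168.1.10", srcport: 443, dstport: 5000 };
// Only the IPs are swapped with respect to flow1, the ports are not.
const ipsOnlySwapped = { srcip: "192.168.1.20", dstip: "192.168.1.10", srcport: 5000, dstport: 443 };

console.log(sameFlow(flow1, flow2));          // true
console.log(sameFlow(flow1, ipsOnlySwapped)); // false
```

Comparing every pair of flows this way is quadratic in the number of flows, which is why a per-flow key that can be looked up in a set or map is the more efficient route.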

redcrow
  • Four permutation? If ip remains same but you switch ports it's still duplicate? Ex. `{ srcip: 192.168.1.10, dstip: 192.168.1.20, srcport: 443, dstport: 5000 }` is also duplicate of 1 and 2? – barbsan Aug 03 '18 at 12:05
  • Object comparison is always delicate in JS. See also [this thread](https://stackoverflow.com/questions/201183/how-to-determine-equality-for-two-javascript-objects). It looks like Lodash has a miracle `_.isEqual()` function. – Jeremy Thille Aug 03 '18 at 12:07
  • So then you need to do something more complicated if two different keys can make it a dupe. You could put the lowest number first, etc. – epascarello Aug 03 '18 at 12:08
  • @barbsan: I have added a clarification in my post. @JeremyThille: excellent library and function, I didn't know that. However, if I use that function with flows 1 and 2, it will output `false`, but in my case they are duplicates. – redcrow Aug 06 '18 at 08:34

2 Answers


First create strings like "{ip}:{port}" (any separator other than : works too), then sort them and join them to get a single string per flow:

var arr = [
  { srcip: "192.168.1.10", dstip: "192.168.1.20", srcport: 5000, dstport: 443 },
  { srcip: "192.168.1.20", dstip: "192.168.1.10", srcport: 443, dstport: 5000 },
  { srcip: "192.168.1.10", dstip: "192.168.1.20", srcport: 5000, dstport: 80 },
  { srcip: "192.168.1.30", dstip: "192.168.1.20", srcport: 5000, dstport: 443 }
];

// Build one "ip:port" string per endpoint, sort the two strings, and join
// them, so both directions of a flow produce the same key.
var arrForRemovingDupes = arr.map(el =>
  [el.srcip + ":" + el.srcport, el.dstip + ":" + el.dstport].sort().join()
);

console.log(arrForRemovingDupes);
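
The keys built this way can then be used to drop the duplicates themselves. A minimal sketch, assuming the first flow seen for each key is the one to keep; `flowKey`, `flows`, `firstByKey`, and `uniqueFlows` are names introduced here for illustration:

```javascript
// Hypothetical helper wrapping the same "ip:port, sort, join" key expression.
const flowKey = el =>
  [el.srcip + ":" + el.srcport, el.dstip + ":" + el.dstport].sort().join();

const flows = [
  { srcip: "192.168.1.10", dstip: "192.168.1.20", srcport: 5000, dstport: 443 },
  { srcip: "192.168.1.20", dstip: "192.168.1.10", srcport: 443, dstport: 5000 },
  { srcip: "192.168.1.10", dstip: "192.168.1.20", srcport: 5000, dstport: 80 },
  { srcip: "192.168.1.30", dstip: "192.168.1.20", srcport: 5000, dstport: 443 }
];

// A Map remembers only the first flow seen for each key.
const firstByKey = new Map();
for (const flow of flows) {
  const key = flowKey(flow);
  if (!firstByKey.has(key)) {
    firstByKey.set(key, flow);
  }
}

const uniqueFlows = [...firstByKey.values()];
console.log(uniqueFlows.length); // 3
```

Each flow is keyed and looked up once, so the pass is linear in the number of flows rather than comparing every pair.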
barbsan
  • Thanks, your solution could be a good method. Let's wait for a while to see if other solutions exist... otherwise I will accept yours! :) – redcrow Aug 06 '18 at 13:53

With the above data stored in arrays, you can sort each array and join it to create a string key. With a list of keys you can find duplicates easily:

const data = [
  ['192.168.1.10', '192.168.1.20', 5000, 443],
  ['192.168.1.20', '192.168.1.10', 443, 5000],
  ['192.168.1.10', '192.168.1.20', 5000, 80],
  ['192.168.1.30', '192.168.1.20', 5000, 443],
]

// Copy each array before sorting: Array.prototype.sort() sorts in place.
const keys = data.map(item => [...item].sort().join());

Output:

[
  "192.168.1.10,192.168.1.20,443,5000",
  "192.168.1.10,192.168.1.20,443,5000", // equal to the previous one
  "192.168.1.10,192.168.1.20,5000,80",
  "192.168.1.20,192.168.1.30,443,5000"
]

If you want to group the original arrays by key, try:

const grouped = data.reduce((acc, item) => {
  // Sort a copy so the original array keeps its field order.
  const key = [...item].sort().join();
  acc[key] = (acc[key] || []).concat([item]);
  return acc;
}, {});
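
Note that sorting all four values together also treats a flow whose IPs are swapped but whose ports are not as a duplicate. A sketch of a stricter key that pairs each IP with its port before sorting, assuming every array has the layout [srcip, dstip, srcport, dstport]; `endpointKey`, `flowRows`, and `groupedByEndpoints` are hypothetical names:

```javascript
const flowRows = [
  ['192.168.1.10', '192.168.1.20', 5000, 443],
  ['192.168.1.20', '192.168.1.10', 443, 5000], // both endpoints swapped
  ['192.168.1.20', '192.168.1.10', 5000, 443], // IPs swapped, ports not
];

// Hypothetical helper: sort the two "ip:port" endpoints rather than the
// four raw values, so an IP stays attached to its own port.
const endpointKey = ([srcip, dstip, srcport, dstport]) =>
  [`${srcip}:${srcport}`, `${dstip}:${dstport}`].sort().join('|');

// Group the untouched original arrays under their endpoint key.
const groupedByEndpoints = {};
for (const row of flowRows) {
  const key = endpointKey(row);
  (groupedByEndpoints[key] = groupedByEndpoints[key] || []).push(row);
}

console.log(Object.keys(groupedByEndpoints).length); // 2
```

The first two rows land in the same group, while the third gets a group of its own.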
hsz
  • Thanks for your solution @hsz, but this flow could occur as well (sorry that I didn't include it in my original example): `['192.168.1.20', '192.168.1.10', 5000, 443]`. This flow must not be treated as equal to flows 1 and 2. – redcrow Aug 06 '18 at 08:03