0

I am working quite a bit with Maps in javascript. I need the most computationally efficient way to find all items that are in Map a that are not present in Map b. For example,

const a = new Map();
a.set('item1', 'item1value');
a.set('item2', 'item2value');


const b = new Map();
b.set('item1', 'item1value');

The result of the function I'm looking to write would be another Map, with a single entry of key: item2, value: item2value.

I am well aware of the multitude of questions / answers / methods to do this with arrays and objects, however I have not seen such an explanation for maps. I need the absolute most efficient way to do so, as I will need to be call this function up to thousands of times quickly. Is conversion to an array and back to a Map the best way? Are there any tricks with Maps that may help?

Seth Lutske
  • 9,154
  • 5
  • 29
  • 78
  • do you need a symetrically difference? – Nina Scholz Dec 31 '21 at 15:39
  • I don't think it matters in my case, as I'm pretty sure for my given scenario, Map b can never have entries not present in Map a, but Map a can have entries not present in Map b. – Seth Lutske Dec 31 '21 at 15:41
  • Is `b` always going to have less (or equal) elements as `a`? – VLAZ Dec 31 '21 at 15:44
  • @VLAZ I would expect it to, yes – Seth Lutske Dec 31 '21 at 15:46
  • @SethLutske thanks. Second question - is map `b` always a subset of map `a`? – VLAZ Dec 31 '21 at 15:47
  • @VLAZ - yes it should be – Seth Lutske Dec 31 '21 at 15:47
  • 1
    If your values are always the same as the respective key, you shouldn't use a `Map`, you should use a `Set`! Then see https://2ality.com/2015/01/es6-set-operations.html https://stackoverflow.com/q/31930894/1048572 https://stackoverflow.com/q/1723168/1048572 – Bergi Dec 31 '21 at 15:52
  • @Bergi, %, but that's just for this simplified example. In my real world, that's not the case at all – Seth Lutske Dec 31 '21 at 15:55
  • @SethLutske If key and value are different, then please clarify what you mean by "item". Are you looking for keys, values, or entries (key-value-pairs) that are present in `a` but not in `b`? – Bergi Dec 31 '21 at 15:57
  • @Bergi I updated the question a bit to clear that up. As I stated in the question, I am looking for a new map containing the key-value pairs that are present in Map a, but not present in Map b – Seth Lutske Dec 31 '21 at 16:01
  • So if `b` was `new Map([['item1', 'differentValue']])`, you would want to get the entry `['item1', 'item1value']` of `a` in your result as well? – Bergi Dec 31 '21 at 16:06
  • Iterate over Map B and delete the corresponding items in Map A. – Redu Dec 31 '21 at 16:07
  • @Bergi, yes. But as I mentioned, I don't expect there to be any items in Map b that are not present in Map a, rather only items in Map a that are not present in Map b. So its sort of a uni-directional XOR operation I'm looking for, I think – Seth Lutske Dec 31 '21 at 16:10
  • 2
    "as I will need to be call this function up to thousands of times quickly" ... that sounds like you should rethink your overall data model. – Jonas Wilms Dec 31 '21 at 16:13
  • Loop over b and delete from a. That won't allocate any heap memory. – Jonas Wilms Dec 31 '21 at 16:16
  • @JonasWilms, well yeah. I am writing a computationally heavy program, so rethinking my overall data models and algorithms is my bread and butter right now. So this question is part of that process. Would you mind elaborating on how I can loop over b and delete from a, while still maintaining a copy of a? And how that relates to heap memory, and its use/nonuse thereof? That is the direction I am trying to take this discussion – Seth Lutske Dec 31 '21 at 16:16
  • 2
    Why do you need a copy "a thousand times"? This is a classic XY problem – Jonas Wilms Dec 31 '21 at 16:26
  • @SethLutske You might want to [ask a separate question](https://stackoverflow.com/questions/ask) where you describe the actual problem that you're trying to solve. Why do you have thousands of maps? What differences do you need to compute? – Bergi Dec 31 '21 at 16:29
  • @Bergi, you guys are on the right track, but the larger algorithm that this question belongs to is (I think) too complex for an SO question. Perhaps it is best suited for a question on the computations science SE. If you're curious, I have a matrix that evolves over time. In each timestep, I need to find the border cells of groups of non-zero values... – Seth Lutske Dec 31 '21 at 16:36
  • @JonasWilms ... This is essentially an edge detection problem, similar to [How to find border of object represent in matrix ( coordinate system)](https://stackoverflow.com/questions/18060385/how-to-find-border-of-object-represent-in-matrix-coordinate-system), but I am trying to make it more efficient by considering that for every timestep, cells that were once border cells, and are no longer border cells, do not need to be checked again. But thank you guys for asking the big questions, I wish I had more people to discuss this kind of stuff openly with – Seth Lutske Dec 31 '21 at 16:37
  • If you guys are interested, I can write up a question on the computational science SE, but frankly Its only worth it if I know people will look at it, as that SE doesn't get anywhere near as much traffic as SO. If you 2 are willing to take a look at the larger issue in question, I'll write it up and link it here – Seth Lutske Dec 31 '21 at 16:38
  • Without having the big picture: Create another matrix containing colors (border|no-border), then swap the arrays (prev|current). That way you won't allocate memory at each step and moving left-right on the matrix is a contiguous memory access (best case). Also accessing the prev marker is O(1) and not O(1) best case. – Jonas Wilms Dec 31 '21 at 16:44
  • @JonasWilms would you mind elaborating, perhaps in an answer? I'm not sure I follow – Seth Lutske Dec 31 '21 at 16:47

3 Answers3

2

No, don't convert the maps to arrays and back. Computing the difference between arrays is slow, in a map you have O(1) lookup. Just loop through the entries of a and put them into the result if you don't find an equivalent entry in b. This will have the optimal time complexity O(n) (where n is the size of the map a).

const result = new Map();
for (const [k, v] of a) {
    if (v === undefined && !b.has(k) || b.get(k) !== v) {
        result.set(k, v);
    }
}

If you know that your map doesn't contain undefined values, you can omit the v === undefined && !b.has(k) || entirely and might get some speedup. Also, notice that if your map can contain NaN values, you'll want to use Object.is instead of ===.

If you want to write it as a single fancy expression, consider a generator:

const result = new Map(function*() {
    for (const e of a) {
        const [k, v] = e;
        if (v === undefined && !b.has(k) || b.get(k) !== v) {
            yield e;
        }
    }
}());
Bergi
  • 630,263
  • 148
  • 957
  • 1,375
  • Thank you for this, this is the type of answer I was looking for. Some commenters mentioned looping over a and deleting entries that are also found in b. I didn't like that because I need to maintain a. This is more in line with what I need, as its additive rather than subtractive. Would you say there's a performance difference between an additive and subtractive approach? – Seth Lutske Dec 31 '21 at 16:23
  • Well mutating the input doesn't need to allocate new memory for the result so can be faster, but if you need an immutable operation then you have no choice. – Bergi Dec 31 '21 at 16:25
  • You can improve by avoiding the double lookup of both `b.has()` and `b.get()` for most items. – jfriend00 Dec 31 '21 at 16:52
  • @jfriend00 That's what my second paragraph says. Or do you mean I should write `!(v === undefined && b.has(k)) || b.get(k) !== v`? – Bergi Dec 31 '21 at 16:53
  • I mean you call `.get()`, and only call `.has()` if the value from `.get()` is `undefined`. – jfriend00 Dec 31 '21 at 16:57
  • @Bergi Redu's answer suggests simply copying the map, then using a subtractive approach to the copy. How might that compare to this add-one-at-a-time approach in terms of performance? – Seth Lutske Dec 31 '21 at 16:59
  • 1
    @SethLutske It'll be `O(size(a) + size(b))`, whereas my approach is `O(size(a))`. So unless `b` is significantly smaller than `a`, and unless copying the map all-at-once is faster than copying it one-at-a-time (maybe because of special-cased memory allocation?), I doubt it's faster. But really, [you should just race your horses](https://ericlippert.com/2012/12/17/performance-rant/). – Bergi Dec 31 '21 at 17:06
1

You could iterate the first map and check against the second.

const
    a = new Map([['item1', 'item1'], ['item2', 'item2']]),
    b = new Map([['item1', 'item1']]),
    difference = (a, b) => {
        const d = new Map;

        a.forEach((v, k) => {
            if (!b.has(k) || b.get(k) !== v) d.set(v, k);
        });
        return d;
    }

console.log([...difference(a, b)])
Nina Scholz
  • 376,160
  • 25
  • 347
  • 392
  • I thought of something like this, I just wasn't sure if a for loop was going to be the most efficient way to do it. Any thoughts on this method's cpu impact? – Seth Lutske Dec 31 '21 at 15:49
  • This can be improved by not doing both `b.has()` and `b.get()`. That's double lookup. You only have to do the double lookup if `b.get()` returns `undefined`. – jfriend00 Dec 31 '21 at 16:22
  • I think you messed up the condition. Your snippet results in an empty map but shouldn't. – Bergi Dec 31 '21 at 16:26
  • @jfriend00 a proper JIT compiler will eliminate the second lookup. – Jonas Wilms Dec 31 '21 at 16:30
  • sorry, please see edit. – Nina Scholz Dec 31 '21 at 16:47
  • @JonasWilms - And, what evidence do you have that the JIT here is actually avoiding the double lookup and that modifying the code to do so wouldn't be useful? – jfriend00 Dec 31 '21 at 16:53
0

Perhaps like this by maintaining a;

var a = new Map(),
    b = new Map(),
    c;

a.set('item1', 'item1value');
a.set('item2', 'item2value');
b.set('item1', 'item1value');

c = new Map(a);
b.forEach((_,k) => c.delete(k));

console.log(c); // try in dev tools to see result
Redu
  • 25,060
  • 6
  • 56
  • 76
  • Thank you for this. Would you mind explaining this method's effect on cpu usage? One of the commenters mentioned this method doesn't allocate any heap memory, which sounds good, though I'm not sure why, or how this method differs from the additive method in Bergi's answer. Mind elaborating a bit? – Seth Lutske Dec 31 '21 at 16:25
  • 1
    According to the OP's comments on the question, you should delete entries in `c` only if they also have the same value, not just the same key – Bergi Dec 31 '21 at 16:28
  • @Bergi Couldn't elaborate what you say from the question. If that's required than one might try `b.forEach((v,k) => c.has(k) && (c.get(k) === v) && c.delete(k));` guaranteeing possible keys with `undefined` values to be deleted. – Redu Dec 31 '21 at 16:41
  • @SethLutske Sorry no, i can not. It might be helpful to set up a benchmark run at one of the online suits like [jsben.ch](https://jsben.ch/), [jsbench.me](https://jsbench.me/) or [Perflink](https://perf.link/) and post the results here. – Redu Dec 31 '21 at 16:49