I think a simple choice that is quadratic in the length of the list of words is to verify that every pairing of list indices has the same equality characteristics in the two lists. I'll assume that you get the "words" and "pattern" as lists already and don't need to parse out spaces and whatever -- that ought to be a separate function's responsibility anyway.
function matchesPatternReference(words, pattern) {
if(words.length !== pattern.length) return false;
for(var i = 0; i < words.length; i++)
for(var j = i+1; j < words.length; j++)
if((words[i] === words[j]) !== (pattern[i] === pattern[j]))
return false;
return true;
}
A slightly better approach would be to normalize both lists, then compare the normalized lists for equality. To normalize a list, replace each list element by the number of unique list elements that appear before its first occurrence in the list. This will be linear in the length of the longer list, assuming you believe hash lookups and list appends take constant time. I don't know enough Javascript to know if these are warranted; certainly at worst the idea behind this algorithm can be implemented with suitable data structures in n*log(n) time even without believing that hash lookups are constant time (a somewhat questionable assumption no matter the language).
function normalize(words) {
var next_id = 0;
var ids = {};
var result = [];
for(var i = 0; i < words.length; i++) {
if(!ids.hasOwnProperty(words[i])) {
ids[words[i]] = next_id;
next_id += 1;
}
result.push(ids[words[i]]);
}
return result;
}
function matchesPatternFast(words, pattern) {
return normalize(words) === normalize(pattern);
}
Note: As pointed out in the comments, one should check deep equality of the normalized arrays manually, since ===
on arrays does an identity comparison in Javascript and does not compare elementwise. See also How to check if two arrays are equal with Javascript?.
Addendum: Below I argue that matchesPatternFast
and matchesPatternReference
compute the same function -- but use the faulty assumption that ===
on arrays compares elements pointwise rather than being a pointer comparison.
We can define the following function:
function matchesPatternSlow(words, pattern) {
return matchesPatternReference(normalize(words), normalize(pattern));
}
I observe that normalize(x).length === x.length
and normalize(x)[i] === normalize(x)[j]
if and only if x[i] === x[j]
; therefore matchesPatternSlow
computes the same function as matchesPatternReference
.
I will now argue that matchesPatternSlow(x,y) === matchesPatternFast(x,y)
. Certainly if normalize(x) === normalize(y)
then we will have this property. matchesPatternFast
will manifestly return true
. On the other hand, matchesPatternSlow
operates by making a number of queries on its two inputs and verifying that these queries always return the same results for both lists: outside the loop, the query is function(x) { return x.length }
, and inside the loop, the query is function(x, i, j) { return x[i] === x[j]; }
. Since equal objects will respond identically to any query, it follows that all queries on the two normalized lists will align, matchesPatternSlow
will also return true
.
What if normalize(x) !== normalize(y)
? Then matchesPatternFast
will manifestly return false
. But if they are not equal, then either their lengths do not match -- in which case matchesPatternSlow
will also return false
from the first check in matchesPatternReference
as we hoped -- or else the elements at some index are not equal. Suppose the smallest mismatching index is i
. It is a property of normalize
that the element at index i
will either be equal to an element at index j<i
or else it will be one larger than the maximal element from indices 0
through i-1
. So we now have four cases to consider:
- We have
j1<i
and j2<i
for which normalize(x)[j1] === normalize(x)[i]
and normalize(y)[j2] === normalize(y)[i]
. But since normalize(x)[i] !== normalize(y)[i]
we then know that normalize(x)[j1] !== normalize(y)[i]
. So when matchesPatternReference
chooses the indices j1
and i
, we will find that normalize(x)[j1] === normalize(x)[i]
is true
and normalize(y)[j1] === normalize(y)[i]
is false
and immediately return false
as we are trying to show.
- We have
j<i
for which normalize(x)[j] === normalize(x)[i]
and normalize(y)[i]
is not equal to any previous element of normalize(y)
. Then matchesPatternReference
will return false
when it chooses the indices j
and i
, since normalize(x)
matches on these indices but normalize(y)
doesn't.
- We have
j<i
for which normalize(y)[j] === normalize(y)[i]
and normalize(x)[i]
is not equal to any previous element of normalize(x)
. Basically the same as in the previous case.
- We have that
normalize(x)[i]
is one larger than the largest earlier element in normalize(x)
and normalize(y)[i]
is one larger than the largest earlier element in normalize(y)
. But since normalize(x)
and normalize(y)
agree on all previous elements, this means normalize(x)[i] === normalize(y)[i]
, a contradiction to our assumption that the normalized lists differ at this index.
So in all cases, matchesPatternFast
and matchesPatternSlow
agree -- hence matchesPatternFast
and matchesPatternReference
compute the same function.