I can check whether a list's elements appear sequentially within another list (as described here), but I am having trouble with some 'messy' data.
For example:
var source = ['the', 'dog', 'therefore', 'he', 'gets', 'a', 'treat'];
var search = ['there', 'fore', 'gets', 'treat'];
There are two ways this query data is 'messy'. First, some of the search terms have been split apart ('there', 'fore'). Second, some elements of the source are omitted from the search ('he', 'a').
How can I find the starting and ending index of the 'messy' search list members in the source list? (In my example above, I would want to get back [2, 6], which corresponds to 'therefore' @ 2 and 'treat' @ 6 in the source list.)
Your problem is underspecified.
What's the result for source = ['a', 'aa', 'a', 'b', 'a'],
search = ['a', 'a']? Is it [0, 4] or [0, 2] or [1, 1] or ...?
You could e.g. ask for the first, longest matching 'messy' subsequence. – le_m
Good point and good question. I only need to skip single elements when searching source, and would want to get back the first match (and could extend the function to include a starting index in search).
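For what it's worth, here is a minimal sketch of one way to read that clarified spec. It is not a definitive answer. It assumes that (a) a source element matches when one or more consecutive search terms concatenate to exactly that element, (b) at most one source element may be skipped between two matched elements, (c) the match must start and end on a matched element, and (d) the match with the lowest starting index wins. The names findMessyRange, matchFrom and consumeWord are hypothetical helpers made up for this illustration.

```javascript
// Try to consume search terms starting at search[j] so that their
// concatenation equals `word`. Returns the index of the next unused
// search term, or -1 if no such run of terms exists.
function consumeWord(word, search, j) {
  const first = j;
  let built = '';
  while (j < search.length && built.length < word.length) {
    built += search[j++];
    if (!word.startsWith(built)) return -1;
  }
  // Require at least one search term to have been used.
  return built === word && j > first ? j : -1;
}

// Try to match search[j..] against source[i..] with backtracking.
// `canSkip` says whether source[i] may be skipped (assumption: only
// single elements are skipped, so two skips in a row are not allowed).
// Returns the index of the last matched source element, or -1.
function matchFrom(source, search, i, j, canSkip) {
  if (i >= source.length) return -1;
  const next = consumeWord(source[i], search, j);
  if (next !== -1) {
    if (next === search.length) return i; // every search term is used
    const end = matchFrom(source, search, i + 1, next, true);
    if (end !== -1) return end;
  }
  // Fall back to skipping this one source element, if allowed.
  return canSkip ? matchFrom(source, search, i + 1, j, false) : -1;
}

// Returns [start, end] for the first match, or null if there is none.
function findMessyRange(source, search) {
  if (search.length === 0) return null;
  for (let start = 0; start < source.length; start++) {
    const end = matchFrom(source, search, start, 0, false);
    if (end !== -1) return [start, end];
  }
  return null;
}

var source = ['the', 'dog', 'therefore', 'he', 'gets', 'a', 'treat'];
var search = ['there', 'fore', 'gets', 'treat'];
console.log(findMessyRange(source, search)); // [ 2, 6 ]
```

Under these assumptions, le_m's example (source = ['a', 'aa', 'a', 'b', 'a'], search = ['a', 'a']) gives [0, 2]: the match starts at the first 'a', skips 'aa' as a single element, and ends on the second 'a'. If a different rule is wanted (for example, the longest match, or allowing several skipped elements in a row), the canSkip handling in matchFrom is the part that would need to change.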