
I can check whether one list's elements appear sequentially within another list (as described here), but I am having trouble with some 'messy' data.

For example:

var source = ['the', 'dog', 'therefore', 'he', 'gets', 'a', 'treat'];
var search = ['there', 'fore', 'gets', 'treat']

There are two ways this query data is 'messy'. First, some of the search terms have been split into separate elements ('there', 'fore'). Second, some source elements are missing from the search list ('he', 'a').

How can I find the starting and ending indices of the 'messy' search list members in the source list? (In my example above, I would want to get back [2,6], which corresponds to 'therefore' at index 2 and 'treat' at index 6 in the source list.)


Your problem is underspecified. What's the result for source = ['a', 'aa', 'a', 'b', 'a'], search = ['a', 'a']? Is it [0, 4] or [0, 2] or [1, 1] or ...? You could e.g. ask for the first, longest matching 'messy' subsequence. – le_m

Good point and good question. I only need to skip single elements when searching source, and would want to get back the first match (and could extend the function to include a starting index in search).

jedierikb
  • Is this JavaScript or Python? Why both? – A. L May 18 '17 at 01:45
  • So essentially, you're asking to check whether two arrays are the same or not? Just run a `foreach` loop to check each element in `search` against each element in `source` for a direct match. Inside of the loop you can log the position that it doesn't match at. – Obsidian Age May 18 '17 at 01:45
  • @ObsidianAge `search`'s elements do not appear in sequence in `source`. Asking for help in handling missing elements (`he`,`a`) and elements in `source` split into two elements in `search`. – jedierikb May 18 '17 at 01:53
  • Your problem is underspecified. What's the result for `source = ['a', 'aa', 'a', 'b', 'a'], search = ['a', 'a']`? Is it `[0, 4]` or `[0, 2]` or `[1, 1]` or ...? You could e.g. ask for the first, longest matching 'messy' subsequence. – le_m May 18 '17 at 01:56
  • The criteria are not clear. Can you please describe them again? `[2,6]` corresponds to `therefore` and `treat`, right? – brk May 18 '17 at 01:57

1 Answer

I'm going to make some assumptions:

The values in search are unique, so no ['treat', 'treat'].

The values in source are also unique.

I can't really help with efficiency, but I hope this gives you a good idea of how to start.

var source = ['the', 'dog', 'therefore', 'he', 'gets', 'a', 'treat'];
var search = ['there', 'fore', 'gets', 'treat'];


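// Indices of the first and last source elements that contain a search term.
let start = -1;
let finish = -1;

for (const word of search)
{
  // Use a numeric index: for...in yields string keys, so comparisons
  // such as i > finish would be lexicographic ('10' < '9').
  for (let i = 0; i < source.length; i++)
  {
    // Substring test, so 'there' and 'fore' both hit 'therefore'.
    if (source[i].indexOf(word) !== -1)
    {
      if (start === -1)
      {
        start = finish = i;
      }
      else if (i > finish)
      {
        finish = i;
      }
      else if (i < start)
      {
        start = i;
      }
      // Only the first source element containing this word counts.
      break;
    }
  }
}

console.log(start, finish); // 2 6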
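
The loop above only records the lowest and highest matching index; it does not check that the search terms appear in order, and it does not limit how many source elements are skipped. Below is a minimal sketch of one way to cover the follow-up requirements from the question (first match, skip at most one source element between matches, consecutive search terms may join into one source element). The names `findMessySpan` and `matchFrom` are hypothetical, and the exact-concatenation rule is my assumption about what "separated" means; it is untuned and uses plain backtracking.

```javascript
// Hypothetical helper: returns [start, end] for the first match, or null.
// Assumptions: a source element matches either one search term exactly or
// the concatenation of several consecutive search terms, and at most one
// source element may be skipped between two matched source elements.
function findMessySpan(source, search) {
  // Try to match search[qIdx..] with a match anchored at source[srcIdx].
  // Returns the index of the last matched source element, or -1.
  function matchFrom(srcIdx, qIdx) {
    if (srcIdx >= source.length) return -1;
    let joined = '';
    for (let k = qIdx; k < search.length; k++) {
      joined += search[k];
      if (joined.length > source[srcIdx].length) break; // cannot match anymore
      if (joined !== source[srcIdx]) continue;          // try joining one more term
      if (k + 1 === search.length) return srcIdx;       // all search terms used
      // Continue at the next source element, or skip exactly one.
      for (const next of [srcIdx + 1, srcIdx + 2]) {
        const end = matchFrom(next, k + 1);
        if (end !== -1) return end;
      }
    }
    return -1;
  }

  if (search.length === 0) return null;
  // Scanning start positions left to right yields the first match.
  for (let start = 0; start < source.length; start++) {
    const end = matchFrom(start, 0);
    if (end !== -1) return [start, end];
  }
  return null;
}

console.log(findMessySpan(
  ['the', 'dog', 'therefore', 'he', 'gets', 'a', 'treat'],
  ['there', 'fore', 'gets', 'treat']
)); // [2, 6]
```

With the example from the comments, `findMessySpan(['a', 'aa', 'a', 'b', 'a'], ['a', 'a'])` gives `[0, 2]` under these assumptions, because the scan stops at the earliest start index that works.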
A. L