2

I have a plain string

"A <br> B <br/> C <br /> D <br/>" 

and a set of possible substrings, like

['<br>','<br/>','<br />'];

It's trivial to find the index of the nth occurrence of a particular string within a string, so I can find the index of the nth '<br>' or the nth '<br/>'. But how is it possible to find the nth occurrence of any of these strings?

For example, if I need the 2nd occurrence, in this case it would be at index 9, counting the first <br> as the first occurrence and the <br/> that follows it as the second.

EDIT: Finding the index of the nth occurrence of a particular string can be done like this

var index = string.split('<br>', 2).join('<br>').length;

so I can find the occurrences of each string separately. The problem is finding the nth occurrence of any of these strings.

Aditya Menon
Sasha Grievus

2 Answers

3

You could try using Regular Expressions like this:

let testString="A <br> B <br/> C <br /> D <br/>";

//Array of substrings to be searched for
let testItems=['<br>','<br/>','<br />']; 

//Construction of the regular expression from the given array
let regex=new RegExp(testItems.join("|"),"g"); 

//Getting all matches for the provided regex and reducing it to an array of indices only
let indices=[...testString.matchAll(regex)].map(match=>match.index);

console.log(indices);

The nth occurrence can be easily recovered from the indices array. You could also modify this if you need to know which substring was hit as well.
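As a minimal sketch of that last step, the snippet below wraps the same `matchAll` approach in a helper that returns the index of the nth match (1-based) together with the substring that was hit. The names `nthMatch` and `escapeRegex` are hypothetical, and the escaping step is an added assumption so that search terms are treated literally rather than as regex syntax.

```javascript
// Hypothetical helper: escape regex metacharacters so each term matches literally.
const escapeRegex = (s) => s.replace(/[.*+\-?^${}()|[\]\\]/g, '\\$&');

// Hypothetical helper: return { index, match } for the nth (1-based) occurrence
// of any of the given terms, or null when there are fewer than n matches.
const nthMatch = (text, terms, n) => {
  const regex = new RegExp(terms.map(escapeRegex).join('|'), 'g');
  const matches = [...text.matchAll(regex)];
  const hit = matches[n - 1];
  return hit ? { index: hit.index, match: hit[0] } : null;
};

const testString = 'A <br> B <br/> C <br /> D <br/>';
const testItems = ['<br>', '<br/>', '<br />'];

console.log(nthMatch(testString, testItems, 2)); // { index: 9, match: '<br/>' }
console.log(nthMatch(testString, testItems, 5)); // null
```

Returning `null` instead of `-1` is just one possible convention here; a caller that wants to mirror `indexOf` could map a missing result to `-1`.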

UPDATE

The above answer does not take into account the possibility of search items containing regex special characters. To handle that scenario one must manually escape the inputs as shown below.

let testString="A <br> B <br/> C <br /> D <br/> E <br $> F <br []/>";

//Array of substrings to be searched for
let testItems=['<br>','<br/>','<br />','<br $>','<br []/>']; 

//Function to return an escaped version of the input string
let escapeRegex=(string)=> string.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

//Construction of the regular expression from the given array
let regex=new RegExp(testItems.map(item=>escapeRegex(item)).join("|"),"g");

//Getting all matches for the provided regex and reducing it to an array of indices only
let indices=[...testString.matchAll(regex)].map(match=>match.index);

console.log(indices);

The function for escaping the input was borrowed from an answer to the question "Is there a RegExp.escape function in JavaScript?"

Thank you to @ScottSauyet and @Ivar for their feedback

Aditya Menon
  • 1
    This works, but with a pretty serious restriction: you can't use regex special characters in your search terms. For instance, searching a string like `'A … B … C … D …'` for three such terms yields only `[2]`.
    – Scott Sauyet Dec 07 '20 at 14:17
  • @ScottSauyet That is absolutely true. Thanks for pointing it out! – Aditya Menon Dec 07 '20 at 14:25
  • 1
    Can be quite easily resolved [by escaping](https://stackoverflow.com/questions/3561493/is-there-a-regexp-escape-function-in-javascript) the entries before joining them. – Ivar Dec 07 '20 at 14:26
  • I assumed that when you passed it through `new RegExp` it would automatically escape everything. It works for `/`, as it becomes `\/`. I didn't try with any other special characters – Aditya Menon Dec 07 '20 at 14:29
  • If the `RegExp` constructor escaped everything, it would not be possible to pass a regex to it. A slash only needs to be escaped inside a regex literal (`/.../`) because the literal uses `/` to mark its end. If you're passing the pattern as a string to `RegExp`, that's not the case, so escaping the slash is not necessary. I would encourage you to update your answer to include the escaping. That way it might be more useful to other people as well. (Or even to OP if the input is not static.) – Ivar Dec 07 '20 at 14:37
  • 1
    @AdityaMenon: What Ivar is suggesting could be accomplished with just `let regex = new RegExp(testItems.map(s => s.replace(/[.*+\-?^${}()|[\]\\]/g, '\\$&')).join("|"),"g")` – Scott Sauyet Dec 07 '20 at 16:44
2

A recursive solution makes sense here, so long as you're not expecting to look for the 10,000th match or some such. Here's one approach:

const indexOfAny = (string, substrings, pos = 0) => {
  const positions = substrings .map (ss => string .indexOf (ss, pos)) 
                               .filter (n => n > -1)
  return positions .length == 0 ? -1 : Math .min (... positions)
}

const nthOfAny = (n, string, substrings, pos = 0) => {
  const first = indexOfAny (string, substrings, pos)
  return n <= 1
    ? first
    : first == -1
      ? -1
      : nthOfAny (n - 1, string, substrings, 1 + indexOfAny (string, substrings, pos))
}

const string = 'A <br> B <br/> C <br /> D <br/>'
const substrings = ['<br>','<br/>','<br />']

console.log (
  nthOfAny (2, string, substrings)
)

console.log (
  [1, 2, 3, 4, 5, 6] .map (n => nthOfAny (n, string, substrings))
)

We first define indexOfAny which takes a string, an array of substrings to search for, and (optionally) an initial position, and it returns the position of the first one found, or -1 if none is found.

Then nthOfAny takes an ordinal (1 for first, 2 for second, etc.) and the same arguments as above and recursively finds that one by increasing pos to the previously found one and decreasing n until it hits a base case that just returns the result of indexOfAny.
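If recursion depth is a concern for large n, the same idea can be sketched as a loop. This is only an illustrative variant, not part of the solution above: the name `nthOfAnyLoop` is hypothetical, and `indexOfAny` is repeated so the snippet runs on its own.

```javascript
// Position of the earliest match of any substring at or after pos, or -1.
const indexOfAny = (string, substrings, pos = 0) => {
  const positions = substrings
    .map((ss) => string.indexOf(ss, pos))
    .filter((i) => i > -1);
  return positions.length === 0 ? -1 : Math.min(...positions);
};

// Hypothetical iterative variant: advance past each hit until the nth one.
// As in the recursive version, n <= 1 returns the first match.
const nthOfAnyLoop = (n, string, substrings) => {
  let index = -1;
  for (let i = 0; i < Math.max(n, 1); i++) {
    index = indexOfAny(string, substrings, index + 1);
    if (index === -1) return -1; // fewer than n matches
  }
  return index;
};

const string = 'A <br> B <br/> C <br /> D <br/>';
const substrings = ['<br>', '<br/>', '<br />'];

console.log([1, 2, 3, 4, 5, 6].map((n) => nthOfAnyLoop(n, string, substrings)));
// [2, 9, 17, 26, -1, -1]
```

Because each search restarts one character after the previous hit, this style counts overlapping matches (for example, 'aa' is found at both 0 and 1 in 'aaa'), whereas a global regex resumes after the end of each match.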

There is a fair bit of extra complexity meant to use -1 to signal that nothing is found, to match indexOf. A simpler solution would return Infinity for that case:

const indexOfAny = (string, substrings, pos = 0) => 
  Math.min (
    ... substrings .map (ss => string .indexOf (ss, pos)) .filter (n => n > -1)
  )

const nthOfAny = (n, string, substrings, pos = 0) => 
  n <= 1
    ? indexOfAny (string, substrings, pos)
    : nthOfAny (n - 1, string, substrings, 1 + indexOfAny (string, substrings, pos))
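A short usage sketch of this Infinity-returning variant, with the two functions repeated so it runs standalone. The `Number.isFinite` check at the end is an assumed calling convention, not something the solution prescribes.

```javascript
// Earliest match of any substring at or after pos; Math.min() of no
// arguments is Infinity, which serves as the "not found" value.
const indexOfAny = (string, substrings, pos = 0) =>
  Math.min(
    ...substrings.map((ss) => string.indexOf(ss, pos)).filter((n) => n > -1)
  );

const nthOfAny = (n, string, substrings, pos = 0) =>
  n <= 1
    ? indexOfAny(string, substrings, pos)
    : nthOfAny(n - 1, string, substrings, 1 + indexOfAny(string, substrings, pos));

const string = 'A <br> B <br/> C <br /> D <br/>';
const substrings = ['<br>', '<br/>', '<br />'];

console.log([1, 2, 3, 4, 5, 6].map((n) => nthOfAny(n, string, substrings)));
// [2, 9, 17, 26, Infinity, Infinity]

// Assumed convention: treat any non-finite result as "no nth match".
const fifth = nthOfAny(5, string, substrings);
console.log(Number.isFinite(fifth) ? fifth : 'not found'); // 'not found'
```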
Scott Sauyet