1

I need to get the index position of a submatched string. From the regular expression documentation, I learned that the FirstIndex property gives the position of a matched string.

But this works only for a top-level (one-dimensional) match. I couldn't apply FirstIndex to submatches. Please refer to the sample matches below.

I tried this:

        Dim myRegExp As Object, match As Object
        Set myRegExp = CreateObject("VBScript.RegExp")
        myRegExp.Pattern = find
        If myRegExp.Test(text) Then
            Set match = myRegExp.Execute(text)
            Debug.Print match(0).SubMatches(0) ' this is the matched string
        End If

Where should I call FirstIndex to get the position of a submatched string?

output:

match(0)=>Berry, Brent. (2006). What accounts for race and ethnic differences in  Berry, 
Brent. parental financial transfers to adult children in the United States? Journal of Family
Issues 37:1583-1604.   

submatches(0)=>Berry, Brent.
submatches(6)=>2006

EXPECTED OUTPUT:

submatches(0) at 0th position
submatches(6) at 16th position and so on
Learning

2 Answers

4

You can't apply .FirstIndex to SubMatches(x) because it returns a String, not a Match. If the groups return unique matches, you can find their locations by simply using the InStr function:

With CreateObject("VBScript.RegExp")
    .Pattern = Find
    If .Test(text) Then
        Set match = .Execute(text)
        Debug.Print InStr(1, text, match(0).SubMatches(0)) '1 (InStr is 1-based; subtract 1 to compare with .FirstIndex)
        Debug.Print InStr(1, text, match(0).SubMatches(5)) '16
        'and so on
    End If
End With
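
One caveat with searching from position 1 is that the captured text may also occur before the match starts. A minimal sketch that restricts the search to the parent match is shown here; `SubMatchIndex` is a hypothetical helper name, not part of VBScript.RegExp, and it still assumes the captured text occurs only once inside the match:

```vba
' Hypothetical helper: 0-based position of one submatch, searched only
' inside its parent match. Returns -1 if nothing usable is found.
Function SubMatchIndex(ByVal text As String, ByVal m As Object, _
                       ByVal groupIndex As Long) As Long
    Dim captured As String, hit As Long
    SubMatchIndex = -1
    captured = "" & m.SubMatches(groupIndex)   ' an unmatched group becomes ""
    If Len(captured) = 0 Then Exit Function
    ' .FirstIndex is 0-based, InStr is 1-based
    hit = InStr(m.FirstIndex + 1, text, captured)
    If hit = 0 Then Exit Function
    ' accept the hit only if it lies entirely inside the match
    If hit + Len(captured) - 1 <= m.FirstIndex + m.Length Then
        SubMatchIndex = hit - 1
    End If
End Function
```

Called as `SubMatchIndex(text, match(0), 0)`, it returns a value on the same 0-based scale as `.FirstIndex`.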

If the groups will not return unique results, you can track the position of the last match and loop through the results. Note that VBScript.RegExp doesn't support look-behinds, so you don't have to take the length of the matches into account:

With CreateObject("VBScript.RegExp")
    .Pattern = find
    If .Test(text) Then
        Set match = .Execute(text)
        Dim i As Long, pos As Long, found As String
        pos = 1
        For i = 0 To match(0).SubMatches.Count - 1
            found = match(0).SubMatches(i)
            pos = InStr(pos, text, found)
            Debug.Print found, pos
        Next
    End If
End With
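
The loop above restarts each search at the position of the previous hit, so two consecutive groups that capture identical text (for example `(a)(a)`) would report the same position. A hedged variant that resumes after each capture is sketched below. `SubMatchPositions` is a hypothetical helper name, and the sketch assumes the groups are not nested and do not overlap. It remains a best-effort guess whenever the captured text also occurs in an uncaptured part of the match.

```vba
' Hypothetical helper: returns an array with the 0-based position of each
' submatch of m, or -1 where a group captured nothing or was not found.
' Assumes capture groups are not nested and do not overlap.
Function SubMatchPositions(ByVal text As String, ByVal m As Object) As Variant
    Dim i As Long, pos As Long, hit As Long, found As String
    Dim result() As Long
    If m.SubMatches.Count = 0 Then
        SubMatchPositions = Array()
        Exit Function
    End If
    ReDim result(0 To m.SubMatches.Count - 1)
    pos = m.FirstIndex + 1                  ' InStr is 1-based
    For i = 0 To m.SubMatches.Count - 1
        result(i) = -1
        found = "" & m.SubMatches(i)        ' an unmatched group becomes ""
        If Len(found) > 0 Then
            hit = InStr(pos, text, found)
            If hit > 0 Then
                result(i) = hit - 1         ' convert to 0-based
                pos = hit + Len(found)      ' resume after this capture
            End If
        End If
    Next i
    SubMatchPositions = result
End Function
```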
Comintern
  • What if there are multiple matches and the substring appears several times in the input string? – Wiktor Stribiżew Sep 14 '16 at 13:04
  • @WiktorStribiżew - That would match the behavior of `.FirstIndex`, although this does assume that the groups will return unique matches. – Comintern Sep 14 '16 at 13:09
  • So even FirstIndex returns only the first occurrence of the matched string? – Learning Sep 14 '16 at 13:21
  • @Learning - Yep. It returns a `Long`, so it can *only* return one result - it returns the first. – Comintern Sep 14 '16 at 13:23
  • Oh! So if the string has duplicates, what would be an effective way to get the positions of the strings? – Learning Sep 14 '16 at 13:25
  • @Learning - It depends on the pattern. For example, `(foo)(bar)?` can never have duplicate submatches. If you're matching something like `foo` in "foobarfoobar", you'd get 2 *matches*, each with one submatch. Note that if you have nested groups, i.e. `foo(b(az|ar))`, you would need to make the inner groups non-capturing with this solution. – Comintern Sep 14 '16 at 13:42
4

The SubMatches collection contains strings:

A SubMatches collection contains individual submatch strings, ...

Each item in the SubMatches collection is the string found and captured by the regular expression.

So you can't get the positions/indices.
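
One possible workaround, offered only as a sketch: if the pattern can be rewritten so that it consists entirely of consecutive, non-nested capture groups, the submatch strings partition the match, and each position follows from `.FirstIndex` plus the lengths of the preceding submatches. `PartitionedOffsets` is a hypothetical helper name, and the rewritten pattern is an assumption about what the caller is able to change.

```vba
' Hypothetical helper: 0-based offsets of every submatch, assuming the
' pattern is made up solely of consecutive, non-nested capture groups.
' Returns Empty if the submatch lengths do not add up to the match length.
Function PartitionedOffsets(ByVal m As Object) As Variant
    Dim i As Long, offset As Long
    Dim result() As Long
    If m.SubMatches.Count = 0 Then Exit Function
    ReDim result(0 To m.SubMatches.Count - 1)
    offset = m.FirstIndex
    For i = 0 To m.SubMatches.Count - 1
        result(i) = offset
        offset = offset + Len("" & m.SubMatches(i))
    Next i
    ' sanity check: the groups must cover the whole match
    If offset - m.FirstIndex = m.Length Then PartitionedOffsets = result
End Function
```

For instance, `(\w+), (\w+)` would become `(\w+)(, )(\w+)` so that the separator is captured as well.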

Ekkehard.Horner