
I need the group values instead of the matches. This is how I tried to get them:

    Dim item As Variant, matches As Object, match As Object
    Dim i As Long, row As Long

    row = 1

    With regex
        .Pattern = "\bopenfile ([^\s]+)|\bopen file ([^\s]+)"
    End With

    For Each item In arr
        Set matches = regex.Execute(item)

        For Each match In matches
            ' SubMatches is zero-based and holds one entry per capture group
            For i = 0 To match.SubMatches.Count - 1
                ActiveSheet.Range("A" & row).Value = match.SubMatches(i)
                row = row + 1
            Next i
        Next match
    Next item

This is the text the values should be extracted from:

    Open File file.M_p3_23432e done
    Openfile file.M_p4_6432e done
    Open File file.M_p3_857432 done
    Open File file.M_p4_34892f done
    Openfile file.M_p3_781 done

Some help would be great :)

Info: I'm using Excel VBA, in case that matters.

lemurdroid
  • Did you try using a single capture group, and match the optional space? `\bopen ?file ([^\s]+)` – The fourth bird Sep 09 '22 at 09:19
  • Already tried it – lemurdroid Sep 09 '22 at 09:25
  • Should it be like this for the capture group 1 value? `subMatches(1)` So you loop all the matches, and per match get the `subMatches(1)` value. – The fourth bird Sep 09 '22 at 09:28
  • Please edit your question and show us the string from which the above code extracts matches. – FaneDuru Sep 09 '22 at 09:38
  • @FaneDuru done... – lemurdroid Sep 09 '22 at 10:16
  • "done", but it is not clear how this relates to the posted string. Your code processes an array. Is the above sample string what it looks like, composed of many lines, with the array in question obtained by splitting it on `vbCrLf`, or how? I tried that and it does not return anything... How do you load the array in question? – FaneDuru Sep 09 '22 at 10:59
  • I would simplify your regex to `"\bopen\s?file (\S+)"` or maybe even `"\bopen ?file (\S+)"` and then just cycle through `subMatches(1)`. In your current regex, you would have to check **both** submatches 1 and 2. – Ron Rosenfeld Sep 09 '22 at 11:02
  • Also, I am assuming your problem is with the regex and the matches, not the actual vba code. If that is not the case, please clarify. – Ron Rosenfeld Sep 09 '22 at 11:08
  • @RonRosenfeld yes, the problem is not with the code – lemurdroid Sep 12 '22 at 06:22

1 Answer


You can revamp the regex to match and capture with one capturing group:

    \bopen\s?file\s+(\S+)

See the regex demo.

Details:

  • `\b` - a word boundary
  • `open` - a fixed word
  • `\s?` - an optional whitespace character
  • `file` - a fixed word
  • `\s+` - one or more whitespace characters
  • `(\S+)` - Group 1: one or more non-whitespace characters.

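Now the file name of each match is always in `SubMatches(0)`, since the `SubMatches` collection is zero-based and the pattern has only one capturing group.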

Note that the regex must be used with the case-insensitive option, and with the global option if a string can contain multiple matches:

    With regex
        .Pattern = "\bopen\s?file\s+(\S+)"
        .IgnoreCase = True
        .Global = True
    End With
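
Putting the pieces together, the whole loop could look roughly like the sketch below. It is only an illustration under some assumptions: the procedure name `ExtractFileNames` is made up, the regex object is created with late binding via `CreateObject("VBScript.RegExp")`, and `arr` is filled with two of the sample lines from the question because the question does not show how the real array is loaded.

```vba
Option Explicit

' Sketch only: writes every captured file name to column A of the active sheet.
' ExtractFileNames and the sample array are hypothetical stand-ins.
Sub ExtractFileNames()
    Dim regex As Object, matches As Object, match As Object
    Dim arr As Variant, item As Variant
    Dim row As Long

    ' Assumption: stand-in for the asker's array, one log line per element
    arr = Array("Open File file.M_p3_23432e done", _
                "Openfile file.M_p4_6432e done")

    ' Late binding, so no library reference has to be set in the VBA editor
    Set regex = CreateObject("VBScript.RegExp")
    With regex
        .Pattern = "\bopen\s?file\s+(\S+)"
        .IgnoreCase = True   ' the sample text uses "Open File" and "Openfile"
        .Global = True       ' return every match in a string, not only the first
    End With

    row = 1
    For Each item In arr
        Set matches = regex.Execute(CStr(item))
        For Each match In matches
            ' Only one capturing group, so the file name is SubMatches(0)
            ActiveSheet.Range("A" & row).Value = match.SubMatches(0)
            row = row + 1
        Next match
    Next item
End Sub
```

Because the pattern has a single group, there is no need to loop over `SubMatches` or to check which alternative matched; each match contributes exactly one value.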
Wiktor Stribiżew