2

I am using a function to extract numbers from string with conditions that number with 8 digits and does not contain characters (. , @).
It works with 8 digits , but if the number is followed by characters (. , @) ,it also extract that number and that not required.
This my string 11111111 12345678.1 11111112 11111113 and the expected output is 11111111 11111112 11111113 without 12345678.1.
I tried to use negative Lookahead \d{8}(?!.,@) but it is useless.
Thanks all for your help.

Function Find8Numbers(st As String) As Variant
    Dim regex As New RegExp
     Dim matches As MatchCollection, mch As match
 
     regex.Pattern = "\d{8}"      'Look for variable length numbers only
     regex.IgnoreCase = True
     regex.Global = True
     regex.MultiLine = True
 
    If (regex.Test(st) = True) Then
      Set matches = regex.Execute(st)   'Execute search
        For Each mch In matches
              Find8Numbers = LTrim(Find8Numbers) & " " & mch.value
        Next
    End If
End Function
Leedo
  • 439
  • 5
  • 18
  • I would just use find() and not vba - lots of examples exist on here. – Solar Mike Aug 03 '22 at 06:26
  • @Solar Mike ,that function is just a piece on my vba code – Leedo Aug 03 '22 at 06:34
  • 1
    When working with regex it always pays to test your regex expression against a sample of text outside of the VBA environment. There are many sites on the internet that can do this test, e.g. https://www.freeformatter.com/regex-tester.html – freeflow Aug 03 '22 at 11:23
  • @freeflow ,thanks and that is a great info – Leedo Aug 03 '22 at 12:36
  • Actually, a regex tester is built into the free and fantastic Rubberduck addin for VBA. – freeflow Aug 03 '22 at 17:29

2 Answers2

4

In line with your question and current attempt, you could indeed use regex:

Function Find8Numbers(st As String) As String

With CreateObject("vbscript.regexp")
    .Pattern = "(?:^|\s)(\d{8})(?![.,@\d])"
    .Global = True
    If .Test(st) Then
        Set Matches = .Execute(st)
        For Each mch In Matches
            Find8Numbers = LTrim(Find8Numbers & " " & mch.submatches(0))
        Next
    End If
End With

End Function

Invoke through:

Sub Test()

Dim s As String: s = "11111111  12345678.1 11111112 11111113"
Debug.Print Find8Numbers(s)

End Sub

Prints:

11111111 11111112 11111113

Pattern used:

(?:^|\s)(\d{8})(?![.,@\d])

See an online demo

  • (?:^|\s) - No lookbehind in VBA thus used a non-capture group to match start-line anchor or whitespace;
  • (\d{8}) - Exactly 8 digits in capture group;
  • (?![.,@\d]) - Negative lookahead to assert position isn't followed by any of given characters including digits.
JvdV
  • 70,606
  • 8
  • 39
  • 70
  • 1
    Woww, this is very well explained and very informative answer, thanks for sharing, cheers. So vb script supports PCRE regex OR it follows POSIX one, just asking for info. – RavinderSingh13 Aug 03 '22 at 07:29
  • 1
    @RavinderSingh13 Thanks! Also, VBScript follows Perl-style regular expressions like JavaScript but lacks a bunch of features. See [here](https://stackoverflow.com/a/60192760/9758194). – JvdV Aug 03 '22 at 07:43
  • 1
    It works perfectly . I see also you changed `mch.value` with **mch.SubMatches(0)** , Please what is the difference ? – Leedo Aug 03 '22 at 08:27
  • 1
    @Leedo, it's a reference to the 1st capture group. In the pattern I wanted to exclude possible leading whitespace from the match since `Trim()` does not replace extra spaces inside string; I wanted to prevent using `Application.Trim()` thus opted for a capture group. – JvdV Aug 03 '22 at 08:29
  • @Leedo, you could revert back to work without a capture group, but `\s` is shorthand for any type of whitespace character. Meaning; you could have newline, tabs etc. before numbers and you want to return a space delimited string, hence my choice for a capture group. – JvdV Aug 03 '22 at 08:41
1

I'm not sure you need Regex for what is a reasonably simple pattern. You could just go with a VBA solution:

Public Function Find8Numbers(str As String) As String
    Dim c As String, c1 As String
    Dim i As Long, numStart As Long
    Dim isNumSeq As Boolean
    Dim result As String
    
    If Len(str) < 8 Then Exit Function
    
    For i = 1 To Len(str)
        c = Mid(str, i, 1)
        If i = Len(str) Then
            c1 = ""
        Else
            c1 = Mid(str, i + 1, 1)
        End If
        If c >= "0" And c <= "9" Then
            If isNumSeq Then
                If i - numStart + 1 = 8 Then
                    If c1 <> "." And c1 <> "," And c1 <> "@" Then
                        If result <> "" Then result = result & " "
                        result = result & Mid(str, numStart, 8)
                        isNumSeq = False
                    End If
                End If
            Else
                If i > Len(str) - 8 + 1 Then Exit For
                isNumSeq = True
                numStart = i
            End If
        Else
            isNumSeq = False
        End If
    Next
    Find8Numbers = result
End Function
Ambie
  • 4,872
  • 2
  • 12
  • 26