1

I need to extract a 3-digit number from text strings in an excel sheet. I've written my custom function to return a substring based on a pattern, that's fine, it works. It's the pattern that's beating me. So, examples could include:

  • Lorem ipsum dolor cc123 sit amet
  • Lorem ipsum dolor sit amet cc 123
  • Lorem ipsum c 123, dolor sit amet

I'm no expert in regex, but usually I'd use (?<=\D)\d{3}\b That works with all the tests I've tried but VBA doesn't seem to support (?<=x). Using just \D\d{3}\b and trimming the leading char works, but offends my senses.

I'm using the standard VBscript Regular Expressions 5.5 library.

The practical background to this is that I have a large worksheet that has a comments field on each line. Policy has been that users include their cost centre somewhere in this field. It can appear anywhere in the field, it's always a three-digit number, it is always followed by a word boundary and it is always preceded by a non-number char.

PerryW
  • 1,426
  • 1
  • 15
  • 25
  • Excel (VB) doesn't support [positive lookbehind](https://stackoverflow.com/questions/9150552/lookbehind-on-regex-for-vba). There's a wee function [here](https://stackoverflow.com/questions/47662112/simulate-a-general-positive-lookbehind-for-vba-regex) that simulates the functionality in VB. – neophlegm Dec 08 '17 at 00:25
  • BTW, I know there could be lots of exception cases - I'll deal with them. I also know that I could approach this through string manipulation - I just have a tried and trusted regex function and I'd like to use that if possible – PerryW Dec 08 '17 at 00:25
  • @neophlegm Yes... that looks like it would work. I'll have a play – PerryW Dec 08 '17 at 00:28
  • 1
    A pure-regex workaround could be to write something like this and then only return the first capture group: `\D(\d{3}\b)` – neophlegm Dec 08 '17 at 00:28

2 Answers2

0

you can go for a backreference solution:

\D(\d{3})\b 

if you are interested only in the 3 digits, you could use the regex described here-over and reuse the first capturing group \1 (might be $1 in VB)

Allan
  • 12,117
  • 3
  • 27
  • 51
0

My best guess is to use : \D(\d{3})\b like in

Dim myRegExp, ResultString, myMatches, myMatch As Match
Dim myRegExp As RegExp
Set myRegExp = New RegExp
myRegExp.Pattern = "\D(\d{3})\b"
Set myMatches = myRegExp.Execute(SubjectString)
If myMatches.Count >= 1 Then
    Set myMatch = myMatches(0)
    If myMatch.SubMatches.Count >= 1 Then
        ResultString = myMatch.SubMatches(1-1)
    Else
        ResultString = ""
    End If
Else
    ResultString = ""
End If
Andie2302
  • 4,825
  • 4
  • 24
  • 43