1

I have a large list (a table with one field) of non-standardized strings, imported from a poorly managed legacy database. I need to extract the single-digit number (surrounded by spaces) that occurs exactly once in each of those strings (though the strings have other multi-digit numbers sometimes too). For example, from the following string:

"Quality Assurance File System And Records Retention Johnson, R.M. 004 4 2999 ss/ds/free ReviewMo = Aug Effective 1/31/2012 FileOpen-?"

I would want to pull the number 4 (or 4's position in the string, i.e. 71)

I can use

WHERE rsLegacyList.F1 LIKE "* # *" 

inside a select statement to find if each string has a lone digit, and thereby filter my list. But it doesn't tell me where the digit is so I can extract the digit itself (with mid() function) and start sorting the list. The goal is to create a second field with that digit by itself as a method of sorting the larger strings in the first field.

Is there a way to use Instr() along with regular expressions to find where a regular expression occurs within a larger string? Something like

intMarkerLocation = instr(rsLegacyList.F1, Like "* # *")

but that actually works?

I appreciate any suggestions, or workarounds that avoid the problem entirely.


@Lee Mac, I made a function RegExFindStringIndex as shown here:

Public Function RegExFindStringIndex(strToSearch As String, strPatternToMatch As String) As Integer

    Dim regex                       As RegExp
    Dim Matching                    As Match

    Set regex = New RegExp

    With regex
        .MultiLine = False
        .Global = True
        .IgnoreCase = False
        .Pattern = strPatternToMatch
        Matching = .Execute(strToSearch)
        RegExFindStringIndex = Matching.FirstIndex
    End With

    Set regex = Nothing
    Set Matching = Nothing
End Function

But it gives me an error Invalid use of property at line Matching = .Execute(strToSearch)

braX
  • 11,506
  • 5
  • 20
  • 33
ErrorMeck
  • 43
  • 8

2 Answers2

1

Using Regular Expressions

If you were to use Regular Expressions, you would need to define a VBA function to instantiate a RegExp object, set the pattern property to something like \s\d\s (whitespace-digit-whitespace) and then invoke the Execute method to obtain a match (or matches), each of which will provide an index of the pattern within the string. If you want to pursue this route, here are some existing examples for Excel, but the RegExp manipulation will be identical in MS Access.

Here is an example function demonstrating how to use the first result returned by the Execute method:

Public Function RegexInStr(strStr As String, strPat As String) As Integer
    With New RegExp
        .Multiline = False
        .Global = True
        .IgnoreCase = False
        .Pattern = strPat
        With .Execute(strStr)
            If .Count > 0 Then RegexInStr = .Item(0).FirstIndex + 1
        End With
    End With
End Function

Note that the above uses early binding and so you will need to add a reference to the Microsoft VBScript Regular Expressions 5.5 library to your project.

Example Immediate Window evaluation:

?InStr("abc 1 123", " 1 ")
 4 
?RegexInStr("abc 1 123", "\s\w\s")
 4 

Using InStr

An alternative using the in-built instr function within a query might be the following inelegant (and probably very slow) query:

select
    switch
    (
        instr(rsLegacyList.F1," 0 ")>0,instr(rsLegacyList.F1," 0 ")+1,
        instr(rsLegacyList.F1," 1 ")>0,instr(rsLegacyList.F1," 1 ")+1,
        instr(rsLegacyList.F1," 2 ")>0,instr(rsLegacyList.F1," 2 ")+1,
        instr(rsLegacyList.F1," 3 ")>0,instr(rsLegacyList.F1," 3 ")+1,
        instr(rsLegacyList.F1," 4 ")>0,instr(rsLegacyList.F1," 4 ")+1,
        instr(rsLegacyList.F1," 5 ")>0,instr(rsLegacyList.F1," 5 ")+1,
        instr(rsLegacyList.F1," 6 ")>0,instr(rsLegacyList.F1," 6 ")+1,
        instr(rsLegacyList.F1," 7 ")>0,instr(rsLegacyList.F1," 7 ")+1,
        instr(rsLegacyList.F1," 8 ")>0,instr(rsLegacyList.F1," 8 ")+1,
        instr(rsLegacyList.F1," 9 ")>0,instr(rsLegacyList.F1," 9 ")+1,
        true, null
    ) as intMarkerLocation
from
    rsLegacyList
where 
    rsLegacyList.F1 like "* # *" 
Community
  • 1
  • 1
Lee Mac
  • 15,615
  • 6
  • 32
  • 80
  • I appreciate both of these, and the edits to the question, thank you! I will go try these ideas and report back. – ErrorMeck Jan 16 '20 at 14:36
  • question edited with my attempt to make a VBA function. It give me an _Invalid Use of Property_ error. – ErrorMeck Jan 16 '20 at 16:39
  • @ErrorMeck You're welcome - I have updated my above answer. – Lee Mac Jan 16 '20 at 18:56
  • Success! I found that I had to add 1 again, which is odd to me, but it might be that access vba is using a different index base than I thought it did. But I got this to work, thank you!! – ErrorMeck Jan 16 '20 at 19:23
  • @ErrorMeck You're most welcome - I'm glad it's working for you. You would indeed need to add another 1 to account for searching for the digit surrounded by whitespace (else you would receive the position of the whitespace). I included the addition of 1 within the function to emulate the return of `instr` which is one-based, whereas the `firstindex` property is zero-based. – Lee Mac Jan 16 '20 at 22:29
0

How about:

select
    instr(rsLegacyList.F1, " # ") + 1 as position
from rsLegacyList.F1
where rsLegacyList.F1 LIKE "* # *"
Bohemian
  • 412,405
  • 93
  • 575
  • 722
  • `Instr` Doesn't recognize regular expressions, so `Instr(rsLegacyList.F1, " # ")` looks for the literal string space-octothorpe-space. I do appreciate the suggestion tho, and the " + 1 " for position is very welcome, I hadn't thought of that at all. I would have extracted a bunch of blanks (spaces) and would be wondering why nothing worked. – ErrorMeck Jan 16 '20 at 14:52