0

I am working in excel and need VBA code to extract 3 specific number patterns. In column A I have several rows of strings which include alphabetical characters, numbers, and punctuation. I need to remove all characters except those found in a 13-digit number (containing only numbers), a ten-digit number (containing only numbers), or a 9-digit number immediately followed by an "x" character. These are isbn numbers.

The remaining characters should be separated by one, and only one, space. So, for the following string found in A1: "There are several books here, including 0192145789 and 9781245687456. Also, the book with isbn 045789541x is included. This book is one of 100000000 copies."

The output should be: 0192145789 9781245687456 045789541x

Note that the number 100000000 should not be included in the output because it does not match any of the three patterns mentioned above.

I'm not opposed to a excel formula solution as opposed to VBA, but I assumed that VBA would be cleaner. Thanks in advance.

enstmagoo
  • 3
  • 4
  • Split the string using a space as the delimiter and then concatenate only the segments that start with a number. – Automate This Apr 17 '14 at 05:26
  • If you want regex, check out this [post](http://stackoverflow.com/q/22542834/2521004) which outlines several methods for using regex in VBA. – Automate This Apr 17 '14 at 05:27

1 Answers1

2

Here's a VBA function that will do specifically what you've specified

    Function ExtractNumbers(inputStr As String) As String

    Dim outputStr As String
    Dim bNumDetected As Boolean
    Dim numCount As Integer
    Dim numStart As Integer

    numCount = 0
    bNumDetected = False

    For i = 1 To Len(inputStr)
        If IsNumeric(Mid(inputStr, i, 1)) Then
            numCount = numCount + 1
            If Not bNumDetected Then
                bNumDetected = True
                bNumStart = i
            End If

            If (numCount = 9 And Mid(inputStr, i + 1, 1) = "x") Or _
                numCount = 13 And Not IsNumeric(Mid(inputStr, i + 1, 1)) Or _
                numCount = 10 And Not IsNumeric(Mid(inputStr, i + 1, 1)) Then

                    If numCount = 9 Then
                        outputStr = outputStr & Mid(inputStr, bNumStart, numCount) & "x "
                    Else
                        outputStr = outputStr & Mid(inputStr, bNumStart, numCount) & " "
                    End If

            End If
        Else
            numCount = 0
            bNumDetected = False
        End If

    Next i

    ExtractNumbers = Trim(outputStr)

End Function

It's nothing fancy, just uses string functions to goes through your string one character at a time looking for sections of 9 digit numbers ending with x, 10 digit numbers and 13 digit numbers and extracts them into a new string.

It's a UDF so you can use it as a formula in your workbook

Silenxor
  • 246
  • 2
  • 5