
I'm looking for a regular expression to eliminate punctuation and special characters. Is there a way to get a regular expression to match \W but ignore whitespace?

I have a VBA script that's working just fine (I'm happy to share if necessary), but I think I'm just missing something simple. When I run the code, it definitely works, but strPattern = "\W" is grabbing spaces as well. I know I can use the \S option, but I can't seem to get it to perform an AND, so to speak.

Code:

Sub RegEx_Pattern_Removal()
    Dim strPattern As String: strPattern = "\W"
    Dim strReplace As String: strReplace = ""
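    ' Early binding: needs a reference to "Microsoft VBScript Regular Expressions 5.5"
    Dim regEx As New RegExp
    Dim strInput As String
    Dim myRange As Range
    Dim Cell As Range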

    'Application.ScreenUpdating = False

    Set myRange = Application.Selection

    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = strPattern
    End With

    For Each Cell In myRange
        If Cell.Value <> Empty Then
            strInput = Cell.Value
            If regEx.Test(strInput) = True Then
                Cell.Value = regEx.Replace(strInput, strReplace)
                MsgBox "It's a match, baby!"
            Else
                MsgBox "No Match! Or... you fucked up"
            End If
        End If
    Next

    'Application.ScreenUpdating = True

End Sub
dnlarralde
  • Please show us your code, include sample data... see [mcve]. – Mathieu Guindon Jun 04 '18 at 03:05
  • not quite `[^\w\s]` – Slai Jun 04 '18 at 03:12
  • @Slai - That's what I would think, but it's still removing the white spaces. I was under the impression that \W shouldn't be removing the white spaces anyway, but I'm new to regular expressions. – dnlarralde Jun 04 '18 at 03:15
  • [^A-Za-z0-9\s]* nearly works, but there is one line where a single space is removed. – dnlarralde Jun 04 '18 at 03:20
  • https://stackoverflow.com/questions/28617616/how-do-i-isolate-a-space-using-regexp-in-vba-s-vs-pzs – Slai Jun 04 '18 at 03:23
  • Please post (copy/paste) the exact text you have into the question body (no images) and add the expected output. – Wiktor Stribiżew Jun 04 '18 at 05:10
  • @WiktorStribiżew the initial question wasn't about code not working correctly so much as not being able to figure out the right pattern for the goal. I apologize if I didn't explain the goal well enough but it's resolved. Thanks anyways. – dnlarralde Jun 04 '18 at 05:16

1 Answer


Props to @Slai for the link.

This is one of the cleaner patterns to use:

strPattern = "[^\dA-Za-z \xa0]*"

The "mystery space" that persisted was in fact a non-breaking (hard) space, character code 160, which is written in hexadecimal as \xa0. That is why \xa0 appears at the end of the pattern.
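If a "space" keeps getting removed, one way to check what it really is would be to print the character code of every character in the cell. The following is only a rough sketch: ShowCharCodes is a made-up helper name, not part of the original script.

```vba
' Hypothetical diagnostic helper (name is an assumption, not from the original code).
' Prints position, decimal code and hex code of each character to the Immediate window.
Sub ShowCharCodes(ByVal s As String)
    Dim i As Long
    Dim code As Long
    For i = 1 To Len(s)
        code = AscW(Mid$(s, i, 1))
        Debug.Print i, code, "&H" & Hex$(code)
    Next i
End Sub
```

Calling ShowCharCodes ActiveCell.Value and looking at the offending position should show 32 (&H20) for an ordinary space and 160 (&HA0) for a non-breaking space.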

This pattern did exactly what I needed.
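For anyone who wants to reuse this outside the loop in the question, here is a minimal sketch of the same idea as a function. StripPunctuation is a name I made up, and it uses late binding (CreateObject) so no library reference is required. It also swaps the trailing * for +, which is my own change: the replacement result is the same, but * also matches the empty string, so regEx.Test would report a match for every input.

```vba
' Hypothetical wrapper (name is an assumption); keeps digits, ASCII letters,
' ordinary spaces and non-breaking spaces, and removes everything else.
Function StripPunctuation(ByVal s As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")   ' late-bound VBScript RegExp
    re.Global = True                            ' replace every match, not just the first
    re.Pattern = "[^\dA-Za-z \xa0]+"            ' + instead of * (see note above)
    StripPunctuation = re.Replace(s, "")
End Function
```

For example, StripPunctuation("Hello, world!") returns "Hello world". Note that accented letters and underscores are removed too, since they are not in the character class.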

dnlarralde