24

I need to remove all non alphanumeric characters from a string except period and space in Excel. A solution using VBA rather than pure excel functions be just fine.

Community
  • 1
  • 1
xyz
  • 2,253
  • 10
  • 46
  • 68
  • See this question, except for period/space part. Your pattern would be like [A-Z,a-z,0-9,\.,\s] http://stackoverflow.com/questions/10789948/remove-non-numeric-characters-from-excel – Alexander Mar 30 '13 at 21:38

5 Answers5

45

Insert this function into a new module in the Visual Basic Editor:

Function AlphaNumericOnly(strSource As String) As String
    Dim i As Integer
    Dim strResult As String

    For i = 1 To Len(strSource)
        Select Case Asc(Mid(strSource, i, 1))
            Case 48 To 57, 65 To 90, 97 To 122: 'include 32 if you want to include space
                strResult = strResult & Mid(strSource, i, 1)
        End Select
    Next
    AlphaNumericOnly = strResult
End Function

Now you can use this as a User Define Function, i.e. if your data is in cell A1, place this formula in an empty cell =AlphaNumericOnly(A1).

If you want to convert a large range directly, i.e. replace all the non-alphanumeric characters without leaving the source, you can do this with another VBA routine:

Sub CleanAll()
    Dim rng As Range

    For Each rng In Sheets("Sheet1").Range("A1:K1500").Cells 'adjust sheetname and range accordingly
        rng.Value = AlphaNumericOnly(rng.Value)
    Next
End Sub

Simply place this sub in the same module and execute it. Be aware though, that this will replace any formulas in the range.

Peter Albert
  • 16,917
  • 5
  • 64
  • 88
  • Albert, that was fast, thank you. I will try it tomorrow. As I said I know little about excel could you explain how to use the function I need to run it across my entire excel sheet. This will usually have 11 columns and about 1500 rows but this can very. Thank you this is much appreciated – xyz Mar 30 '13 at 21:43
  • 5
    For anyone who wants to add or subtract Characters from the above to tool it to there needs here is a reference sheet: http://office.microsoft.com/en-us/excel-help/ascii-character-chart-HA001133136.aspx – xyz Mar 31 '13 at 14:58
  • 2
    +1 Good point! Alternatively, I simply used `? Asc("A")` in the debug console to get the ASCII code... – Peter Albert Mar 31 '13 at 21:37
  • 3
    The link Tim provided is no longer functional. Here's a new one: [link](http://www.asciitable.com/) – Josh Fierro Oct 05 '15 at 17:39
14

I was looking for a more elegant solution than the one I came up with. I was going to use ashleedawg's code above as it certainly is neater than my code. Ironically, mine ran 30% quicker. If speed is important (say you have a few million to do), try this:

Public Function AlphaNumeric(str As String) As String
    Dim i As Long
    
    For i = 1 To Len(str)
        If InStr(1, "01234567890ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz. ", Mid(str, i, 1)) Then AlphaNumeric = AlphaNumeric & Mid(str, i, 1)
    Next
End Function

There's a surprise around every corner with VBA. I'd never imagine this would be quicker...

GWD
  • 3,081
  • 14
  • 30
Nigel Foster
  • 193
  • 1
  • 8
11

Here' an alternate method of removing "whatever characters you want" from a string using pattern matching.

  • The example below removes everything except letters, numbers, spaces and periods ([A-Z.a-z 0-9])

  • For improved efficiency it also utilizes VBA's seamless conversion between Strings and Byte Arrays:

cleanString Function:

Function cleanString(str As String) As String
    Dim ch, bytes() As Byte: bytes = str
    For Each ch In bytes
        If Chr(ch) Like "[A-Z.a-z 0-9]" Then cleanString = cleanString & Chr(ch)
    Next ch
End Function

More Information:

ashleedawg
  • 20,365
  • 9
  • 72
  • 105
  • 2
    Nice ! This is the most elegant and probably the fastest one. If speed is an issue, I would try `For i =...` vs `For each...` – iDevlop Jan 04 '19 at 13:11
  • Would using a for loop vs foreach really be reasonably different in performance? – Craig_VG Dec 22 '21 at 20:18
  • Fyi Though a late response to your valuable post+:), you might be interested in my alternate function [`AlphaNum()`](https://stackoverflow.com/questions/15723672/how-to-remove-all-non-alphanumeric-characters-from-a-string-except-period-and-sp/74679416#74679416) *grouping* characters into categories which allows to include even ►accented and ►diacritic characters by a string search shift. @ashleedawg – T.M. Dec 05 '22 at 19:19
2

I have written the following code and it works as far as I tested it, it consists of two functions. The first checks whether a string is alphanumeric and the second makes the replacement (it also removes spaces)

Public Function Isalphanumeric(cadena As String) As Boolean

    Select Case Asc(UCase(cadena))
        Case 65 To 90 'letras
            Isalphanumeric = True
        Case 48 To 57 'numeros
            Isalphanumeric = True
        Case Else
            Isalphanumeric = False

    End Select

End Function

And here goes the remove function

Function RemoveSymbols_Enhanced(InputString As String) As String

 Dim InputString As String
 Dim CharactersArray()
 Dim i, arrayindex, longitud As Integer
 Dim item As Variant


 i = 1
 arrayindex = 0
 longitud = Len(InputString)

'We create an array with non alphanumeric characters
 For i = 1 To longitud

  If Isalphanumeric(Mid(InputString, i, 1)) = False Then
    ReDim Preserve CharactersArray(arrayindex)
    CharactersArray(arrayindex) = Mid(InputString, i, 1)
    arrayindex = arrayindex + 1

  End If

  Next

 'For each non alphanumeric character we do a replace
 For Each item In CharactersArray
  item = CStr(item)
  InputString = Replace(InputString, item, "")
 Next


End Function
Andoni
  • 89
  • 10
1

Get alphanumeric characters including spaces, +- signs and (point)commata

The tricky helper function getCodes() called by AlphaNum() groups each character into five categories, where

  • category 4 corresponds to digits,
  • 5 to any letters, not only from A-Z, but including even ►accented or ►diacritic ones.

By looping to the returned codes array you are in the position to get only the relevant alphanum characters or allowed signs.

Function AlphaNum(ByVal s As String, Optional info As Boolean = False) As String
'a) group characters into code categories
    Dim codes: codes = getCodes(s, info)
'b) check codes returning only alpha nums
    Dim i As Long, ii As Long
    For i = 1 To UBound(codes)
        Dim char As String: char = Mid$(s, i, 1)
        Dim okay As Boolean: okay = False
        Select Case codes(i)
        '   AlphaNum: 4=digits, 5=letters
            Case Is >= 4: okay = True
        '   other characters
            Case 2          ' allowing space, minus or comma
                If InStr(" ,-", char) <> 0 Then okay = True
            Case 3          ' allowing plus or point
                If InStr(".+", char) <> 0 Then okay = True
        End Select
        If okay Then ii = ii + 1: codes(ii) = char
    Next i
    ReDim Preserve codes(1 To ii)
    AlphaNum = Join(codes, vbNullString)
End Function

Helper functions

Function Char2Arr(ByVal s As String)
'Purp.: assign single characters to array
    s = StrConv(s, vbUnicode)
    Char2Arr = Split(s, vbNullChar, Len(s) \ 2)
End Function
Function getCodes(s, Optional info As Boolean = False)
'Purp.: group characters into five categories
    Const CATEG As String = "' - . 0 A"
    Dim arr:   arr = Char2Arr(s)
    Dim chars: chars = Split(CATEG)
    getCodes = Application.Match(arr, chars)  'No 3rd zero-argument!!
    'optional display in immediate window
    If info Then
        Debug.Print Join(arr, "|")
        Debug.Print Join(getCodes, "|")
    End If
End Function

Example call

    Dim s As String
    s = "Alpha, -8.9 +äæçñöüéêëÿ'!$""#$%&()*/:;<=>?@|¶"

    Debug.Print "~~> " & AlphaNum(s, info:=True)    

Display in VB editor's immediate window

A|l|p|h|a|,| |-|8|.|9| |+|ä|æ|ç|ñ|ö|ü|é|ê|ë|ÿ|'|!|$|"|#|$|%|&|(|)|*|/|:|;|<|=|>|?|@|||¶ 
5|5|5|5|5|2|2|2|4|3|4|2|3|5|5|5|5|5|5|5|5|5|5|1|2|2|2|2|2|2|2|2|2|2|3|3|3|3|3|3|3|3|3|3
Alpha, -8.9 +äæçñöüéêëÿ

T.M.
  • 9,436
  • 3
  • 33
  • 57