1

I have a string in a cell composed of several shorter strings of various lengths with blank spaces and commas in between. In some cases only one or more blanks are in between.

enter image description here

I want to remove every blank space and comma and only leave behind 1 comma between each string element. The result must look like this:

enter image description here

The following doesn't work. I'm not getting an error but the strings are truncated at the wrong places. I don't understand why.

Sub String_adaption()

Dim i, j, k, m As Long
Dim STR_A As String

STR_A = "01234567890ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

i = 1

With Worksheets("table")

  For m = 1 To Len(.Range("H" & i))
  
      j = 1
      
    Do While Mid(.Range("H" & i), m, 1) = "," And Mid(.Range("H" & i), m - 1, 1) <> Mid(STR_A, j, 1) And m <> Len(.Range("H" & i))

            .Range("H" & i) = Mid(.Range("H" & i), 1, m - 2) & Mid(.Range("H" & i), m, Len(.Range("H" & i)))

            j = j + 1
    Loop
             
  Next m

End With

End Sub
D3merzel
  • 63
  • 10
  • Are you getting an error or something? What is the question? – braX Jan 11 '23 at 08:51
  • I'm not getting an error but the strings are truncated at the wrong places. I don't understand why. – D3merzel Jan 11 '23 at 08:57
  • Use the VBA REPLACE function https://learn.microsoft.com/en-us/office/vba/language/reference/user-interface-help/replace-function – user10186832 Jan 11 '23 at 08:58
  • @user10186832 That's not going to work in this case. – D3merzel Jan 11 '23 at 08:59
  • Welcome D3merzel! Please take some time to read the introduction to Stack Overflow and earn your next badge. https://stackoverflow.com/tour – user10186832 Jan 11 '23 at 09:00
  • Please never post data or code as an image, thank you! – Tom Brunberg Jan 11 '23 at 09:21
  • Do the shorter strings have spaces or commas in them ? – CDP1802 Jan 11 '23 at 09:28
  • @CDP1802 Blanks are possible, commas not. – D3merzel Jan 11 '23 at 09:30
  • 1
    @D3merzel: When shorter Strings can have spaces: Does this mean `String 1` will be kept as one string or should it result in `String,1`? – FunThomas Jan 11 '23 at 10:05
  • 1
    @D3merzel - your answer to above comment would be of great help for all existing contributors: meanwhile a) CDP1802 's and FunThomas 'es answers result in: `STRING 22,STRING 1,STRING 333,STRING_22 STRING_44` **whereas** b) FunThomas' and Ron 's answers result in `STRING,22,STRING,1,STRING,333,STRING_22,STRING_44`. ` – T.M. Jan 11 '23 at 19:31

5 Answers5

3

I'd use a regular expression to replace any combination of spaces and comma's. Something along these lines:

Sub Test()

Dim str As String: str = "STRING_22   ,,,,,STRING_1 ,  ,  ,,,,,STRING_333   STRING_22 STRING_4444"
Debug.Print RegexReplace(str, "[\s,]+", ",")

End Sub

Function RegexReplace(x_in, pat, repl) As String

With CreateObject("vbscript.regexp")
    .Global = True
    .Pattern = pat
    RegexReplace = .Replace(x_in, repl)
End With

End Function

Just for the sake of alternatives:

enter image description here

Formula in B1:

=TEXTJOIN(",",,TEXTSPLIT(A1,{" ",","}))
JvdV
  • 70,606
  • 8
  • 39
  • 70
2

The following function will split the input string into pieces (words), using a comma as separator. When the input string has multiple commas, it will result in empty words.
After splitting, the function loops over all words, trims them (remove leading and trailing blanks) and glue them together. Empty words will be skipped.

I have implemented it as Function, you could use it as UDF: If your input string is in B2, write =String_adaption(B2) as Formula into any cell.

Function String_adaption(s As String) As String
    ' Remove duplicate Commas and Leading and Trailing Blanks from words
    Dim words() As String, i As Long
    words = Split(s, ",")
    For i = 0 To UBound(words)
        Dim word As String
        word = Trim(words(i))
        If word <> "" Then
            String_adaption = String_adaption & IIf(String_adaption = "", "", ",") & word
        End If
    Next i
End Function

P.S.: Almost sure that this could be done with some magic regular expressions, but I'm not an expert in that.

FunThomas
  • 23,043
  • 3
  • 18
  • 34
  • Just an headsup but you'r leaving the spaces between strings untouched. E.g: 'STRING_333 STRING_22'. – JvdV Jan 11 '23 at 09:53
  • True. Saw the comment of OP that words may contain spaces and therefore concluded that spaces don't split a string. However, having a closer look to the example in the question, I saw that it *does*. So I guess OP needs to clarify first what is really needed. – FunThomas Jan 11 '23 at 10:00
2

If you have recent Excel version, you can use simple worksheet function to split the string on space and on comma; then put it back together using the comma deliminater and ignoring the blanks (and I just noted @JvdV had previously posted the same formula solution):

=TEXTJOIN(",",TRUE,TEXTSPLIT(A1,{" ",","}))

In VBA, you can use a similar algorithm, using the ArrayList object to collect the non-blank results.

Option Explicit

Function commaOnly(s As String) As String
    Dim v, w, x, y
    Dim al As Object
    
Set al = CreateObject("System.Collections.ArrayList")

v = Split(s, " ")
For Each w In v
    x = Split(w, ",")
    For Each y In x
        If y <> "" Then al.Add y
    Next y
Next w

commaOnly = Join(al.toarray, ",")
    
End Function

Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60
1

This preserves the spaces within the smaller strings.

Option Explicit
Sub demo()
     Const s = "STRING 22,,,,   ,,STRING  1,,,,  ,,STRING  333 , , ,  STRING_22 STRING_44"
     Debug.Print Cleanup(s)
End Sub

Function Cleanup(s As String) As String

    Const SEP = ","
    Dim regex, m, sOut As String, i As Long, ar()
    Set regex = CreateObject("vbscript.regexp")
    With regex
        .Global = True
        .MultiLine = False
        .IgnoreCase = True
        .Pattern = "([^,]+)(?:[ ,]*)"
    End With
    
    If regex.Test(s) Then
    
        Set m = regex.Execute(s)
        ReDim ar(0 To m.Count - 1)
        For i = 0 To UBound(ar)
           ar(i) = Trim(m(i).submatches(0))
        Next
        
    End If
    Cleanup = Join(ar, SEP)

End Function
CDP1802
  • 13,871
  • 2
  • 7
  • 17
1

Code categories approach

For the sake of completeness and to show also other ways "leading to Rome", I want to demonstrate an approach allowing to group the string input into five code categories in order to extract alphanumerics by a tricky match (see [B] Function getCats()):

To meet the requirements in OP use the following steps:

  • 1) remove comma separated tokens if empty or only blanks (optional),
  • 2) group characters into code categories,
  • 3) check catCodes returning alpha nums including even accented or diacritic letters as well as characters like [ -,.+_]
Function AlphaNum(ByVal s As String, _
                  Optional IgnoreEmpty As Boolean = True, _
                  Optional info As Boolean = False) As String
'Site:  https://stackoverflow.com/questions/15723672/how-to-remove-all-non-alphanumeric-characters-from-a-string-except-period-and-sp/74679416#74679416
'Auth.: https://stackoverflow.com/users/6460297/t-m
'Date:  2023-01-12
'1) remove comma separated tokens if empty or only blanks (s passed as byRef argument)
    If IgnoreEmpty Then RemoveEmpty s                    ' << [A] RemoveEmpty
'2) group characters into code categories
    Dim catCodes: catCodes = getCats(s, info)            ' << [B] getCats()
'3) check catCodes and return alpha nums plus chars like [ -,.+_]
    Dim i As Long, ii As Long
    For i = 1 To UBound(catCodes)
        ' get current character
        Dim curr As String: curr = Mid$(s, i, 1)
        Dim okay As Boolean: okay = False
        Select Case catCodes(i)
        '   AlphaNum: cat.4=digits, cat.5=alpha letters
            Case Is >= 4: okay = True
        '   Category 2: allow only space, comma, minus
            Case 2: If InStr(" -,", curr) <> 0 Then okay = True
        '   Category 3: allow only point, plus, underline
            Case 3: If InStr(".+_", curr) <> 0 Then okay = True
        End Select
        If okay Then ii = ii + 1: catCodes(ii) = curr   ' increment counter
    Next i
    ReDim Preserve catCodes(1 To ii)
    AlphaNum = Join(catCodes, vbNullString)
End Function

Note: Instead of If InStr(" -,", curr) <> 0 Then in Case 2 you may code If curr like "[ -,]" Then, too. Similar in Case 3 :-)

[A] Helper procedure RemoveEmpty

Optional clean-up removing comma separated tokens if empty or containing only blanks:

Sub RemoveEmpty(ByRef s As String)
'Purp:  remove comma separated tokens if empty or only blanks
    Const DEL = "$DEL$"             ' temporary deletion marker
    Dim i As Long
    Dim tmp: tmp = Split(s, ",")
    For i = LBound(tmp) To UBound(tmp)
        tmp(i) = IIf(Len(Trim(tmp(i))) = 0, DEL, Trim(tmp(i)))
    Next i
    tmp = Filter(tmp, DEL, False)   ' remove marked elements
    s = Join(tmp, ",")
End Sub

[B] Helper function getCats()

A tricky way to groups characters into five code categories, thus building the basic logic for any further analyzing:

Function getCats(s, Optional info As Boolean = False)
'Purp.: group characters into five code categories
'Auth.: https://stackoverflow.com/users/6460297/t-m
'Site:  https://stackoverflow.com/questions/15723672/how-to-remove-all-non-alphanumeric-characters-from-a-string-except-period-and-sp/74679416#74679416
'Note:  Cat.: including:
'       1 ~~> apostrophe '
'       2 ~~> space, comma, minus  etc
'       3 ~~> point separ., plus   etc
'       4 ~~> digits 0..9
'       5 ~~> alpha (even including accented or diacritic letters!)
'~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'a) get array of single characters
    Const CATEG As String = "' - . 0 A"      'define group starters (case indep.)
    Dim arr:   arr = Char2Arr(s)             ' << [C] Char2Arr()
    Dim chars: chars = Split(CATEG)
'b) return codes per array element
    getCats = Application.Match(arr, chars)  'No 3rd zero-argument!!
'c) display in immediate window (optionally)
    If info Then Debug.Print Join(arr, "|") & vbNewLine & Join(getCats, "|")
End Function

[C] Helper function Char2Arr

Assigns every string character to an array:

Function Char2Arr(ByVal s As String)
'Purp.: assign single characters to array
    s = StrConv(s, vbUnicode)
    Char2Arr = Split(s, vbNullChar, Len(s) \ 2)
End Function
T.M.
  • 9,436
  • 3
  • 33
  • 57