0

How can I build an array if I have multiple delimiters, where some delimiters are single character and others are multiple characters?

Sub Example()
    Dim exString As String
    Dim myString() As String

    exString = "A,B C;D > E"

    myString() = Split(exString, "," & " " & ";" & " > ")
End Sub

The result I want in my array is:

myString(0) is A
myString(1) is B
myString(2) is C
myString(3) is D
myString(4) is E

But using Split() in this way doesn't work. I do know that I can use Replace() to replace every single delimiter with a common one, but I have a lot of different delimiters and variations of multiple character delimiters. Using Replace() isn't desirable to me. What can I do?

Zev Spitz
  • 13,950
  • 6
  • 64
  • 136
Tyeler
  • 1,088
  • 1
  • 12
  • 26
  • You are stuck using replace(), or if you really don't want that then you can parse character-by-character and that's going to be even *more* work. Or just wrap this up in a function and move on. If you have mixtures of multi- and single- character delimiters make sure to Replace the longer strings first. – Tim Williams Oct 05 '16 at 20:35
  • @TimWilliams I did just that. I made this to answer someone else's question and it worked so well, and I couldn't find anything as simple anywhere else, so I posted this as a simple "answer my own question" – Tyeler Oct 05 '16 at 20:37
  • Aha! - i hadn't twigged it was your answer! – Tim Williams Oct 05 '16 at 20:42
  • @ThomasInzina I think I see what you're talking about. As far as the multiple delimiter portion goes, I just thought of an example that would break the code. For that, I take back my snarkyness. – Tyeler Oct 06 '16 at 01:02
  • I get your point about answering your own question. Personally, I think that code review might be a better forum for this. –  Oct 06 '16 at 01:12
  • @ThomasInzina I didn't understand what you had meant that it couldn't do multi-character delimiters until I tried using it in a different way (i tried using it on code and delimiting the word "For" and what do you know it split the code up with "f", "o", and "r" as delimiters..). I've updated the function to include an actual multi-character delimiter option. – Tyeler Oct 06 '16 at 12:49

5 Answers5

2

You can have lots of problems in VBA as well:

'Add a reference to Microsoft VBScript Regular Expressions 5.5 (Tools -> References...)

Dim exString As String
exString = "A,B C;D > E"

Dim re As New RegExp
re.Pattern = "(,| |;|>)+"
re.Global = True

Dim myString() As String
myString = Split(re.Replace("A,B C;D > E", ","), ",")

Setting re.Pattern defines what to look for. | represents finding A or B, so the regular expression will match on , or or ; or >.

Multiple instances should be treated as one (e.g. between the D and the E there are three characters, but there should be only one split), so add a + at the end (and wrap everything else in ()).

Replace then replaces any of the matched patterns with , and gives back a string like this:

A,B,C,D,E

on which we can simply call Split to get back the array.


Instead of using regular expressions to match the delimiter characters, you could use regexes to match the non-delimiter characters:

Dim re As New RegExp
re.Pattern = "[^, ;>]+"   'The ^ unmatches any characters within the []
re.Global = True

Dim match As Match
For Each match In re.Execute(exString)
    'do something with each result here
    Debug.Print match.Value
Next

This is sufficient if all you need is to iterate over the results and do something with them. If you specifically need an array with the results:

Dim re As New RegExp
re.Pattern = "[^, ;>]+"
re.Global = True

Dim matches As MatchCollection
Set matches = re.Execute(exString)
ReDim myString(matches.Count) As String
Dim i As Integer
For i = 0 To matches.Count - 1
    myString(i) = matches(i).Value
Next
Community
  • 1
  • 1
Zev Spitz
  • 13,950
  • 6
  • 64
  • 136
  • You could use a greedy negated character class of delimiters for the pattern; then create a match collection which you can dump into an array – Ron Rosenfeld Oct 05 '16 at 21:57
  • @RonRosenfeld It's a little more complicated, because the VBScript regex object model doesn't allow you to assign the matches collection into an array; it involves iterating over the matches. Nevertheless, I've updated accordingly. – Zev Spitz Oct 05 '16 at 22:13
  • @ZevSpitz This is a really cool way of doing. I admit I really don't know anything about regular expressions. I've never used them before and nothings directed me to go this route. I'm definitely going to look into it, thanks for the great answer! – Tyeler Oct 06 '16 at 00:56
  • @Tyeler Updated to show how to get an array from a `MatchCollection`. – Zev Spitz Oct 06 '16 at 16:35
1

In this situation, I found the following function to be perfect for my needs:

Function MultiSplit(SourceText As String, Optional SingleCharDelimiter As String, Optional MultiCharDelimiter As String, _
    Optional Separator As String) As String()
'Created by Tyeler for use by all.
'SourceText is your input string.
'SingleCharDelimiter is a string of desired delimiters.
'SingleCharDelimiter format is a string fully concatenated with no character separation.
'  (ex. "-.;:, " MultiSplit will use those 6 characters as delimiters)
'SingleCharDelimiter's will remove blanks from the array in the event two single delimiters
'  are next to each other.
'MultiCharDelimiter is a string of specific multi-character delimiters.
'MultiCharDelimiters can be separated by the optional Separator
'Separator is an optional value used to separate multiple MultiCharDelimiters.
'  (ex. MultiCharDelimiter = "A A,B B,C C" // Separator = "," // This will make the function
'    delimit a string by "A A", "B B", and "C C")
'MultiSplit will make an array based on any delimiter (Including delimiters with
'  multiple characters).


    If MultiCharDelimiter = "" And SingleCharDelimiter = "" Then Exit Function
    Dim i As Integer, n As Integer, dlimit
    Dim delColl As New Collection
    Dim newString As String: newString = SourceText
    Dim delArr() As String, strgArr() As String, delFull() As String
    Dim delSep As String, a As Integer: a = 33

    Do While InStr(SingleCharDelimiter, Chr(a)) <> 0 Or InStr(MultiCharDelimiter, Chr(a)) <> 0 _
        Or InStr(Separator, Chr(a)) <> 0 Or InStr(SourceString, Chr(a)) <> 0 'Find intermediate delimiter
            a = a + 1
    Loop
    delSep = Chr(a)

    If MultiCharDelimiter <> "" Then
        If Separator <> "" Then 'If there's no delimiter for the delimiter array, assume MultiCharDelimiter is the delimiter
            delArr() = Split(MultiCharDelimiter, Separator)
            For i = 0 To UBound(delArr)
                If InStr(newString, delArr(i)) <> 0 Then newString = Replace(newString, delArr(i), delSep)
            Next i
        Else
            newString = Replace(newString, MultiCharDelimiter, delSep)
        End If
    End If
    Erase delArr

    For i = 1 To Len(SingleCharDelimiter) 'Build a collection of user defined delimiters
        delColl.Add Mid(SingleCharDelimiter, i, 1)
    Next i

    For Each dlimit In delColl 'Replace all delimiters in the string with a single common one
        newString = Replace(newString, dlimit, delSep)
    Next dlimit

    strgArr() = Split(newString, delSep)
    ReDim delFull(LBound(strgArr) To UBound(strgArr))
    n = LBound(strgArr)

    For i = LBound(strgArr) To UBound(strgArr) 'Get rid of empty array items
        If strgArr(i) <> "" Then
            delFull(n) = strgArr(i)
            n = n + 1
        End If
    Next i

    n = n - 1
    ReDim Preserve delFull(LBound(strgArr) To n)
    MultiSplit = delFull 'Send the delimited array
    Erase delFull
    Erase strgArr
End Function

This function will return an array of values that were separated by user defined delimiters.

To use this function, simply call on it and supply your full string and desired delimiters:

Sub Example1()
    Dim exString As String
    Dim myString() As String
    Dim c, n

    exString = "A,B C;D > E"

    myString() = MultiSplit(exString, ", ;", " > ")
    n = 0
    For Each c In myString
        Debug.Print "(" & n & ") = " & c
        n = n + 1
    Next c
End Sub

This will yield the desired result where the array is filled with only ABCDE.

enter image description here

A more complicated example:

Sub Example2()
    Dim myString As String, c, n

    n = 0
    myString = "The,Quickupside-downBrownjelloFox_Jumped[Over]             ThegiantLazyjelloDog"

    For Each c In MultiSplit(myString, ",_[] ", "upside-down,jello,giant", ",")
        Debug.Print "(" & n & ") = " & c
        n = n + 1
    Next c
End Sub

This will yield the following:

enter image description here

Tyeler
  • 1,088
  • 1
  • 12
  • 26
  • This is simply a replace of all the delimiters for a common one. Of course its done very nicely but still as per the OP requirements isn't a viable solution. Though turning the problem in my head I don't see any other viable way of doing it. – nbayly Oct 05 '16 at 20:32
  • Well, look at what the last `For Loop` does. Essentially there's no difference between having multiple characters as a delimiter, and single characters. The problem people would run into (errors being thrown around) was because their array would be filled with blanks. So I got rid of the blanks with the function, essentially turning multiple character delimiters into single character delimiters, leaving only the desired values left over to fill the array. :D – Tyeler Oct 05 '16 at 20:35
  • Since you've done this as an "answer my own question" post, you could also accept the answer. – Victor Moraes Oct 05 '16 at 20:40
  • @nbayly What about [regular expressions](http://stackoverflow.com/a/39884122/111794)? – Zev Spitz Oct 05 '16 at 22:21
  • @nbayly I didn't understand what you had meant that it couldn't do multi-character delimiters until I tried using it in a different way (i tried using it on code and delimiting the word "For" and what do you know it split the code up with "f", "o", and "r" as delimiters..). I've updated the function to include an actual multi-character delimiter option. – Tyeler Oct 06 '16 at 12:49
1

You were on the right track with your function. Using a ParamArray you can easily change the number and position of your delimiters.

Code

Function MultiSplit(SourceText As String, ParamArray Delimiters()) As String()
    Dim v As Variant

    For Each v In Delimiters
        SourceText = Replace(SourceText, v, "•")
    Next

    MultiSplit = Split(SourceText, "•")

End Function

Test

Sub Test()
    Const example As String = "A,B C;D > E"
    Dim a1, a2, a3, Incorrect

    Incorrect = MultiSplit(example, " ", " > ")
    a1 = MultiSplit(example, " > ", ",", ";", " ")
    a2 = MultiSplit(example, " > ", ",")
    a3 = MultiSplit(example, " > ")
End Sub

Result

enter image description here

NOTE: When using multi-character delimiters, the order that the delimiters are processed matters. Notice that A1 is split proper but Incorrect is not split as intended because the space delimiter came before " > ".

  • I never knew something like `ParamArray` existed. I searched everywhere for a function that could do something similar to what you have here and couldn't find anything remotely close to that. +1 for simplicity. – Tyeler Oct 06 '16 at 13:20
  • Thanks Note: you can't use optional parameters with ParamArray and you ParamArray is limited to 29 parameters in xl 2003 or older and 60 parameters in 2007 or newer. –  Oct 06 '16 at 13:24
  • I'm checking it out right now. I thought I was sharing something cool but this has become more of a learning experience for me than providing anything useful. I did fix the function in my answer to actually answer my own question. It's a lot more coding than this though, and I'm worried it isn't as user-friendly or intuitive as I thought it was. Thanks for your answer!! – Tyeler Oct 06 '16 at 13:34
  • @Tyeler thanks for the edit. It was good work, I will gladly upvote it if you post it as another answer. I rejected it because I wanted to keep my answer simple. I am still on the fence on whether the sort is the right thing to do (e.g. " >" or "> " could make a difference). I think that the user needs to ensure that the order of operations doesn't matter when he chooses them. –  Oct 06 '16 at 15:06
  • @Tyeler Often times it is necessary to have empty elements. Take for instance a CVS file. It is understood that every line in the file has same number of columns. If you were to remove the empty elements when you parsed the file then the columns would not line up properly. –  Oct 06 '16 at 15:06
  • That's true. It's difficult to write this in an all inclusive way. I think a function like this is something excel needs, but I can think of instances in which having those blanks in the arrays would cause errors, but then there's times like you just said, where those blanks are desirable. Maybe a boolean? – Tyeler Oct 06 '16 at 15:10
  • I noticed at the end of your routines that you `Erase` your arrays. It really isn't necessary. The variables within a routine share the same scope as the routine. When the routine exits the VBA garbage collector cleans them all up. You would want to Erase large arrays, if there was more code to execute before the procedure ended. –  Oct 06 '16 at 15:27
  • True! I don't know why I had it in my head that the arrays would stay built until the next use of the function.. I don't know why I keep thinking arrays are static.... I added the answer like you suggested with the boolean statement included. – Tyeler Oct 06 '16 at 15:45
1

The following is a built-upon version of the code that Thomas Inzina graciously provided.

The following limitations have been removed:

  • The order that the delimiters are listed in the function.
  • The temporary delimiter being a set specific character.
  • The option to include or remove empty array items.
  • The function changing the reference (ByRef vs ByVal)
  • Passing an array of delimiters vs listing individual delimiters
Function MultiSplitX(ByVal SourceText As String, RemoveBlankItems As Boolean, ParamArray Delimiters()) As String()
    Dim a As Integer, b As Integer, n As Integer
    Dim i As Integer: i = 251
    Dim u As Variant, v As Variant
    Dim tempArr() As String, finalArr() As String, fDelimiters() As String

    If InStr(TypeName(Delimiters(0)), "()") <> 0 And LBound(Delimiters) = UBound(Delimiters) Then
        ReDim fDelimiters(LBound(Delimiters(0)) To UBound(Delimiters(0))) 'If passing array vs array items then
        For a = LBound(Delimiters(0)) To UBound(Delimiters(0))            'build that array
            fDelimiters(a) = Delimiters(0)(a)
        Next a
    Else
        fDelimiters = Delimiters(0)
    End If

    Do While InStr(SourceText, Chr(i)) <> 0 And i < 251 'Find an unused character
        i = i + 1
    Loop
    If i = 251 Then 'If no unused character in SourceText, use single character delimiter from supplied
        For a = LBound(fDelimiters) To UBound(fDelimiters)
            If Len(fDelimiters(a)) = 1 Then i = Asc(fDelimiters(a))
        Next a
    End If
    If i = 251 Then 'If no single character delimiters can be used, error.
        MsgBox "SourceText uses all character type." & vbCrLf & "Cannot split SourceText into an array.", _
            vbCritical, "MultiSplitX Run-Time Error"
        Exit Function
    End If
    Debug.Print i


    For a = LBound(fDelimiters) To UBound(fDelimiters) 'Sort Delimiters by length
        For b = a + 1 To UBound(fDelimiters)
            If Len(fDelimiters(a)) < Len(fDelimiters(b)) Then
                u = fDelimiters(b)
                fDelimiters(b) = fDelimiters(a)
                fDelimiters(a) = u
            End If
        Next b
    Next a

    For Each v In fDelimiters 'Replace Delimiters with a common character
        SourceText = Replace(SourceText, v, Chr(i))
    Next

    tempArr() = Split(SourceText, Chr(i)) 'Remove empty array items
    If RemoveBlankItems = True Then
        ReDim finalArr(LBound(tempArr) To UBound(tempArr))
        n = LBound(tempArr)
        For i = LBound(tempArr) To UBound(tempArr)
            If tempArr(i) <> "" Then
                finalArr(n) = tempArr(i)
                n = n + 1
            End If
        Next i
        n = n - 1
        ReDim Preserve finalArr(LBound(tempArr) To n)

        MultiSplitX = finalArr
    Else: MultiSplitX = tempArr
    End If
End Function

Use of this function doesn't change from how Thomas had it, with the exception that there's an added boolean statement.


Example 1

In this example, RemoveBlankItems has been set to True.

Sub Example1()
    Dim myString As String, c, n

    n = 0
    myString = "The,Quickupside-downBrownjelloFox_Jumped[Over]             ThegiantLazyjelloDog"

    For Each c In MultiSplitX(myString, True, ",", "-", "upside-down", "jello", " ", "[", "]", "giant", "_")
        Debug.Print "(" & n & ") = " & c
        n = n + 1
    Next c
End Sub

This results in the following output:

enter image description here


Example 2

In this example we have RemoveBlankItems set to False.

Sub Example2()
    Dim myString As String, c, n

    n = 0
    myString = "The,Quickupside-downBrownjelloFox_Jumped[Over]             ThegiantLazyjelloDog"

    For Each c In MultiSplitX(myString, True, ",", "-", "upside-down", "jello", " ", "[", "]", "giant", "_")
        Debug.Print "(" & n & ") = " & c
        n = n + 1
    Next c
    Debug.Print myString
End Sub

This results in the following output:

enter image description here


Example 3

In this example, instead of listing our delimiters in the function, we have them typed out in a string and insert an array in the function instead:

Sub Example3()
    Dim myString As String, c, n
    Dim myDelimiters As String

    n = 0
    myString = "The,Quickupside-downBrownjelloFox_Jumped[Over]             ThegiantLazyjelloDog"
    myDelimiters = ",|-|upside-down|jello| |[|]|giant|_"

    For Each c In MultiSplitX(myString, True, Split(myDelimiters, "|"))
        Debug.Print "(" & n & ") = " & c
        n = n + 1
    Next c
    Debug.Print myString
End Sub

This has the same result as if they had been listed individually:

enter image description here


The Reason RemoveBlankItems Is Desirable

There are some instances in which you DON'T want to have blanks in your array. An example of this would be if you're using your array as a bank of search words that are cycling through a range on a spread sheet. Another example would be if you're manipulating strings of text based on values in the array.

There are also times when you would want to retain the blanks in the array. As Thomas described, in the event you're using this on a CSV file, where maintaining the spaces as columns is desired. Or you're using it to break apart, for example, HTML coding and wish to retain the line format.

Tyeler
  • 1,088
  • 1
  • 12
  • 26
  • 1
    Looks good. Have you been to [Code Review](http://codereview.stackexchange.com/)? I think that you'd like that site a lot. –  Oct 06 '16 at 18:41
  • I haven't. This is my first post where I shared a question-answer because I used the first function I had put up to help someone else with their question and it worked so perfect for that scenario, and was unique to anything I could find on web, that I wanted to share it and hopefully help others who were in the same bind. I am going to see what CodeReview has to offer, because I've gained way more than I thought I would out of this experience. Thank you :) – Tyeler Oct 07 '16 at 01:19
0

Perhaps:

Sub Example()

    Dim exString As String
    Dim myString() As String

    exString = "A,B C;D > E"
    exString = Replace(exString, ",", " ")
    exString = Replace(exString, ";", " ")
    exString = Replace(exString, ">", " ")
    exString = Application.WorksheetFunction.Trim(exString)

    myString() = Split(exString, " ")

    msg = ""
    For Each a In myString
        msg = msg & vbCrLf & a
    Next a

    MsgBox msg
End Sub

enter image description here

Gary's Student
  • 95,722
  • 10
  • 59
  • 99
  • This was the approach I had initially tried, but I felt the code just looked repetitive, which to me has always meant I can loop it. I was also running into issues where multiple characters being delimited in some cases, and not in others was throwing errors where I had it applied. – Tyeler Oct 06 '16 at 00:53