
Let me say, I hate working with strings! I'm trying to find a way to split a string on brackets. For example, the string is:

Hello (this is) me!

And, from this string, get an array containing Hello and me. I would like to do this with parentheses and braces (not with square brackets). Please note that the string varies, so a fixed-position approach like SubString wouldn't work.

Thanks in advance,

FWhite

PWhite
  • what you want is a parser for any sort of general solution. the code will have to contend with mismatched sets, repeat spaces and embedded parens/brace sets (`Hello, (this {is}) me!`). – Ňɏssa Pøngjǣrdenlarp Nov 13 '14 at 13:47

3 Answers


Try this code:

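    Dim var As String = "Hello ( me!"
    Dim arr() As String = var.Split("("c)   'split on the opening parenthesis

    'Each piece keeps the space next to the parenthesis, so trim before displaying
    MsgBox(arr(0).Trim())      'Displays Hello
    MsgBox(arr(1).Trim())      'Displays me!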
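
The same `Split` idea can be stretched to cover closing characters and braces as well. The following is only a sketch, assuming the input is well formed and the groups are not nested: splitting on all four bracket characters makes the pieces alternate between text outside a group (even indexes) and text inside one (odd indexes). The module and variable names are made up for illustration.

```vb
Imports System
Imports System.Collections.Generic

Module SplitSketch

    Sub Main()
        Dim input As String = "Hello (this is) me!"

        ' Split on every bracket character. For well-formed, non-nested input
        ' the pieces alternate: outside, inside, outside, ...
        Dim pieces() As String = input.Split("("c, ")"c, "{"c, "}"c)

        ' Keep only the even-indexed (outside) pieces, trimmed and non-empty.
        Dim outside As New List(Of String)
        For i As Integer = 0 To pieces.Length - 1 Step 2
            Dim piece As String = pieces(i).Trim()
            If piece.Length > 0 Then outside.Add(piece)
        Next

        Console.WriteLine(String.Join("|", outside))   ' prints Hello|me!
    End Sub

End Module
```

This breaks down as soon as groups are nested or a bracket is missing, because the outside/inside alternation no longer holds.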
Dotnetter

Something like this should work for you:

    Dim x As String = "Hello (this is) me"
    'Text before the opening parenthesis
    Dim firstString As String = x.Substring(0, x.IndexOf("("))
    'Text after the closing parenthesis
    Dim secondString As String = x.Substring(x.IndexOf(")") + 1)
    Dim finalString = firstString & secondString

    x = "Hello (this is) me"
    firstString = "Hello "
    secondString = " me"
    finalString = "Hello  me"
macoms01
  • actually `finalString` will be `"Hello__me"` - with 2 spaces; it may not matter to the OP, since they just care about an array of split up parts though – Ňɏssa Pøngjǣrdenlarp Nov 13 '14 at 13:49
  • Good point - I actually put two spaces but SO removed one of them I guess... Just updated the answer to show the two spaces. It could easily be fixed by adjusting the starting index or length of the string to retrieve. – macoms01 Nov 13 '14 at 13:50
  • Hi, thank you for your solution. The problem is that I don't always have the same number of parentheses: I can have also `Hello (this) is (another) example`. _FWhite_ – PWhite Nov 13 '14 at 13:51
  • You could throw this in a loop that checks for the index of an opening parenthesis. One second and I'll post an updated answer. – macoms01 Nov 13 '14 at 13:52
  • *you need a parser* not a simple split routine since you are skipping material based on a detected start char until a matching stop char is found. @PWhite – Ňɏssa Pøngjǣrdenlarp Nov 13 '14 at 13:58
  • It is possible to use regex but you should explain better the cases for exclusion. For `Hello (this) is (another) example` you want `Hello is example`; what about `Hello (this) {is} (another) example`? Or do you have just parentheses and not curly braces to account for exclusion? – Steve Nov 13 '14 at 14:07
  • Hi Steve, thanks for your help. I want to exclude the text inside **all** parentheses. `Hello (this) {is} (another) example` should return as an array containing `Hello` and `example`. Thanks _FWhite_ – PWhite Nov 13 '14 at 14:18
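
The loop and parser ideas raised in the comments above could look roughly like this. It is a minimal sketch, not the updated answer that was promised: it walks the string once and keeps a depth counter, so nested groups such as `Hello (this {is}) me!` are skipped as a whole. `SplitOutsideBrackets` and `FlushPart` are hypothetical helper names, and the sketch assumes that a stray closing character should simply be ignored and that `)` and `}` are interchangeable as closers.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text

Module BracketSplitSketch

    ' Hypothetical helper: returns the trimmed pieces of text that lie outside
    ' any (...) or {...} group. Nesting is tracked with a depth counter.
    Function SplitOutsideBrackets(input As String) As String()
        Dim parts As New List(Of String)
        Dim current As New StringBuilder()
        Dim depth As Integer = 0

        For Each ch As Char In input
            If ch = "("c OrElse ch = "{"c Then
                ' Entering a top-level group: store the text collected so far.
                If depth = 0 Then FlushPart(current, parts)
                depth += 1
            ElseIf ch = ")"c OrElse ch = "}"c Then
                ' Leaving a group; a stray closer at depth 0 is ignored.
                If depth > 0 Then depth -= 1
            ElseIf depth = 0 Then
                ' Ordinary character outside every group.
                current.Append(ch)
            End If
        Next

        FlushPart(current, parts)
        Return parts.ToArray()
    End Function

    ' Adds the trimmed buffer to the list if it is not empty, then resets it.
    Private Sub FlushPart(current As StringBuilder, parts As List(Of String))
        Dim text As String = current.ToString().Trim()
        If text.Length > 0 Then parts.Add(text)
        current.Clear()
    End Sub

    Sub Main()
        ' Prints Hello|me!
        Console.WriteLine(String.Join("|", SplitOutsideBrackets("Hello (this {is}) me!")))
        ' Prints Hello|example
        Console.WriteLine(String.Join("|", SplitOutsideBrackets("Hello (this) {is} (another) example")))
    End Sub

End Module
```

If an opening character is never closed, everything after it is dropped. Whether that is the right behaviour for mismatched input is a decision the sketch does not try to make.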

You can use regular expressions (Regex). The code below excludes the text inside all parentheses and braces, and also removes an exclamation mark. Feel free to expand the CleanUp method to filter out other punctuation symbols:

Imports System.Text.RegularExpressions

Module Module1

  Sub Main()
    Dim re As New Regex("\(.*\)|{.*}") 'anything inside parenthesis OR braces
    Dim input As String = "Hello (this is) me and {that is} him!"
    Dim inputParsed As String = re.Replace(input, String.Empty)

    Dim reSplit As New Regex("\b") 'split by word boundary
    Dim output() As String = CleanUp(reSplit.Split(inputParsed))
    'output = {"Hello", "me", "and", "him"}
  End Sub

  Private Function CleanUp(output As String()) As String()
    Dim outputFiltered As New List(Of String)
    For Each v As String In output
      If String.IsNullOrWhiteSpace(v) Then Continue For 'remove spaces
      If v = "!" Then Continue For 'remove punctuation, feel free to expand
      outputFiltered.Add(v)
    Next
    Return outputFiltered.ToArray
  End Function

End Module

To explain the regular expression I used (\(.*\)|{.*}):

  1. \( is just a literal (. The parenthesis is a special symbol in Regex, so it needs to be escaped with a \.
  2. .* means anything, i.e. any sequence of characters (by default, everything except a newline).
  3. | is a logical OR, so the expression will match either the left or the right side of it.
  4. { does not need escaping here, so it just goes in as is.

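Overall, you can read this as "find anything inside parentheses or braces"; the code then replaces each match with an empty string, i.e. removes all occurrences. One concept worth understanding here is greedy vs lazy matching. Greedy (the default) works for this particular input because each kind of bracket appears only once. If the same kind appears several times, as in Hello (this) is (another) example, a greedy .* matches from the first opening character to the last closing one and removes is as well. A lazy .*? stops at the nearest closing character instead.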
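
To make the greedy vs lazy difference concrete, here is a small sketch of a lazy variant. It is an assumption about how the answer could be adapted, not part of the original code: the pattern uses `.*?`, and `Regex.Split` is used so that the text between groups comes back directly as an array. The module name is made up, and nested groups of the same kind are still not handled.

```vb
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module LazyRegexSketch

    Sub Main()
        ' .*? is lazy: each match stops at the nearest closing character.
        Dim groups As New Regex("\(.*?\)|\{.*?\}")
        Dim input As String = "Hello (this) is (another) example"

        ' Split on the bracketed groups, trim each piece, drop empty ones.
        Dim parts() As String = groups.Split(input).
            Select(Function(s) s.Trim()).
            Where(Function(s) s.Length > 0).
            ToArray()

        Console.WriteLine(String.Join("|", parts))   ' prints Hello|is|example
    End Sub

End Module
```

With the greedy pattern from the answer, the same input would yield only Hello and example, because the first match would run from the first ( to the last ).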


Victor Zakharov