41

I'm currently trying to split a string in C# (latest .NET and Visual Studio 2008), in order to retrieve everything that's inside square brackets and discard the remaining text.

E.g.: "H1-receptor antagonist [HSA:3269] [PATH:hsa04080(3269)]"

In this case, I'm interested in getting "HSA:3269" and "PATH:hsa04080(3269)" into an array of strings.

How can this be achieved?

Peter Mortensen
João Pereira

2 Answers

93

Split won't help you here; you need to use regular expressions:

// using System.Text.RegularExpressions;
// pattern = any number of arbitrary characters between square brackets.
var pattern = @"\[(.*?)\]";
var query = "H1-receptor antagonist [HSA:3269] [PATH:hsa04080(3269)]";
var matches = Regex.Matches(query, pattern);

foreach (Match m in matches) {
    Console.WriteLine(m.Groups[1]);
}

Yields your results.
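Since the question asks for an array of strings, the captures can be collected instead of printed. A minimal sketch, assuming a hypothetical helper named `BracketExtractor.Extract` (not part of the original answer) and using the negated character class suggested in the comments; for this input it should behave the same as the non-greedy pattern above:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BracketExtractor
{
    // Hypothetical helper: returns the text found inside each [...] pair.
    public static string[] Extract(string input)
    {
        return Regex.Matches(input, @"\[([^\]]*)\]")
                    .Cast<Match>()                   // treat the collection as a sequence of Match
                    .Select(m => m.Groups[1].Value)  // group 1 excludes the brackets themselves
                    .ToArray();
    }
}
```

Calling `BracketExtractor.Extract(query)` on the sample string returns the two entries `HSA:3269` and `PATH:hsa04080(3269)`.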

Konrad Rudolph
  • 4
    Do you find it awkward in 3.5 that the MatchCollection enumerator still returns Match as Object? – chakrit Apr 11 '09 at 19:21
  • 5
    anyway... a better regex match might be \[([^\]]*)\] so as to be on the safe side :-) – chakrit Apr 11 '09 at 19:23
  • @chakrit: 1. Yes, but this cannot be changed for backwards compatibility reasons. Really a shame though. Microsoft should have the balls to do like Python 3: throw everything pre-2.0 out for good and introduce a breaking change. But this won't happen … – Konrad Rudolph Apr 11 '09 at 19:25
  • @chakrit: 2. This was indeed my first version (I usually always use explicit groups) but I reconsidered because that's wordier to express exactly the same pattern (for all practical purposes). There's really no risk here in using the more implicit character class along with a nongreedy quantifier. – Konrad Rudolph Apr 11 '09 at 19:27
  • @KonradRudolph - Here I am getting this "[HSA:3269]"....what will be the Regex if I want "HSA:3269". Without square brackets? – Rahul Khandelwal Feb 03 '17 at 18:22
  • @chakrit - Firstly, you need to escape the outer square brackets. Secondly, the results are the same anyway, so why do you say "on the safe side"? – Robino May 16 '17 at 15:44
  • Use LINQ to get the captures from Group 1: `var results = Regex.Matches(query, pattern).Cast<Match>().Select(m => m.Groups[1].Value).ToList();` – Wiktor Stribiżew Apr 12 '18 at 12:05
1

Try this

string mystr = "Hello my name is  {robert} and i live in  {florida}";

List<string> myvariables = new List<string>();
while (mystr.Contains("{"))
{
    // Take the text between the first '{' and the next '}', then remove that token.
    myvariables.Add(mystr.Split('{', '}')[1]);
    mystr = mystr.Replace("{" + mystr.Split('{', '}')[1] + "}", "");
}

This way I will have a list which will contain robert and florida.
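The same idea can also be written as a single pass over the string, without re-splitting it on every iteration. A minimal sketch, assuming a hypothetical helper named `BraceScanner.Extract`; it stops at an unmatched `{` and does not treat nested or doubled braces specially:

```csharp
using System;
using System.Collections.Generic;

public static class BraceScanner
{
    // Hypothetical helper: collects the text between each '{' and the next '}'.
    public static List<string> Extract(string input)
    {
        var results = new List<string>();
        int open = input.IndexOf('{');
        while (open >= 0)
        {
            int close = input.IndexOf('}', open + 1);
            if (close < 0) break;  // unmatched '{': nothing more to extract
            results.Add(input.Substring(open + 1, close - open - 1));
            open = input.IndexOf('{', close + 1);  // continue after the closing brace
        }
        return results;
    }
}
```

A stray `}` that appears before any `{` is simply skipped, because the scan only starts at an opening brace.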

AJ Richardson
BobSpring
    This code assumes that the brackets are always matched perfectly (you never have `}` before `{`, and you never have `{` twice in a row). It is also very inefficient because it splits the string many times needlessly. It would be much more efficient and robust to use Regex like the other answer does. – AJ Richardson Sep 07 '17 at 15:49