
Using C#, I am trying to use a regular expression to get the value that appears after a colon. I know the field name, but the colon that follows it is at a variable position (the amount of whitespace between the name and the colon varies). The value ends at the next whitespace.

Thus:

KnownFieldName : Value

I'd then like to place the value into a group.

I've found a number of similar questions but none that actually points me in the direction of solving this.

This is part of a larger piece of code, but basically it fits in here:

    foreach (var v in fieldsToParse)
    {
        var match = Regex.Match(line, v.pattern, RegexOptions.IgnorePatternWhitespace);

        if (match.Success)
            v.value = match.Groups[v.name].Value;
    }

KerSplosh
  • What's the rest of your piece of text look like? – lc. Nov 06 '12 at 17:12
  • The rest may contain further fields and answers – KerSplosh Nov 06 '12 at 17:24
  • Is it all just fields and answers, separated by newlines, or is there more to it? – lc. Nov 06 '12 at 17:26
  • Take a look at http://stackoverflow.com/questions/11088873/regex-to-capture-colon-separated-key-value-pairs-with-multi-line-values/11090412#11090412 which handles multi-line values as well. Or, tl;dr, the Rubular demo: http://rubular.com/r/8w3X6WGq4l. – Andrew Cheong Nov 06 '12 at 17:37
  • Thanks for the suggestions. It was a combination of my lack of knowledge of the C# Regex implementation and of regex itself that let me down! – KerSplosh Nov 06 '12 at 18:39

1 Answer


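The regex you're looking for should be something like `(?<knownfieldname>[a-zA-Z]+)(\s+):(\s+)(?<value>[a-zA-Z]+)` or `knownfieldname(\s+):(\s+)(?<value>[a-zA-Z]+)`. The first form captures whatever field name appears on the line; the second matches one literal field name.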

This presumes that KnownFieldName and Value both consist only of alphabetic characters, of course. If digits can appear, add `0-9` to the character classes; if anything other than whitespace is valid in the value, use `\S+` (one or more non-whitespace characters) instead.
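As a rough illustration of the `\S` variant, here is a minimal sketch for a single known field. The class and method names (`FieldValueDemo`, `BuildPattern`, `TryGetValue`) are made up for this example, and it assumes `\s*` (zero or more whitespace characters) around the colon rather than `\s+`, so that `Name:Value` also matches; swap in `\s+` if at least one space is required.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FieldValueDemo
{
    // Hypothetical helper: builds a pattern for one known field name.
    public static string BuildPattern(string fieldName)
    {
        // Regex.Escape protects any regex metacharacters in the field name.
        // \s* allows any amount of whitespace on either side of the colon.
        // (?<value>\S+) captures everything up to the next whitespace.
        return Regex.Escape(fieldName) + @"\s*:\s*(?<value>\S+)";
    }

    // Returns true and sets 'value' when the field is found on the line.
    public static bool TryGetValue(string line, string fieldName, out string value)
    {
        Match match = Regex.Match(line, BuildPattern(fieldName));
        value = match.Success ? match.Groups["value"].Value : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        string line = "KnownFieldName   :  Value123 OtherField : x";
        if (TryGetValue(line, "KnownFieldName", out string value))
            Console.WriteLine(value); // prints "Value123"
    }
}
```

Note that the match is case-sensitive by default, so `knownfieldname` would not match `KnownFieldName` unless `RegexOptions.IgnoreCase` is passed.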

Edited to incorporate a variable number of spaces. Also, in light of the comments below, you might consider bookmarking both Derek Slager's awesome Regex tester and this simple Regex cheat sheet. Helps a lot with situations like this.
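One way this could slot into the loop from the question is sketched below. The `FieldSpec` class and the `MakeField` helper are hypothetical stand-ins for whatever `fieldsToParse` actually holds, and the sketch assumes each field name is also a valid regex group name (letters, digits, underscore). One thing worth knowing: with `RegexOptions.IgnorePatternWhitespace`, unescaped spaces in the pattern are ignored, so a literal space in a pattern such as `knownfieldname : ...` will not match a space in the input; `\s` still works.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical container mirroring the v.name / v.pattern / v.value usage in the question.
public class FieldSpec
{
    public string Name = "";
    public string Pattern = "";
    public string Value = "";
}

public static class FieldLoopDemo
{
    // Builds a spec whose capture group is named after the field itself,
    // so that match.Groups[v.Name] finds it.
    public static FieldSpec MakeField(string name)
    {
        return new FieldSpec
        {
            Name = name,
            // Whitespace is written as \s because literal spaces would be
            // dropped under RegexOptions.IgnorePatternWhitespace.
            Pattern = Regex.Escape(name) + @"\s*:\s*(?<" + name + @">\S+)"
        };
    }

    public static void Parse(string line, List<FieldSpec> fieldsToParse)
    {
        foreach (var v in fieldsToParse)
        {
            var match = Regex.Match(line, v.Pattern, RegexOptions.IgnorePatternWhitespace);

            if (match.Success)
                v.Value = match.Groups[v.Name].Value;
        }
    }

    public static void Main()
    {
        var fields = new List<FieldSpec> { MakeField("Name"), MakeField("Age") };
        Parse("Name : Bob Age:   42", fields);
        foreach (var f in fields)
            Console.WriteLine(f.Name + " = " + f.Value); // prints "Name = Bob" then "Age = 42"
    }
}
```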

tmesser
  • I need to replace the `<knownfieldname>` with an actual string. How does this work with regular expressions? – KerSplosh Nov 06 '12 at 17:26
  • @KerSplosh `<knownfieldname>` is just the name of that regex group. To put it in the context of what you posted, it'd be `match.Groups["knownfieldname"].Value;`. But if you wanted to just make that a regex literal, you can just put it in longhand - `knownfieldname : (?<value>[a-zA-Z]+)`. Most things in regex are just interpreted literally. – tmesser Nov 06 '12 at 17:30
  • Sorry, but I'm still not getting any matches with this. Note the colon is a variable number of spaces after the known field. – KerSplosh Nov 06 '12 at 17:52
  • @KerSplosh I updated my initial answer to incorporate a variable number of spaces. As before, you can replace the first group with just a static `knownfieldname` if you want. – tmesser Nov 06 '12 at 17:56
  • Brilliant. Thanks so much for being patient with me. – KerSplosh Nov 06 '12 at 18:38