INPUT: garbage="stff",start1="allshortsofCharactersExceptDoubleQuotes",start2="*&^%$blahblah"

DESIRED RESULT: allshortsofCharactersExceptDoubleQuotes

*&^%$blahblah

Using C# .NET:

string myRegExString = @"(?<=start[0-9].).*(?="")";

Yields: allshortsofCharactersExceptDoubleQuotes",start2="*&^%$blahblah

Through testing I know that if I replaced .* with a set containing every character except the double quote I would get the desired result, but that is a lot of work and I would get it wrong. Putting (?!"") or (?!="") before .* does not work either.

So how do I get the lookahead to stop on the first double quote it finds?

Correct Answers (as far as I tested) from the responses:

(?<=start\d+="")[^""]*(?="")

OR

(?<=start\d+="")[^""]+(?="")

OR this works too but is not quite what was asked for.

(?<=start\d+="")[^""]*
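One observable difference between the last pattern and the first two, as a rough sketch: without the trailing lookahead, a value that never gets its closing quote still matches. The input below is invented for illustration (it is not the input above), and the class and member names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class; names are for illustration only.
public static class LookaheadDemo
{
    // Invented input: the second value is missing its closing quote.
    public const string Input = @"start1=""abc"",start2=""unterminated";

    // Requires a closing quote after the value.
    public const string WithLookahead = @"(?<=start\d+="")[^""]*(?="")";

    // Accepts the value whether or not a closing quote follows.
    public const string WithoutLookahead = @"(?<=start\d+="")[^""]*";

    // Collects the text of every match of the pattern in Input.
    public static List<string> Extract(string pattern)
    {
        return Regex.Matches(Input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }
}
```

Here Extract(WithLookahead) returns only abc, while Extract(WithoutLookahead) returns abc and unterminated.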

Thanks. I was so wrapped up in the lookahead aspect of this item.

3 Answers

You should use the lazy quantifier .*?, which matches as little as possible. In your case .* matches as much as possible, so it captures everything up to the last "

(?<=start\d+="").*?(?="")

You can get a list of all such values using this code:

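// Needs: using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
// input is the text to search, regex is the pattern string above.
List<string> output = Regex.Matches(input, regex)
                           .Cast<Match>()
                           .Select(x => x.Value)
                           .ToList();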
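To make the greedy versus lazy difference concrete, here is a minimal sketch run against the sample input from the question. The class name QuantifierDemo and the helper Extract are made up for illustration; Regex.Matches and the LINQ calls are the only real .NET APIs involved.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class; names are for illustration only.
public static class QuantifierDemo
{
    // Sample input from the question as a verbatim string ("" is one quote).
    public const string Input =
        @"garbage=""stff"",start1=""allshortsofCharactersExceptDoubleQuotes"",start2=""*&^%$blahblah""";

    // Greedy: .* runs to the end of the input, then backs off to the LAST quote.
    public const string Greedy = @"(?<=start\d+="").*(?="")";

    // Lazy: .*? grows one character at a time and stops at the FIRST quote.
    public const string Lazy = @"(?<=start\d+="").*?(?="")";

    // Collects the text of every match of the pattern in the input.
    public static List<string> Extract(string pattern, string input)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }
}
```

Extract(Greedy, Input) returns one string, allshortsofCharactersExceptDoubleQuotes",start2="*&^%$blahblah, while Extract(Lazy, Input) returns the two values separately.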
Anirudha

The problem with your regular expression is that .* matches too much text. You can make it lazy by adding a question mark after the star, as in '.*?'. Or you can change it to match everything except double quotes with '[^"]*', which is what I would choose in this case. The following should work (not tested):

string myRegExString = @"(?<=start[0-9].)[^""]*(?="")";

The other solution I suggest is:

string myRegExString = @"(?<=start[0-9].).*?(?="")";
rbernabe
  • The first is not going to match anything since you are missing `"`. In the second, your result would include the first `"`. – Anirudha Jul 12 '13 at 06:09
  • Please see http://stackoverflow.com/questions/1928909/in-c-can-i-escape-a-double-quote-in-a-verbatim-string-literal. The double quote is escaped by an extra double quote in verbatim string literals. – rbernabe Jul 12 '13 at 06:17
  • Thanks! Both of those worked. I got so wrapped up in the lookahead I could not see the obvious :) – frakenweenie Jul 12 '13 at 13:40

You can use this:

@"(?<=start\d="")[^""]+(?="")"

The result is the whole match.
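One thing to weigh when choosing between [^""]+ and [^""]*, sketched below: with +, an empty value such as start2="" produces no match at all, whereas * reports it as an empty string. The input is invented for illustration, and the class and member names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class; names are for illustration only.
public static class EmptyValueDemo
{
    // Invented input: the second value is empty.
    public const string Input = @"start1=""abc"",start2=""""";

    // One or more non-quote characters: empty values are skipped.
    public const string Plus = @"(?<=start\d="")[^""]+(?="")";

    // Zero or more non-quote characters: empty values yield "".
    public const string Star = @"(?<=start\d="")[^""]*(?="")";

    // Collects the text of every match of the pattern in Input.
    public static List<string> Extract(string pattern)
    {
        return Regex.Matches(Input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }
}
```

Extract(Plus) returns just abc; Extract(Star) returns abc followed by an empty string.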

Casimir et Hippolyte