Regex - Match Last Occurrence

Question

I have a text file full of names, and I want to match them all with a regex.

Each name ends with the following text: fsa fwb fcc, e.g.:

">Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc

I want to use the following expression to match the names:

""">.+?""fsa fwb fcc"

In other words, match all text from "> up to fsa fwb fcc; I can then strip the excess from the match myself.

However, because "> occurs throughout the file, the match starts much earlier than intended. I have always wondered how to match from the LAST occurrence of something (in this case, ">) up to the specified end.
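The behaviour described above can be reproduced with a short sketch. It uses Python's `re` module purely for illustration (an assumption; the thread itself concerns .NET), and the `sometext` prefix is an invented earlier occurrence of "> added to the sample line. It shows that a lazy quantifier shortens the match from the right but does not move the starting point, so the match still begins at the first ">.

```python
import re

# Sample line modeled on the question. The leading '">sometext' is a
# hypothetical earlier occurrence of "> added for illustration.
text = r'">sometext">Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc'

# A lazy quantifier still starts at the FIRST position where a match is
# possible; it only makes the match end as early as it can.
lazy = re.search(r'">.+?fsa fwb fcc', text)

print(lazy.start())   # index where the match begins
print(lazy.group(0))  # includes the unwanted '">sometext' prefix
```

The regex engine tries start positions from left to right, which is why some way of excluding or skipping the earlier "> is needed.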

In your particular case, [`RegexOptions.RightToLeft`](http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regexoptions.aspx) should do it. – Martin Ender, Aug 15 '13 at 21:01
And what naomik said. [This](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags?rq=1) is at the top of the related questions. ;) – Martin Ender, Aug 15 '13 at 21:02
This isn't parsing; rather, it is pattern matching. Given the requirements, I doubt this can be accomplished as easily with an HTML parsing engine as it can be via pattern matching. Also, I'm not sure \u0012 is a valid HTML character. – Ro Yo Mi, Aug 16 '13 at 02:13
Thanks m.buettner, RegexOptions.RightToLeft works perfectly! Exactly what I was looking for. – John Cliven, Aug 16 '13 at 11:54
@naomik, Denomales is correct: this is not an HTML file, and the content is static, predictable, and does not vary, so regex seems fine for matching. – John Cliven, Aug 16 '13 at 11:55

Rahul Tripathi · Answer 1 · 2013-08-16T11:57:08.337

1

You can try this:

.+((fsa|fwb|fcc).+)$

.+ matches one or more characters at the start.

((fsa|fwb|fcc) opens the outer capture group, then matches and captures one of the keywords.

.+) matches one or more further characters and closes the outer capture group.

$ matches the end of the line.

EDIT: As suggested by m.buettner, RegexOptions.RightToLeft should work for your case.
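To see what the pattern in this answer actually captures, here is a small sketch in Python's `re` module (an assumption for illustration; the thread concerns .NET, where results can differ depending on the options in use). The sample line and its `sometext` prefix are taken from the other answer's sample text. It suggests the pattern does not isolate the name: the leading `.+` is greedy, so the overall match spans the whole line and the capture group holds only the tail.

```python
import re

# One line of sample text (borrowed from the accepted answer's sample).
line = r'">sometext">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc'

m = re.search(r'.+((fsa|fwb|fcc).+)$', line)

# The greedy leading .+ swallows as much as possible, so the match covers
# the entire line rather than just the name.
whole_line_matched = (m.group(0) == line)

# Group 1 is the last keyword that still has at least one character after
# it, plus everything up to the end of the line.
tail = m.group(1)

print(whole_line_matched)
print(tail)
```

This is consistent with the asker's comment that the pattern matched far more than intended.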

edited Aug 16 '13 at 11:57

answered Aug 15 '13 at 21:02

[Please add some explanation](http://meta.stackexchange.com/questions/177757/are-answers-that-just-contain-a-regular-expression-pattern-really-good-answers) on how this regex works. – HamZa Aug 15 '13 at 21:19
@HamZa: Updated with an explanation. Do let me know if that doesn't work! :) – Rahul Tripathi Aug 15 '13 at 21:36
Thanks for the explanation, unfortunately it doesn't work for me, but instead matches the entire file! – John Cliven Aug 16 '13 at 11:53
@stanleyhiggins: Got your point. I've updated my answer as well, so that it can be used for future reference. :) – Rahul Tripathi Aug 16 '13 at 11:57

score 0 · Accepted Answer · edited May 23 '14 at 23:30

Description

It looks like your ending string is literally fsa fwb fcc, and the substring you're interested in starts directly after the last "> before that ending string.

This expression will:

find the substring between the last "> and the next fsa fwb fcc

">((?:(?!">).)*)fsa\sfwb\sfcc

Sample Text

">sometext">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
">sometext">B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
">sometext">C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc

Matches Found:

[0][0] = ">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
[0][1] = A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"

[1][0] = ">B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
[1][1] = B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"

[2][0] = ">C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
[2][1] = C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"
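The expression above can be tried out with a short sketch. Python's `re` module is used here as an assumption for illustration (the thread concerns .NET, but this pattern uses only constructs that both engines support). The construct `(?:(?!">).)*` consumes one character at a time, but only while the next two characters are not ">, so the match cannot run across a later "> and must therefore start at the last one before the end string.

```python
import re

# The three sample lines from the answer, joined into one text.
sample = "\n".join([
    r'">sometext">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc',
    r'">sometext">B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc',
    r'">sometext">C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc',
])

# (?!">). matches any single character that does not begin another ">.
pattern = re.compile(r'">((?:(?!">).)*)fsa\sfwb\sfcc')

# Group 1 holds the text between the last "> and the end string.
captures = [m.group(1) for m in pattern.finditer(sample)]

for captured in captures:
    print(captured)
```

Each capture begins at the letter (A, B, C) rather than at `sometext`, matching the results listed above.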

Or

If you want to go further and capture only from the last "> up to the first \u0012 that follows it (i.e. the actual name and not the markup text), use this expression:

">((?:(?!">).)*?)\\u0012(?:(?!">).)*fsa\sfwb\sfcc

Sample Text

">sometext">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
">sometext">B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
">sometext">C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc

Matches Found

[0][0] = ">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
[0][1] = A Dave Smith

[1][0] = ">B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
[1][1] = B Dave Smith

[2][0] = ">C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
[2][1] = C Dave Smith
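A sketch of the second expression, again using Python's `re` module as an assumption for illustration. Here the first tempered group is lazy (`*?`), so it stops at the first literal `\u0012` (written `\\u0012` in the pattern, because the text contains an actual backslash followed by `u0012`), and the second tempered group then runs on to the end string without capturing.

```python
import re

# The three sample lines from the answer, joined into one text.
sample = "\n".join([
    r'">sometext">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc',
    r'">sometext">B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc',
    r'">sometext">C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc',
])

# Group 1: lazily capture up to the first literal \u0012 after the last ">.
# The remainder up to "fsa fwb fcc" is matched but not captured.
pattern = re.compile(r'">((?:(?!">).)*?)\\u0012(?:(?!">).)*fsa\sfwb\sfcc')

names = [m.group(1) for m in pattern.finditer(sample)]
print(names)
```

With this variant the capture group holds only the name, so no further trimming of the match is needed.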

This is a really great explanation that is so thorough and works perfectly! I really appreciate that, Denomales! – John Cliven, Aug 16 '13 at 11:59
