
I was trying to use the solution from the following question to parse email responses programmatically: Parse email content from quoted reply

It works fine in most cases, except for replies sent from Gmail and Outlook, where the result still includes the sender line:
On Sun, Mar 31, 2013 at 10:57 AM, < abc@domain.com> wrote:

I do not understand regex well, but I expected the following patterns to handle it:

new Regex("From:\\s*" + Regex.Escape(address), RegexOptions.IgnoreCase)
new Regex("\\n.*On.*(\\r\\n)?wrote:\\r\\n", RegexOptions.IgnoreCase | RegexOptions.Multiline)

Sample Data:
Do read it.\r\n\r\n\r\nOn Sun, Mar 31, 2013 at 10:57 AM, <\r\n abc@domain.com > wrote:\r\n\r\n>

Expected Outcome:
Do read it.

Current Outcome:
Do read it. On Sun, Mar 31, 2013 at 10:57 AM, wrote:

Amit M

1 Answer

Use a capturing group to get a part of this match:

new Regex("\\n(.*)[\\r\\n]*On(?:.|\\r|\\n)*?wrote:\\r\\n", RegexOptions.IgnoreCase | RegexOptions.Multiline)

Also, use lazy quantifiers instead of greedy ones: .* => .*?
The provided link will tell you why.

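Edit: As my comment says, the dot does not match line breaks (in .NET it stops at \n unless RegexOptions.Singleline is set), so the pattern now matches \r and \n explicitly. That also means suggesting lazy quantifiers was not really relevant to your original pattern, but I'll leave the suggestion in because it is still worth knowing for the future.

Edit2: In fact, the lazy quantifier does matter for the second part of the regex. Edited.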
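
For readers who want to try the idea end to end, here is a minimal sketch that cuts the body at the "On ... wrote:" header instead of capturing around it. It is not the linked solution or the exact pattern above: ReplyParser, StripQuotedReply, and the 200-character bound are hypothetical choices made for illustration. It uses [\s\S] so the header may wrap across lines, and a lazy bounded quantifier so the match stops at the first "wrote:".

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplyParser
{
    // Hypothetical helper, not part of the linked solution.
    // The header must start on its own line (after at least one line break).
    // [\s\S] matches any character, including \r and \n, so a wrapped header
    // still matches; {0,200}? is lazy and bounded (200 is an arbitrary limit).
    private static readonly Regex QuoteHeader = new Regex(
        @"(?:\r?\n)+On\b[\s\S]{0,200}?\bwrote:\s*(?:\r?\n|$)",
        RegexOptions.IgnoreCase);

    // Returns the text before the first quote header, or the body unchanged
    // when no header is found.
    public static string StripQuotedReply(string body)
    {
        Match m = QuoteHeader.Match(body);
        return m.Success ? body.Substring(0, m.Index).TrimEnd() : body;
    }
}
```

Calling ReplyParser.StripQuotedReply on the sample data from the question should leave only "Do read it.". Because the pattern is not tied to a sender, a line in the user's own text that starts with "On" and is followed by "wrote:" within the bound would also be treated as a header.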

Loamhoof
  • Did that work for you? I tried it on the sample data in the original question and it did not work for me. I have never used lazy quantifiers, so I will look into converting these patterns to the lazy form. Thanks for the suggestion. – Amit M Apr 02 '13 at 09:16
  • The dot doesn't take \r and \n into account. Editing to match those. – Loamhoof Apr 02 '13 at 09:19
  • Can you tell me how I could check for a specific email address (it can be hardcoded) before that 'wrote:'? Because this solution will also match the first \r\nOn in the user's own text. – Amit M Apr 02 '13 at 09:42
  • new Regex("\\n.*On.*<(\\r\\n)?" + Regex.Escape(address) + "(\\r\\n)?>", RegexOptions.IgnoreCase) worked for me. Thanks. – Amit M Apr 02 '13 at 09:48
  • Oh yeah, definitely. I only built on it after understanding your solution. – Amit M Apr 02 '13 at 10:21
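
The address-specific approach from the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: AddressedReplyParser and its methods are hypothetical names, the date part of the header is assumed to stay on one line, and whitespace (including a line break) is allowed only inside the angle brackets and before "wrote:".

```csharp
using System;
using System.Text.RegularExpressions;

public static class AddressedReplyParser
{
    // Hypothetical helper: builds a header pattern tied to one known address.
    // [^<>\r\n]* keeps the text between "On" and "<" on a single line, so an
    // earlier line of user text that merely starts with "On" is not matched.
    // \s* tolerates spaces or a line break around the address.
    public static Regex BuildHeaderPattern(string address)
    {
        return new Regex(
            @"(?:\r?\n)+On\b[^<>\r\n]*<\s*" + Regex.Escape(address) + @"\s*>\s*wrote:",
            RegexOptions.IgnoreCase);
    }

    // Returns the text before the header written by the given address, or the
    // body unchanged when that header is not present.
    public static string StripQuotedReply(string body, string address)
    {
        Match m = BuildHeaderPattern(address).Match(body);
        return m.Success ? body.Substring(0, m.Index).TrimEnd() : body;
    }
}
```

With the question's sample data and the address abc@domain.com, this should again leave "Do read it.", while a body whose header names a different address is returned untouched. If a mail client wraps the header before the "<", the single-line assumption would need to be relaxed.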