Extract a sentence from mail with Regex

Question

I need to extract a sentence with a regex, without the trailing <br> tag, but my pattern is giving me issues:

 (?<=Status:) (.*)[^<br>]

Status: i3 Naviera indicates that the container is already released<br>

This sentence comes from an email:

 "<html>\r\n<head>\r\n<meta http-equiv=\"Content-Type\"
 content=\"text/html; charset=utf-8\">\r\n</head>\r\n<body>\r\nStatus:
 i3 Naviera indicates that the container is already
 released<br>\r\nObservations:  data requested.<br>\r\n<br>\r\n<img
 src=\"http://test/logo/Logo2.png\">\r\n</body>\r\n</html>\r\n"

I just need to extract:

i3 Naviera indicates that the container is already released

`regmatches(v, regexpr("Status:\\s*\\K[^<]+", v, perl=TRUE))` — Wiktor Stribiżew, Dec 16 '19 at 17:39

score 0 · Accepted Answer · answered Dec 16 '19 at 18:02


This regex would work for your content:

(?<=Status: )(.*?)(?=<br>)

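The lookbehind (?<=Status: ) anchors the match right after "Status: " (including the space), the lazy (.*?) consumes as little as possible, and the lookahead (?=<br>) stops at the first <br> without including it in the match. In your pattern, [^<br>] is a character class: it matches a single character that is not <, b, r or >, rather than "anything up to the tag <br>".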
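As a minimal sketch of how the pattern could be applied in R (the language used in the comment on the question): the variable name `mail` and the shortened mail body are assumptions for illustration, not your real data. `perl = TRUE` is required because lookbehind and lookahead are PCRE features.

```r
# Shortened version of the mail body from the question (assumed to be a single string).
mail <- paste0(
  "<body>\r\nStatus: i3 Naviera indicates that the container is already released<br>\r\n",
  "Observations:  data requested.<br>\r\n<br>\r\n</body>"
)

# Pattern from this answer: lookbehind, lazy capture, lookahead.
pattern <- "(?<=Status: )(.*?)(?=<br>)"
status <- regmatches(mail, regexpr(pattern, mail, perl = TRUE))

# Alternative from the comment: \K drops "Status:" and the whitespace from the match,
# then [^<]+ takes everything up to the next "<".
status_alt <- regmatches(mail, regexpr("Status:\\s*\\K[^<]+", mail, perl = TRUE))

print(status)
print(status_alt)
```

Both calls should return the sentence without the leading space and without the <br>. `regexpr` only reports the first match, which is what you want here because the Observations line also ends in <br>.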

Please note that parsing HTML with a regex only works as long as the structure of the HTML content does not change much.
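If the mails vary a little, a somewhat more tolerant pattern may help. The following is only a sketch under assumptions: the helper name `extract_status` is hypothetical, and the variations it handles (different letter case, a line break after "Status:", and <br/> or <br /> instead of <br>) are guesses about what could change, not something seen in your data.

```r
# Hypothetical helper: returns the status sentence from one mail body, or NA if none is found.
extract_status <- function(html) {
  # (?i) ignore case, (?s) let "." match line breaks,
  # \K discard "Status:" and following whitespace from the match,
  # lookahead accepts <br>, <br/> and <br /> with optional whitespace before the tag.
  pattern <- "(?is)Status:\\s*\\K.*?(?=\\s*<br\\s*/?>)"
  hit <- regmatches(html, regexpr(pattern, html, perl = TRUE))
  if (length(hit) == 0) {
    return(NA_character_)
  }
  # Collapse line breaks and repeated spaces inside the sentence into single spaces.
  gsub("[[:space:]]+", " ", hit)
}

print(extract_status("Status: container released<br>"))
print(extract_status("status:\r\n  container\r\n released<BR />"))
print(extract_status("no status line here"))
```

This still breaks if the sentence itself contains markup or if the "Status:" label changes, so for mails with less predictable structure an HTML parser is the safer choice.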

answered Dec 16 '19 at 18:02

Veikko


Thank you! now it's working! – Anmonr200 Dec 16 '19 at 18:09
