I'm sending an e-mail from my application. The body is primarily HTML, and I use a regex (@"<(.|\n)*?>") to strip out the HTML tags and produce a plain-text alternative view. I also want to replace each <a> hyperlink tag with a plain-text version of its href address.

I can only seem to find information about converting the other way.

Dave M
Willis
  • I didn't understand properly. Do you want to remove the anchor tags and extract the href part of each anchor? – FosterZ Dec 07 '11 at 12:36
  • Check **this** question http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags. Then delete this question. – FailedDev Dec 07 '11 at 12:36
  • First, regex is not the right tool for HTML/XML type languages. Second, please share the code that you've tried so far. – Nayan Dec 07 '11 at 12:37
  • Can you explain what that plain-text view is? Do you just want to remove the HTML tags from your e-mail and, for each hyperlink, display the URL instead of a clickable link? – Praveen Dec 07 '11 at 12:37
  • FosterZ - yes, that's correct. – Willis Dec 07 '11 at 12:39

2 Answers

If you just want to replace the a tag with its href value, and assuming the href value is enclosed in double quotes, here is the regex:

<a[^/>]*href="([^"]*)"/?>

and the replacement pattern:

$1
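
As a rough illustration of how this pattern could be combined with the tag-stripping regex from the question, here is a minimal C# sketch. The class name PlainTextSketch, the helper ToPlainText, and the trailing space in the replacement string are assumptions added for this example, not part of the answer. The space only keeps the URL from running into the link text that follows the opening tag.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PlainTextSketch
{
    // Hypothetical helper, shown only to illustrate the two-step approach.
    public static string ToPlainText(string html)
    {
        // Step 1: replace each opening <a ... href="..."> tag with its URL.
        // "$1 " inserts capture group 1 (the href value) plus a space;
        // the space is an assumption, the answer itself uses plain "$1".
        string withUrls = Regex.Replace(html, "<a[^/>]*href=\"([^\"]*)\"/?>", "$1 ");

        // Step 2: strip every remaining tag with the pattern from the question.
        return Regex.Replace(withUrls, @"<(.|\n)*?>", string.Empty);
    }

    public static void Main()
    {
        string html = "<p>Visit <a href=\"http://example.com\">our site</a> today.</p>";
        Console.WriteLine(ToPlainText(html));
        // prints: Visit http://example.com our site today.
    }
}
```

Note that the pattern only matches when href is the last attribute of the tag, so an anchor such as <a href="..." target="_blank"> would be left for step 2 to strip and its URL would be lost.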
Fischermaen
  • I think you are missing another pattern after your capture, unless you are always assuming the href will be the last attribute. Which is usually why regexes aren't recommended for non-trivial grammars. – J. Holmes Dec 07 '11 at 12:44
  • @32bitkid: You're absolutely right, but OP asked for a regex. This will help him in a certain way and he has to prove the others. – Fischermaen Dec 07 '11 at 12:48
  • hmmm OK I was hoping to do this without having to understand regex! Not going to happen by the looks of things. Thanks anyway. – Willis Dec 07 '11 at 12:54
  • @Willis: Please don't consider me to be offensive, but using regex without understanding it is dangerous! – Fischermaen Dec 07 '11 at 12:56
  • @Fischermaen What do you mean by dangerous? – Willis Dec 07 '11 at 13:10
  • @Willis: Results can be funny, if you do a replace with some pattern and don't know exactly what's happening. – Fischermaen Dec 07 '11 at 13:13
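// Requires: using System.Text.RegularExpressions;
// Inside a verbatim string (@"..."), a literal double quote is written as "".
Regex reg = new Regex(@"<a[^>]*href=[""]*(?<link>[^\s>""]+)[""]*\s*(?:(?:/>)|(?:>[^>]*)>)");
// Replace the whole anchor (opening tag, link text, closing tag) with the captured URL.
mail.Body = reg.Replace(mail.Body, new MatchEvaluator(delegate(Match m)
{
    return m.Groups["link"].Value;
}));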

If the mail client automatically converts plain-text URLs back into hyperlinks, strip the scheme inside the delegate instead:

return m.Groups["link"].Value.Replace("http://","");
ebattulga