
I need to replace specific links with a link that I have predefined.

string content = @"<html>"
                +" <a href='http://www.yourdomain.com' style='width:10px;'>Click Here</a>"
                +" <br>"
                +" <a href='http://www.yourdomain.com/products/list.aspx' style='width:10px;'>Click Here</a>"
                +" </html>";

The replacement should turn "http://www.yourdomain.com" into "http://www.mydomain.com".

But I don't want other links that also start with "http://www.yourdomain.com" to be replaced if they have a sub-path (e.g., "/products/list.aspx").

So I first tried the C# string.Replace() method.

string result = content.Replace("http://www.yourdomain.com", "http://www.mydomain.com");

I also tried the Regex.Replace() method.

string pattern = @"\bhttp://www.yourdomain.com\b";
string replace = "http://www.mydomain.com";
string result = Regex.Replace(content, pattern, replace);

But I got the same result in both cases, shown below.

<html> 
<a href='http://www.mydomain.com' style='width:10px;'>Click Here</a> 
<br> 
<a href='http://www.mydomain.com/products/list.aspx' style='width:10px;'>Click Here</a> 
</html>

What I want is the following.

<html> 
<a href='http://www.mydomain.com' style='width:10px;'>Click Here</a> 
<br> 
<a href='http://www.yourdomain.com/products/list.aspx' style='width:10px;'>Click Here</a> 
</html>

Update

Following @Robin's suggestion, my problem is solved.

string content = @"<html>"
                    +" <a href='http://www.yourdomain.com' style='width:10px;'>Click Here</a>"
                    +" <br>"
                    +" <a href='http://www.yourdomain.com/products/list.aspx' style='width:10px;'>Click Here</a>"
                    +" </html>";

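// Regex.Escape keeps the dots literal; (?!/) rejects a match that is followed by '/'.
string pattern = string.Format("{0}(?!/)", Regex.Escape("http://www.yourdomain.com"));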
string replace = "http://www.mydomain.com";
string result = Regex.Replace(content, pattern, replace);

Update

Another alternative that I found is:

http://www.yourdomain.com([^/])
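
A minimal sketch of how this alternative could be wired up. Unlike the lookahead, the group `([^/])` consumes the character after the URL, so the replacement has to put it back with `$1`. It also requires that some character follows the URL, so a URL at the very end of the input would not match. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureGroupRewriter
{
    // Replaces the bare URL only when the next character is not '/'.
    public static string ReplaceBareUrl(string content)
    {
        // Regex.Escape makes the dots literal; ([^/]) captures the character after the URL.
        string pattern = Regex.Escape("http://www.yourdomain.com") + "([^/])";

        // $1 re-inserts the captured character (here, the closing quote).
        return Regex.Replace(content, pattern, "http://www.mydomain.com$1");
    }
}
```

Calling `CaptureGroupRewriter.ReplaceBareUrl(content)` on the sample HTML should rewrite the first link and leave the `/products/list.aspx` link untouched.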
Frank Myat Thu
  • Look at Html Agility Pack - it's better suited for parsing HTML than RegEx or string functions. – Tim Jun 25 '14 at 07:23
  • If it's something as trivial as your question, how about `string result = content.Replace("'http://www.yourdomain.com'", "'http://www.mydomain.com'");` (I just added the apostrophes). – Omaer Jun 25 '14 at 07:33

2 Answers


Add single quotes (') at both ends of the string arguments in the Replace call.

result = content.Replace("'http://www.yourdomain.com'", "'http://www.mydomain.com'");

That way you only replace the URLs that have no sub-path.

Hjalmar Z
  • Maybe add a second replace for when `"` might be used instead of `'`. Unless you're sure it isn't used now or in the future. – Flater Jun 25 '14 at 07:36
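
A rough sketch of what the comment above suggests, assuming the URL always appears as a complete quoted attribute value. `QuotedUrlReplacer` and `ReplaceQuotedUrl` are hypothetical names, not part of any library.

```csharp
using System;

public static class QuotedUrlReplacer
{
    // Replaces oldUrl only where it is the whole quoted value,
    // for both single-quoted and double-quoted attributes.
    public static string ReplaceQuotedUrl(string content, string oldUrl, string newUrl)
    {
        foreach (string quote in new[] { "'", "\"" })
        {
            // Including the quotes in the search text excludes URLs with a sub-path.
            content = content.Replace(quote + oldUrl + quote, quote + newUrl + quote);
        }
        return content;
    }
}
```

This stays with plain string.Replace, so no regex escaping is involved, but it will miss unquoted attribute values.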

Apart from the usual warning about handling HTML with regex: a word boundary \b matches between a word character (\w) and a non-word character (\W), so it matches both between m and ' and between m and /.
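
To see this concretely, the sketch below collects the character that follows each match of the question's `\b` pattern. The helper name `CharsAfterMatches` is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Returns, in order, the character that follows each match of the \b pattern.
    public static string CharsAfterMatches(string content)
    {
        // Same pattern as in the question (dots left unescaped, as there).
        string pattern = @"\bhttp://www.yourdomain.com\b";
        string following = "";
        foreach (Match m in Regex.Matches(content, pattern))
        {
            int next = m.Index + m.Length;
            if (next < content.Length)
            {
                following += content[next];
            }
        }
        return following;
    }
}
```

For the two links in the question this returns `'/`: the boundary is satisfied before the closing quote and before the slash alike, which is why both links get replaced.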

To explicitly forbid a / after the end of the URL, you can use a negative lookahead:

 http://www.yourdomain.com(?!/)
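
As a hedged sketch, the lookahead could be wrapped in a small helper like the one below. `LinkRewriter` and `ReplaceBareUrl` are hypothetical names; `Regex.Escape` is added so the dots in the URL match only literal dots. Note that the lookahead rules out only a following slash, so something like `http://www.yourdomain.com.au` would still match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkRewriter
{
    // Replaces oldUrl with newUrl only where oldUrl is not followed by '/'.
    public static string ReplaceBareUrl(string content, string oldUrl, string newUrl)
    {
        // Escape regex metacharacters in the URL, then append the negative lookahead.
        string pattern = Regex.Escape(oldUrl) + "(?!/)";

        // A MatchEvaluator returns newUrl verbatim, so a '$' in it is not
        // interpreted as a substitution token.
        return Regex.Replace(content, pattern, m => newUrl);
    }
}
```

Because the lookahead is zero-width, the character after the URL is never consumed and does not need to be restored in the replacement.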
Robin