Regex always has me scratching my head.

In my Windows Store app, HTML content such as <a href="www.example.com"> needs to be replaced with <a href="javascript:window.external.notify('www.example.com')"> in order to intercept the navigation event in a WebView.

I tried Regex.Replace(content, "<a href=\"(.+)\">", "<a href=\"javascript:window.external.notify('\\0')\">"); but had no luck.

Could you teach me how to do it in C#?

Youngjae
  • Have you read [**the answer**](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) first? – Alexei Levenkov Apr 21 '14 at 03:15
  • @AlexeiLevenkov // Yes, I have read plenty about *Do NOT parse HTML with Regex*, but my content is not that heavy, so it *would* be fine to handle it this way. Parsing the HTML and taking on an additional memory-heavy process is overkill in my case. – Youngjae Apr 21 '14 at 03:22
  • @AlexeiLevenkov // Or, I'd appreciate it if you could suggest a more appropriate approach for this (with C#). – Youngjae Apr 21 '14 at 03:23
  • @Youngjae You mention head-scratching with regex. Do you know RegexHero? It's a great tool for trying out .NET regexes: http://regexhero.net/tester/ – joce Apr 21 '14 at 04:01
  • [Obligatory link](http://stackoverflow.com/q/4231382/471272): please link to answer, not to non-answers. – tchrist Jun 08 '14 at 20:07

3 Answers

This should work for you:

using System;
using System.Text.RegularExpressions;

namespace CSTest
{
    class Program
    {
        static void Main(string[] args)
        {
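            // Group 1 captures the href value.
            Regex re = new Regex("<a href=\"(.+)\">", RegexOptions.Compiled);

            string input = "<a href=\"www.example.com\">";
            // In the replacement string, $1 expands to the text captured by group 1.
            string res = re.Replace(input,
                "<a href=\"javascript:window.external.notify('$1')\">");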

            Console.WriteLine(res);
        }
    }
}

You pretty much had it. Your only problem was that you were using \\0 instead of $1 to refer to the captured group in the replacement string.

If you prefer to call the static version of Regex.Replace, you could use:

string res = Regex.Replace(input, 
    "<a href=\"(.+)\">", 
    "<a href=\"javascript:window.external.notify('$1')\">",
    RegexOptions.Compiled
);
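
One caveat with the pattern above: .+ is greedy, so if the content has several links on one line, the capture can run all the way to the last "> instead of stopping at the first. A minimal sketch of a tighter variant, assuming href values never contain a double quote and that href is the only attribute on the tag; the names LinkRewriter and RewriteLinks are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

static class LinkRewriter
{
    // [^"]+ stops at the first closing quote, so each href is captured on its own.
    private static readonly Regex HrefPattern =
        new Regex("<a href=\"([^\"]+)\">", RegexOptions.Compiled);

    public static string RewriteLinks(string content)
    {
        // $1 in the replacement string is the text captured by group 1.
        return HrefPattern.Replace(content,
            "<a href=\"javascript:window.external.notify('$1')\">");
    }

    static void Main()
    {
        string input =
            "<a href=\"www.example.com\">one</a> <a href=\"www.example.org\">two</a>";
        // Each of the two links is rewritten separately.
        Console.WriteLine(RewriteLinks(input));
    }
}
```

If your links can carry other attributes (class, target, and so on), this pattern will simply not match them, which is where an HTML parser starts to pay off.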
joce

I'd give something like this a try:

Regex.Replace(content, "(?<=<a href=\").+(?=\">)", "javascript:window.external.notify('$0')");
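
To spell out why $0 works here: the lookbehind and lookahead only assert the surrounding text without consuming it, so the match (and therefore $0, the whole match) is just the URL. A small sketch of that idea; it swaps .+ for [^"]+ as an assumption to keep the match inside a single attribute, and LookaroundDemo / RewriteHref are hypothetical names:

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaroundDemo
{
    public static string RewriteHref(string content)
    {
        // (?<=...) and (?=...) are zero-width: only the URL itself is matched,
        // so $0 (the entire match) expands to just the URL.
        return Regex.Replace(content,
            "(?<=<a href=\")[^\"]+(?=\">)",
            "javascript:window.external.notify('$0')");
    }

    static void Main()
    {
        Console.WriteLine(RewriteHref("<a href=\"www.example.com\">"));
    }
}
```

Because the tag text around the URL is never part of the match, it does not need to be repeated in the replacement string.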
grin0048

You should be using $1 instead of \\0.

We use $ for backreferences in C#.

aelor
  • \ is indeed used for backreferences in C#. See http://msdn.microsoft.com/en-us/library/thwdfzxy%28v=vs.110%29.aspx What we're dealing with here is a substitution of a captured group in the replacement string, which is written with $: http://msdn.microsoft.com/en-us/library/bs2twtah%28v=vs.110%29.aspx – joce Apr 21 '14 at 04:06
  • @joce That's strange, because I keep getting this error: `prog.cs(14,63): error CS1009: Unrecognized escape sequence '\1' Compilation failed: 1 error(s), 0 warnings`. I tried it here: http://ideone.com/MvgIm4 – aelor Apr 21 '14 at 04:10
  • You need to use either @"bla bla \1" (verbatim string) or "bla bla \\1" (escaped backslash). – joce Apr 21 '14 at 04:12
  • Can you edit it and save it in another ideone? When I change it to `\\1`, the output contains a literal `\1`. – aelor Apr 21 '14 at 04:14
  • Which I see you do use... FYI, your code compiles fine in VS2012: http://i.imgur.com/G2S8dxs.png – joce Apr 21 '14 at 04:14
  • There seems to be a misunderstanding; see here: http://ideone.com/MvgIm4 – aelor Apr 21 '14 at 04:16
  • The above only works if I change the `\\1` to `$1`. – aelor Apr 21 '14 at 04:17
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/51071/discussion-between-joce-and-aelor) – joce Apr 21 '14 at 04:17
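
For anyone skimming the comment thread above, the distinction it circles around can be sketched as follows: \1 is a backreference and belongs inside the pattern, while $1 is a substitution and belongs inside the replacement string. This is an illustrative example, not code from any of the answers; the class and method names are made up:

```csharp
using System;
using System.Text.RegularExpressions;

static class BackrefVsSubstitution
{
    // Inside the pattern, \1 is a backreference: it matches the same text
    // that group 1 captured (here, a word repeated after a space).
    public static bool HasRepeatedWord(string text)
    {
        return Regex.IsMatch(text, @"\b(\w+) \1\b");
    }

    // Inside the replacement string, $1 is a substitution: it inserts
    // the text that group 1 captured.
    public static string WrapWords(string text)
    {
        return Regex.Replace(text, @"(\w+)", "[$1]");
    }

    static void Main()
    {
        Console.WriteLine(HasRepeatedWord("the the cat")); // True
        Console.WriteLine(WrapWords("a b"));               // [a] [b]

        // @"\1" (verbatim string) and "\\1" (escaped backslash) are the same
        // two-character string; a bare "\1" is rejected by the compiler (CS1009).
        Console.WriteLine(@"\1" == "\\1");                 // True
    }
}
```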