I want to load HTML content into a WPF WebBrowser object using its NavigateToString method. The HTML content contains relative paths (*). How can I set the base URL for the WebBrowser so that all images, JavaScript, etc. are loaded correctly?

(*) I have edited an existing, unanswered question to make it more self-explanatory. I don't know about the original OP's application, but I am fetching HTML, modifying it (applying highlights to sections of text), and then trying to display it using a WPF WebBrowser in .NET 4. Perhaps one approach might be to add an HTML prefix to the string?

Cœur
shefintk

1 Answer

The page's base URL needs to be set. This tells the browser where the page should appear to have been loaded from, so that relative paths are resolved against it. This can be done with the HTML base tag. The tag could simply be inserted at the very beginning of the HTML, and most browsers will probably read it okay, although this is NOT correct HTML. Ideally it should be inserted into the header section (inside the head tag).

Here is some inelegant C# code that does this:

    // Requires: using System.Text.RegularExpressions;

    /// <summary>
    /// Insert a base href tag into the header part of the HTML
    /// If a head tag cannot be found, it is simply inserted at the beginning
    /// </summary>
    /// <param name="input_html">The HTML to process</param>
    /// <param name="url">URL for the base href tag</param>
    /// <returns>The processed HTML</returns>
    private static string InsertBaseRef(string input_html, string url)
    {
        string base_tag = "<base href=\"" + url + "\" />";
        Regex ItemRegex = new Regex(@"<head\s*>", RegexOptions.IgnoreCase);

        Match match = ItemRegex.Match(input_html);
        if (match.Success)
        {
            // Insert directly after the first head tag. String.Insert is used
            // instead of Regex.Replace so that a '$' in the URL is not
            // interpreted as a regex substitution pattern.
            return input_html.Insert(match.Index + match.Length, base_tag);
        }

        // not found, so insert the base tag at the beginning
        return base_tag + input_html;
    }

Note that this only searches for a simple head tag without any attributes. If the head tag has attributes, or if the HTML has no head tag at all, the search fails and the base tag is simply inserted at the beginning of the HTML. Yes, the code should ideally also match a head tag with attribute definitions.
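As a rough sketch of how the attribute case could be handled, the variant below widens the regular expression so that it also matches a head tag with attributes, and HTML-encodes the URL before placing it in the href attribute. The names `BaseTagHelper` and `InsertBaseTag` are made up for this illustration and are not part of the code above. It assumes that no attribute value on the head tag contains a `>` character, and it does not skip head tags that appear inside comments or scripts.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class BaseTagHelper
{
    // Matches an opening head tag with or without attributes.
    // "<header>" is not matched, because "<head" must be followed
    // by whitespace or by the closing '>'.
    private static readonly Regex HeadOpenTag = new Regex(
        @"<head(\s[^>]*)?>",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    public static string InsertBaseTag(string html, string baseUrl)
    {
        // Encode the URL so that characters such as & or " cannot
        // break out of the href attribute.
        string baseTag = "<base href=\"" + WebUtility.HtmlEncode(baseUrl) + "\" />";

        Match match = HeadOpenTag.Match(html);
        if (match.Success)
        {
            // Insert immediately after the first opening head tag.
            return html.Insert(match.Index + match.Length, baseTag);
        }

        // No head tag found: fall back to inserting at the beginning.
        return baseTag + html;
    }
}
```

With a WPF WebBrowser instance (here assumed to be called `webBrowser`), the result could then be passed on with `webBrowser.NavigateToString(BaseTagHelper.InsertBaseTag(html, baseUrl))`.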

The above code will fetch relative URL images correctly on a (Win7 + .NET 4 WPF) system. However, it still has problems with JavaScript: I could not find a proper solution to similarly set the base URL for all referenced JS files. For my desktop application, simply suppressing the JS errors is sufficient (I'm displaying static pages which have been modified/annotated). This suppression can be performed using the answer here. As it talks directly to the underlying browser COM object, I doubt it will work with WP7.

Community
winwaed