I am looking for a way to replace keywords within an HTML string with the value of a variable. At the moment I am using the following:

returnString = Replace(message, "[CustomerName]", customerName, CompareMethod.Text)

This works fine when the HTML formatting wraps the whole keyword.

e.g.

<b>[CustomerName]</b>

However, if the formatting covers only part of the keyword, the tags split it, so the string is not found and is not replaced.

e.g.

<b>[Customer</b>Name]
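
To make the failure concrete, here is a minimal JavaScript sketch of the same kind of plain substring replacement. It is not the VB.NET call above: the customer name is a made-up value, and unlike CompareMethod.Text this comparison is case-sensitive.

```javascript
const customerName = "Jane Doe"; // hypothetical value, for illustration only

const wrapped = "<b>[CustomerName]</b>"; // formatting wraps the whole keyword
const split = "<b>[Customer</b>Name]";   // formatting ends inside the keyword

// Plain substring replacement: only matches when the keyword is contiguous.
const wrappedResult = wrapped.replace("[CustomerName]", customerName);
const splitResult = split.replace("[CustomerName]", customerName);

console.log(wrappedResult); // keyword replaced
console.log(splitResult);   // unchanged, because "</b>" interrupts the keyword
```

The second string never contains the contiguous text "[CustomerName]", so no substring-based search can match it.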

The formatting of the string is out of my control and isn't foolproof. With this in mind, what is the best approach to finding a keyword within an HTML string?

Madi D.
53an
  • How is it possible to have HTML formatting partially inside the keyword? When replacing such a keyword it is impossible to guess where the HTML formatting should be placed in the replacement text. – Veger Jan 13 '10 at 12:40
  • The function I am attempting to create is passed an HTML string. It comes from a basic HTML editor, which allows the user to apply a style block to any number of characters, so part of a keyword can end up with its own style. It looks as if the problem is not solvable at this point, and attention should be focused on how the string is created in the first place, to remove the possibility of this occurring. – 53an Jan 13 '10 at 12:58
  • What language are you using to write the function? There are many HTML templating engines which would make this easy, though you'd have to use their syntax. –  Jan 13 '10 at 13:02
  • Sorry, I have just realised that I didn't include the language in the original post. I am using VB.NET. Some Google searches suggested that I could use System.Xml, but I am unsure how I would go about this. – 53an Jan 13 '10 at 13:07

3 Answers

Try using a regular expression. You can build and test your expressions here; I have used it and it works well.

http://regex-test.com/validate/javascript/js_match

Ravi Vanapalli
  • Are you suggesting to parse HTML with regex? http://stackoverflow.com/questions/1732348/#1732454 –  Jan 13 '10 at 13:04

Use the textContent property (innerText in older versions of Internet Explorer) instead of innerHTML if you're using JavaScript to access the content. It strips all tags from the content and gives you back a clean text representation of the keyword.

For example, if the content looks like this:

<div id="name">
    <b>[Customer</b>Name]
</div>

Then reading its textContent property gives:

var name = document.getElementById("name").textContent.trim();
// name is "[CustomerName]": the tags are gone, and trim() drops the surrounding whitespace

which should be easy to process. Do a regex search at this point if you need to.

Edit: Since you're doing this processing on the server side, walk the XML recursively and collect the text of each node. Since I'm not big on VB.Net, here's some pseudocode:

// Recursively concatenate the text of every text node under `node`.
getNodeText(node) {
    text = ""
    for each node.children as child {
        if child.type == TextNode {
            text += child.text
        }
        else {
            text += getNodeText(child)
        }
    }
    return text
}

myXml = xml.load(html)   // html is the incoming markup string
print getNodeText(myXml)

And then replace or whatever there is to be done!
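
For reference, the recursion above can be sketched as runnable JavaScript. This is only an illustration under stated assumptions: the plain-object node shape (`type`, `text`, `children`), the hand-built `tree`, and the name "Jane Doe" are all hypothetical stand-ins for whatever your XML/HTML parser actually produces.

```javascript
// Hypothetical node shape (an assumption, not a real parser's API):
//   text node:    { type: "text", text: "..." }
//   element node: { type: "element", tag: "b", children: [...] }
function getNodeText(node) {
  if (node.type === "text") {
    return node.text;
  }
  let text = "";
  // Visit children in document order so the text comes out in reading order.
  for (const child of node.children || []) {
    text += getNodeText(child);
  }
  return text;
}

// Hand-built tree for: <div id="name"><b>[Customer</b>Name]</div>
const tree = {
  type: "element",
  tag: "div",
  children: [
    { type: "element", tag: "b", children: [{ type: "text", text: "[Customer" }] },
    { type: "text", text: "Name]" },
  ],
};

const plainText = getNodeText(tree); // the keyword, now contiguous
const result = plainText.replace("[CustomerName]", "Jane Doe");
console.log(result);
```

Note that this works on the extracted text only, so the original markup (the bold tag here) is not carried over into the result.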

Anurag

I have found what I believe is a solution to this issue; at least, it works in my scenario.

The HTML input has been changed so that each custom field or keyword is placed within a div with a set id. I loop through all of the elements in the HTML string using mshtml and set the inner text to the correct value when a match is found.

e.g.

' Assumes a project reference to Microsoft.mshtml and "Imports mshtml".
Function ReplaceDetails(ByVal message As String, ByVal customerName As String) As String
    ' Load the HTML string into an in-memory document.
    Dim doc As IHTMLDocument2 = New HTMLDocument
    doc.write(message)
    doc.close()
    ' Each placeholder is an element with a known id; overwrite its text.
    For Each el As IHTMLElement In doc.body.all
        If (el.id = "Date") Then
            el.innerText = Now.ToShortDateString
        End If
        If (el.id = "CustomerName") Then
            el.innerText = customerName
        End If
    Next
    Return doc.body.innerHTML
End Function

Thanks for all of the input. I'm glad to have a solution to the problem.

53an