    <result>
        <relatedProducts>
            <item>
                <id></id>
                <name></name>
                <text></text>
            </item>
            <item>
                <id></id>
                <name></name>
                <text></text>
            </item>
            <item>
                <id></id>
                <name></name>
                <text></text>
            </item>
            ...
        </relatedProducts>
        <item>
            <id></id>
            <name></name>
            <intro></intro>
            <detail></detail>
        </item>
        <item>
            <id></id>
            <name></name>
            <intro></intro>
            <detail></detail>
        </item>
        ...
    </result>

This is a simplified version of the structure of the XML file I want to use.

There might be website URLs inside the node text, e.g.

<text>...href="something.com/default.aspx?id=3"...</text>

<detail>...href="something.com/default.aspx?id=25"...</detail>

What I want to do, in C#, is loop through all the nodes in this XML document, check each node's text for URL links, and rewrite each link based on the id it contains. For example:

I use a regular expression to check every node value, and I find that this URL matches the pattern:

<text>...href="something.com/default.aspx?id=3"...</text>

And I'd like to change it to

<text>...href="somethingelse.com/query.aspx?rid=3"...</text>

Finally, I want to return the whole XML document with the corrected URLs.

Kate Gregory
Smallville
  • "How can I do it?" - you've already provided the solution in your question. Load the XML file. Recurse through all nodes. Update the node values using a regex. Save the result. What part do you need help with? (We can't develop the whole application for you.) – dtb May 12 '11 at 16:03
  • If all you're interested in is finding the URLs, it may not even be worth treating it as an XML file, just read each line in at a time and see if there's a URL there. If there is, transform it and write the line back to a new file. – Roman May 12 '11 at 16:06
  • Thank you for your comments. I've run into another problem: how to replace the URLs. http://stackoverflow.com/questions/5994105/c-search-and-replace-multiple-urls-in-a-string – Smallville May 13 '11 at 15:21
  • @dtb I knew the pseudocode, but I had trouble implementing it; that's why I asked the question here. – Smallville May 13 '11 at 15:24

2 Answers

5
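// Requires: using System.Linq; using System.Xml.Linq;
XDocument doc = XDocument.Load(path);
// Update text nodes only: setting Value on an element that has child
// elements would replace those children with plain text.
foreach (var text in doc.DescendantNodes().OfType<XText>().ToList())
{
    text.Value = ReplaceUrl(text.Value);
}
doc.Save(path);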

I'll let you implement the ReplaceUrl method, since I don't know exactly what you need to do... Just a few general suggestions:

  • you could use regular expressions to extract the URL from the element text (see this question)
  • the easiest way to parse and modify the URL is probably to use the UriBuilder class, which allows you to access the individual components of the URL (scheme, host, path, query string...)
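As a starting point for ReplaceUrl, here is a minimal sketch of the regex suggestion. It assumes the links always look like the example in the question (something.com/default.aspx?id=N with a numeric id) and it uses a plain Regex.Replace rather than UriBuilder. The class name UrlRewriter and the field name OldLink are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class UrlRewriter
{
    // Assumption: the old links have exactly the form shown in the question.
    // The parentheses capture the numeric id so it can be reused below.
    private static readonly Regex OldLink = new Regex(
        @"something\.com/default\.aspx\?id=(\d+)",
        RegexOptions.IgnoreCase);

    public static string ReplaceUrl(string text)
    {
        // $1 inserts the captured id into the new link form.
        // Text without a matching link is returned unchanged.
        return OldLink.Replace(text, "somethingelse.com/query.aspx?rid=$1");
    }
}
```

Regex.Replace rewrites every match in the string, so a node containing several links is handled in one call. For example, UrlRewriter.ReplaceUrl("href=\"something.com/default.aspx?id=3\"") returns href="somethingelse.com/query.aspx?rid=3". If the real links vary more than this (different schemes, extra query parameters), the pattern would need to be adjusted.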
Thomas Levesque
  • Thank you for your help! I've now run into another problem: http://stackoverflow.com/questions/5994105/c-search-and-replace-multiple-urls-based-on-the-value-in-the-url I'd appreciate any ideas. – Smallville May 13 '11 at 15:29
1
    Dim xmlDoc As New XmlDocument
    Dim xmlNodeList As XmlNodeList
    Dim xmlNode As XmlNode

    xmlDoc.LoadXml(strXML)
    ' Select the elements that hold the text, e.g. every <text> element.
    ' Note: selecting the root tag ("result") instead would make the
    ' InnerText assignment below replace all of its child elements.
    xmlNodeList = xmlDoc.GetElementsByTagName("text")

    For Each xmlNode In xmlNodeList
        xmlNode.InnerText = "new text" ' assign the replaced text here
    Next

Search for the particular tag and then do a replace.
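Since the question asks for C#, the same XmlDocument idea could be sketched as below. It visits text nodes rather than one particular tag, so it covers <text>, <intro> and <detail> alike without touching the element structure. The names XmlTextRewriter and RewriteTextNodes are hypothetical, and the rewrite delegate stands in for whatever URL replacement you implement.

```csharp
using System;
using System.Linq;
using System.Xml;

static class XmlTextRewriter
{
    // Applies 'rewrite' to the value of every text node and returns the XML.
    public static string RewriteTextNodes(string xml, Func<string, string> rewrite)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // "//text()" selects text nodes, not elements, so child elements
        // are never overwritten. The list is copied before any changes.
        var textNodes = doc.SelectNodes("//text()").Cast<XmlNode>().ToList();
        foreach (XmlNode node in textNodes)
        {
            node.Value = rewrite(node.Value);
        }
        return doc.OuterXml;
    }
}
```

For example, XmlTextRewriter.RewriteTextNodes("<result><item><text>a</text></item></result>", s => s.ToUpper()) returns <result><item><text>A</text></item></result>.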

vikramjb