I have an XML file with the following structure:

<?xml version="1.0" encoding="UTF-8"?>
  <header>
    <name>generic_1</name>
  </header>
  <body>
    <resources>
      <resource guid="ae8c34ad-a4e6-47fe-9b7d-cd60223754fe">
      </resource>
      <resource guid="fe236467-3df5-4019-9d55-d4881dfabae7">
      </resource>
    </resources>
  </body>

I need to edit the information of each resource, so I tried to split the file on the string </resource>, but Tcl doesn't split it properly.

This is what I tried: split $file "</resource>". I also tried escaping the <, / and > characters, but still no success.

Can you please help me with an elegant solution? I can do it by taking each line and determining where the resource ends, but a split would be nicer, if it can be done.

Later edit: I can't use tdom; I am editing the file as a text file, not as an XML file.

Thank you

Lucian
  • Have you searched for previous answers on `[xml]` and `[tcl]`? This has been covered numerous times. – mrcalvin Apr 19 '18 at 22:23
  • This sounds like an XY problem: what are you trying to do really? – glenn jackman Apr 20 '18 at 13:34
  • @mrcalvin yes, I have searched, but I didn't find quite this answer; @"glenn jackman" my XML has some tags in it and I need to modify those, but I need to separate the resources because I might modify the wrong resource – Lucian Apr 20 '18 at 15:06
  • XML is very hard to edit as text without breaking the structure. The easy part is to find the place to edit and make changes there. The many times harder part is to ensure that what you save back to file after editing is still valid XML where nothing but your edits have changed. (This goes for any language, not just Tcl.) – Peter Lewerin Apr 20 '18 at 16:15

2 Answers

Suggestion

XML handling in Tcl has been covered numerous times here. The general recommendation is to use tdom and XPath expressions to navigate the DOM and extract data:

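package require tdom
set doc  [dom parse $xml]          ;# $xml holds the document text
set root [$doc documentElement]
# selectNodes returns a list of the matching <resource> element nodes
set nodes [$root selectNodes //resources/resource]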

Comment

split breaks up a string on a per-character basis. The last argument to split is interpreted as a set of split characters, each of which separates fields on its own, rather than as one split string. Besides, it would not give you what you want.
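To illustrate the point about split, here is a minimal sketch in plain Tcl (no tdom). It assumes the file contents are already in a variable named file, as in the question, and that the character U+0001 never occurs in the text (it is not a legal XML 1.0 character). The sample data is made up for the example. Replacing the multi-character delimiter with a single placeholder character via string map is one possible workaround, not the only one.

```tcl
# Hypothetical sample text standing in for the file contents.
set file {<resources>
  <resource guid="a">
    <tag>one</tag>
  </resource>
  <resource guid="b">
    <tag>two</tag>
  </resource>
</resources>}

# split treats its second argument as a set of characters, so this
# breaks the text at every <, /, r, e, s, o, u, c and > it finds.
set wrong [split $file "</resource>"]

# Workaround: map the whole delimiter to one placeholder character
# (U+0001, assumed absent from the text), then split on that character.
set chunks [split [string map [list "</resource>" \x01] $file] \x01]

foreach chunk $chunks {
    puts "--- chunk ---"
    puts $chunk
}
```

The delimiter itself is dropped from the chunks, so after editing the pieces they would need to be rejoined with join $chunks "</resource>" to get a complete text back. This remains plain text processing and does not check that the result is still well-formed XML.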

mrcalvin
  • Yes, you are right, tdom has been covered multiple times only I can't use tdom, that's why I'm looking for another solution – Lucian Apr 20 '18 at 15:09
  • Why can't you use tdom? You would be left with `regexp` and `regsub`, but this will not get you far (depending on the document structure and what you are after). Before saying "I can't use tdom", you should then rather resort to: "I can't use XML, get me another format." – mrcalvin Apr 20 '18 at 15:48
  • Because I am not running native TCL, and the TCL I am running doesn't support tdom. – Lucian Apr 20 '18 at 15:52
  • Well, so how about: https://stackoverflow.com/questions/49326362/how-to-get-a-xml-element-value-in-tcl/49328290#49328290 ? – mrcalvin Apr 20 '18 at 15:56
  • Still, be warned that this will not get you far. The time is better spent getting a Tcl with tdom, or getting the data in a format other than XML. – mrcalvin Apr 20 '18 at 15:58
  • Yes, that would work perfectly, once I have each in a separate variable. Thank you for all the suggestions! – Lucian Apr 20 '18 at 16:08

This is not an answer, just two additions to mrcalvin's answer, put here for formatting purposes.

First, your XML is invalid, as it lacks a root element (maybe it's snipped out).

Second, you didn't describe how you want to edit the nodes. Two obvious ways are to add a new attribute value and to add a new child node. This is how you can do each with tdom, choosing the edit based on the value of the guid attribute:

set nodes [$root selectNodes //resources/resource]
foreach node $nodes {
    switch [$node getAttribute guid] {
        ae8c34ad-a4e6-47fe-9b7d-cd60223754fe {
            $node setAttribute foo bar
        }
        fe236467-3df5-4019-9d55-d4881dfabae7 {
            $node appendChild [$doc createElement quux]
        }
        default {
            error "unknown resource"
        }
    }
}

If you wish to add something more complex than a child node, there are several ways to do so, including using node commands, appending an XML literal, appending via a script (most useful when several similar additions are made), and appending a nested Tcl list that describes a node structure with attributes.

You can then get the edited DOM structure as XML by calling $doc asXML.

Peter Lewerin