
Given a sample xml file:

<root>
  <tag attr="value">Content</tag>
  <tag attr="value2">Content</tag>
</root>

How do I replace every tag element with a different element (and rename its attribute), so that I get this file instead:

<root>
  <tag2 attr2="value"/>
  <tag2 attr2="value2"/>
</root>

The documentation [1] seems to use filters. Is there a way to accomplish this with arrows alone?


Update

I am now at the point where I can replace a node like this:

runX $ readDocument [] "in.xml" 
       >>> processTopDown( 
               (eelem "tag2" += sattr "attr2" "XXX" ) 
               `when` (isElem >>> hasName "tag") ) 
       >>> writeDocument [] "test.xml"

but I have no idea how to get the attribute right, i.e. how to carry the original attr value over into attr2.


[1] http://www.haskell.org/haskellwiki/HXT#Transform_external_references_into_absolute_reference

fho

1 Answer


Try setElemName, processAttrl, and changeAttrName from Text.XML.HXT.Arrow.XmlArrow:

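import Text.XML.HXT.Core

main :: IO ()
main = do
  -- Read in.xml, rewrite every <tag> element, and write the result to test.xml.
  _ <- runX $ readDocument [] "in.xml" >>> transform >>> writeDocument [] "test.xml"
  return ()
  where
    transform = processTopDown $
      ( setElemName (mkName "tag2") >>>
        -- Rename attributes via attrMap; unlisted names pass through unchanged.
        processAttrl (changeAttrName $ mkName . attrMap . localPart)
      ) `when` (isElem >>> hasName "tag")
    attrMap "attr" = "attr2"
    attrMap a = a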

This works for me with your sample document.
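
If you would rather stay close to the eelem attempt from the question's update, here is a minimal sketch of a different approach: instead of renaming in place, build a fresh element and copy the old attribute value into it. It assumes mkelem, attr, getAttrValue and mkText as exported by Text.XML.HXT.Core; the names rebuildTags and rebuildString are made up for illustration. Because the element is built from scratch, the text content and any other attributes are not carried over.

```haskell
import Text.XML.HXT.Core

-- Replace each <tag attr="v">...</tag> with a newly built <tag2 attr2="v"/>.
-- getAttrValue never fails: it yields "" when "attr" is missing.
rebuildTags :: ArrowXml a => a XmlTree XmlTree
rebuildTags = processTopDown $
  mkelem "tag2" [attr "attr2" (getAttrValue "attr" >>> mkText)] []
    `when` (isElem >>> hasName "tag")

-- Pure helper (hypothetical name): parse a string, transform it, and
-- serialise the result, without touching the file system.
rebuildString :: String -> [String]
rebuildString = runLA (xshow (xread >>> rebuildTags))
```

In the file-based pipeline you would use it as readDocument [] "in.xml" >>> rebuildTags >>> writeDocument [] "test.xml".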

Travis Brown
  • It seems that this doesn't remove the *Content* part of the tag. Additionally (and that's not in the question) I'd need to remove all other attributes. OTOH, maybe you can point me to some decent documentation? I'm kind of lost dealing with HXT. – fho Mar 07 '12 at 22:10
  • Thanks again ... I wrote a solution based on this answer. – fho Mar 07 '12 at 23:35
  • @Florian: You can use something like `changeText (const "")` to clear the text content, and `changeAttrl` to delete the other attributes. Not sure I can point to any _decent_ HXT tutorials, but I've found the Hackage documentation (particularly `Control.Arrow.ArrowList` and `Text.XML.HXT.Arrow.XmlArrow`) very helpful. – Travis Brown Mar 08 '12 at 02:49
  • I now use `processChildren none`, which seems to remove everything below the current node. The `process*` functions are pretty useful :) – fho Mar 08 '12 at 20:31
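
Putting the comments above together, a sketch of the in-place variant might look like the following. It is not the asker's actual solution (that was never posted), just one way to combine the pieces: it relies on processAttrl dropping every attribute for which its argument arrow fails, so filtering with hasName "attr" removes all other attributes, and it uses processChildren none to drop the content. The names stripAndRename and stripString are hypothetical.

```haskell
import Text.XML.HXT.Core

-- Rename <tag> to <tag2>, keep only "attr" (renamed to "attr2"),
-- and remove all children, including the text content.
stripAndRename :: ArrowXml a => a XmlTree XmlTree
stripAndRename = processTopDown $
  ( setElemName (mkName "tag2")
    -- Attributes other than "attr" fail the hasName test and are dropped.
    >>> processAttrl (hasName "attr" >>> changeAttrName (const (mkName "attr2")))
    -- none yields no results, so every child node disappears.
    >>> processChildren none
  ) `when` (isElem >>> hasName "tag")

-- Pure helper (hypothetical name) for trying the arrow on a string.
stripString :: String -> [String]
stripString = runLA (xshow (xread >>> stripAndRename))
```

It plugs into the same pipeline as before: readDocument [] "in.xml" >>> stripAndRename >>> writeDocument [] "test.xml".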