1

From what I can tell, normal HXT seems to be more geared toward queries on XML moreso than XML AST refactoring. However, one of the HXT modules, Data.Tree.NTree.Zippers.TypeDefs seems to have some machinery for diving into a document and doing local work, instead of the more global arrows. However, I can't seem to get anything to work. This is a follow-up post to my earlier HXT issue - all the code is the same, except now trans is replacing this.

Here is the entry point to my program:

start :: App -> IO [XmlTree]
start (App src dest) = runX $
                         readDocument [
                                    --... some settings ...
                                      ]   
                                      src
                         >>>
                         trans
                         >>> 
                         writeDocument [
                                     --... some settings ...
                                       ]
                                       dest

And here is the module where trans is defined:

module Main.Internal where

import Data.Maybe (fromJust)

import Text.XML.HXT.DOM.XmlNode (mkText')
import Text.XML.HXT.Core hiding (addToTheRight)
import Data.Tree.NTree.Zipper.TypeDefs

trans :: IOSLA (XIOState s) XmlTree XmlTree
trans = arrL go
  where
    go :: XmlTree -> [XmlTree]
    go x = [fromNTZipper . manip . toNTZipper $ x]

unList :: [a] -> a
unList []    = error "dun goofed!"
unList (x:_) = x

manip = fromJust . (addToTheRight $ mkText' "bar")
      . fromJust . down

Finally, here is my input file:

<html>
  <head>
    <title>foo</title>
  </head>
  <body>
    <h1>foo</h1>
  </body>
</html>

and my output:

<?xml version="1.0" encoding="US-ASCII"?>
<html>
  <head>
    <title>foo</title>
  </head>
  <body>
    <h1>foo</h1>
  </body>
</html>

So, why isn't "bar" anywhere to be found in my output? Shouldn't it appear right after </html>? Any help would, again, be wonderful :)

Community
  • 1
  • 1
Athan Clark
  • 3,886
  • 2
  • 21
  • 39

1 Answers1

1

Your ideas seem to be solid and I'm not sure where you're wrong, but playing around with it I was able to generate the test code:

import Data.Tree.NTree.Zipper.TypeDefs
import Text.XML.HXT.Parser.HtmlParsec
import Text.XML.HXT.DOM.XmlNode
import Text.XML.HXT.DOM.TypeDefs
import Data.Tree.NTree.TypeDefs
import Control.Arrow.IOListArrow
import Text.XML.HXT.Arrow.WriteDocument
str = "<html>\n  <head>\n    <title>foo</title>\n  </head>\n  <body>\n    <h1>foo</h1>\n  </body>\n</html>"

fromJust (Just x) = x

manip :: NTree XNode -> NTree XNode
manip x = fromNTZipper $ fromJust $
        down (toNTZipper x) >>= addToTheLeft (mkText "Boo!") >>= up

stringify = runIOLA $ writeDocumentToString []

main = do
    xs <- mapM stringify $ map manip $ parseHtmlDocument "" str
    putStrLn (show xs)

which outputs [["\n Boo!<head>\n <title>foo</title>\n </head>\n <body>\n <h1>foo</h1>\n </body>\n"]]. I'm not actually sure what happened to the <html> element but addToTheLeft does exactly what it says it does. (I am using the >>= for the Maybe monad above).

I don't know if it's trans or >>> above that, but the manip you're doing seems like it should work.

EDIT: Notice that a lot of what I've written above avoids the idiomatic arrows which are central to HXT, which is probably why I'm getting some strange results. It looks from the package structure that importing Text.XML.HXT.Core is enough for generally reading strings and documents. The following works for me:

Prelude> let file = "<html>\n  <head>\n    <title>foo</title>\n  </head>\n  <body>\n    <h1>foo</h1>\n  </body>\n</html>"
Prelude> :m +Text.XML.HXT.Core
Prelude Text.XML.HXT.Core> let apply (arrows) str = head $ runLA (xshow $ hread >>> arrows) str
Prelude Text.XML.HXT.Core> :t apply
apply :: LA XmlTree XmlTree -> String -> String
Prelude Text.XML.HXT.Core> putStrLn $ apply (withNav $ moveDown >>> addToTheLeft (txt "bar") >>> moveUp) file
<html>bar
  <head>
    <title>foo</title>
  </head>
  <body>
    <h1>foo</h1>
  </body>
</html>

so those are the relevant functions. Notice that HXT already seems to do its Maybe stuff by clobbering XML trees in the list (of the LA list arrow) which do not satisfy a given predicate.

CR Drost
  • 9,637
  • 1
  • 25
  • 36
  • 1
    Ahh wow! I never used `up`, also I adore the binds! I had to do a few more jumps up the tree as well, to see the effects. The strange thing is that these manipulations mess-up HXT's indenting, too. – Athan Clark Nov 06 '14 at 02:51
  • Another strange thing is that arrow-based filters don't affect the NTZipper-style method of searching & inserting - that is to say if I first put `... >>> hasName "body" >>> trans >>> ...` before `trans`, it has __no effect__. It's really strange, I think we need a better xml _refactoring_ tool :/ – Athan Clark Nov 06 '14 at 16:44
  • Actually, scratch that, adding the additional filter is completely breaking the program (but still compiling) - it's not even sending any output text with `writeDocument`. – Athan Clark Nov 06 '14 at 16:51
  • @AthanClark I've been able to get HXT to do exactly what you were doing with strings; see the above edit. If `writeDocument` stops outputting text then it's probably because of the last comment I made in there (clobbering XML trees). Basically, think of the type `a -> [b]`. You can use this type to do generic mapping (`gmap f x = [f x]`) and generic filtering (`gfilter f x = if f x then [x] else []`), and can compose such maps/filters with Kleisli composition in the List monad. Same here: most operations produce a list of successes; if you fail then that's represented by an empty list. – CR Drost Nov 06 '14 at 18:33