
I am trying to parse an XML file using the dom package, but I get the following error:

unterminatedattribute {invalid attribute list around line 4}

Here is the simple test:

 package require dom;
 set XML "
    <Top>
    <Name name='name' />
    <Group number=1>
    <Member name='name1' test='test1' l=100/>
    </Group>
    </Top>"
set doc [::dom::parse $XML]

set root [$doc cget -documentElement]

set node [$root cget -firstChild]
puts "[$node cget -nodeValue]"
Vardan Hovhannisyan

2 Answers


That “XML” is not well-formed, so formally it is not XML at all: every attribute value must be quoted. If you can, fix that.

set XML "
    <Top>
    <Name name='name' />
    <Group number='1'>
    <Member name='name1' test='test1' l='100'/>
    </Group>
    </Top>"
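As a minimal sketch of what works once the values are quoted, the snippet below parses the corrected document and reads the attributes of each `Member`. It assumes tDOM is installed and uses it in place of the `dom` package from the question; the variable names `seen` and `member` are only illustrative.

```tcl
package require tdom

# Corrected document: every attribute value is quoted.
set XML "
    <Top>
    <Name name='name' />
    <Group number='1'>
    <Member name='name1' test='test1' l='100'/>
    </Group>
    </Top>"

set doc  [dom parse $XML]
set root [$doc documentElement]

# Walk the Member elements with an XPath query relative to <Top>.
set seen {}
foreach member [$root selectNodes {Group/Member}] {
    lappend seen [$member getAttribute name]
    puts "[$member getAttribute name]: l=[$member getAttribute l]"
}

# Free the DOM tree when done.
$doc delete
```

Unless `-keepEmpties` is given, tDOM drops whitespace-only text nodes at parse time, so the first child of `Top` here is the `Name` element rather than the line break before it.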

If you can't fix that, you might try using tDOM in HTML mode instead (which is much laxer about well-formedness constraints, though it also lower-cases all element and attribute names). Mind you, even then it still fails on your particular input document:

% package require tdom
0.8.3
% set doc [dom parse -html $XML]
error "Unterminated element 'group' (within 'member')" at position 114
">
    <group number=1>
    <member name='name1' test='test1' l=100/>
    </group> <--Error-- 
    </Top>"

Fixing your document is the #1 thing to do!
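If the documents come from a source you do not control, it may help to detect the problem early instead of letting the script abort. The following is only a sketch, assuming tDOM is available; it wraps the strict parse in `catch` and reports the parser's message. The variable names `result` and `failed` are illustrative.

```tcl
package require tdom

# The original document from the question, with unquoted attribute values.
set XML "
    <Top>
    <Name name='name' />
    <Group number=1>
    <Member name='name1' test='test1' l=100/>
    </Group>
    </Top>"

# catch returns non-zero when the parse raises an error;
# the error message (or the document handle) ends up in $result.
set failed [catch {dom parse $XML} result]

if {$failed} {
    puts stderr "Not well-formed XML: $result"
} else {
    set doc $result
    puts "Parsed OK, root element: [[$doc documentElement] nodeName]"
    $doc delete
}
```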

Donal Fellows
  • I once got an example where `<!` broke even tdom's `dom parse -html` – Johannes Kuhn Jul 16 '13 at 08:05
  • There's a limit to how much breakage can be vomited into a “document” before it becomes completely unparseable. [Except by REs.](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) – Donal Fellows Jul 16 '13 at 13:57

The problem is that you have to enclose the attribute values in " or '. After I fixed your XML, the parsing was successful.

I usually don't use the dom package; I use the tdom package instead.
The tdom package has a -html option that enables loose parsing.

Johannes Kuhn