Assuming an XML file with unknown structure (i.e., unknown element and attribute names), like

<RootElement>
   <Level 1 ...>
        <Level 2 ...>
            ...
        </Level 2>
        <Level 2 ...>
            ...
        </Level 2>
    </Level 1>
    <Level 1 ...>
        <Level 2 ...>
            ...
        </Level 2>
        <Level 2 ...>
            ...
        </Level 2>
    </Level 1>
</RootElement>

Is there any way using StAX to get the full raw text of each element?

At the very least, how can this be done for the first level? That is, in the above example (ignoring pretty printing), how can we read the following 2 strings into Java String variables:

"<Level 1 ...><Level 2...>...</Level 2></Level 1>"

and

"<Level 1 ...><Level 2...>...</Level 2></Level 1>"
Deduplicator
PNS

2 Answers

2

Use an XMLStreamReader and an XMLStreamWriter together to produce whatever raw XML you want. It might seem that some trick would give a simpler solution, but it won't: the XML has to be parsed, otherwise you are in deep water. And if you'd like to hack a parser, they are usually implemented with internal buffering, which makes it hairy work to cut up an incoming stream correctly.

Edit: Use the parsing pattern in this question to keep track of the level. To write, handle each event type from the input in its own way. Note that for start element events you can iterate over all the attributes and also the namespaces.
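Since a code example was asked for, here is a minimal sketch of the reader plus writer approach with level tracking. The class name FirstLevelFragments and the helpers split and copyStartElement are made up for illustration. It assumes the default StAX implementation bundled with the JDK, and that the root element is level 1, so its direct children are level 2. The result is re-serialized XML rather than the original bytes (for example, an empty `<a/>` would typically come back as `<a></a>`), and namespace declarations made on the root are not repeated in the fragments.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class, not part of any library.
final class FirstLevelFragments {

    /** Returns one re-serialized XML string per direct child of the root element. */
    static List<String> split(String xml) throws XMLStreamException {
        XMLStreamReader in = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
        List<String> fragments = new ArrayList<>();
        StringWriter buffer = null;
        XMLStreamWriter out = null;   // non-null only while inside a level-2 element
        int level = 0;                // root element is level 1

        while (in.hasNext()) {
            switch (in.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    level++;
                    if (level == 2) {             // a new first-level child starts
                        buffer = new StringWriter();
                        out = outFactory.createXMLStreamWriter(buffer);
                    }
                    if (out != null) copyStartElement(in, out);
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    if (out != null) {
                        out.writeEndElement();
                        if (level == 2) {         // the first-level child is complete
                            out.flush();
                            out.close();
                            fragments.add(buffer.toString());
                            out = null;
                        }
                    }
                    level--;
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.SPACE:
                    if (out != null) out.writeCharacters(in.getText());
                    break;
                case XMLStreamConstants.CDATA:
                    if (out != null) out.writeCData(in.getText());
                    break;
                case XMLStreamConstants.COMMENT:
                    if (out != null) out.writeComment(in.getText());
                    break;
                default:
                    break;                        // other event types are dropped in this sketch
            }
        }
        in.close();
        return fragments;
    }

    /** Copies the current start tag, with its namespace declarations and attributes. */
    private static void copyStartElement(XMLStreamReader in, XMLStreamWriter out)
            throws XMLStreamException {
        String ns = in.getNamespaceURI();
        String prefix = in.getPrefix() == null ? "" : in.getPrefix();
        if (ns == null || ns.isEmpty()) {
            out.writeStartElement(in.getLocalName());
        } else {
            out.writeStartElement(prefix, in.getLocalName(), ns);
        }
        for (int i = 0; i < in.getNamespaceCount(); i++) {
            String p = in.getNamespacePrefix(i);
            if (p == null || p.isEmpty()) {
                out.writeDefaultNamespace(in.getNamespaceURI(i));
            } else {
                out.writeNamespace(p, in.getNamespaceURI(i));
            }
        }
        for (int i = 0; i < in.getAttributeCount(); i++) {
            String ans = in.getAttributeNamespace(i);
            if (ans == null || ans.isEmpty()) {
                out.writeAttribute(in.getAttributeLocalName(i), in.getAttributeValue(i));
            } else {
                out.writeAttribute(in.getAttributePrefix(i), ans,
                        in.getAttributeLocalName(i), in.getAttributeValue(i));
            }
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><a x=\"1\"><b>hi</b></a><a x=\"2\"><b>yo</b></a></root>";
        for (String fragment : split(xml)) {
            System.out.println(fragment);
        }
    }
}
```

To get fragments at a deeper level, the two `level == 2` checks would be the place to change. Processing instructions and unresolved entity references would need their own cases if the input contains them.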

ThomasRS
  • I am guessing that this is the solution, but haven't managed to combine the 2 yet. Any code example would help. Thanks! – PNS Dec 04 '11 at 15:46
  • The example reads specific tags ("Whatever 1"), I need the raw XML text (with the markup). Also, it seems that XMLEventReader is more suitable. There seems to be no example code of anything like this around! – PNS Dec 05 '11 at 01:59
  • The example can be modified. Increment and decrement the level as in the example on start and end element events, add support for the other types of events too. – ThomasRS Dec 05 '11 at 08:04
  • It seems you do have to read and then write because of the XMLStreamReader API. It could, however, implement a readRaw() method that reads the entire text of an element, including its sub-elements. Unfortunately, no such method exists. – mike g Jun 23 '12 at 02:38
0

No. XMLStreamReader only lets you get the text content of a text-only XML element, via getElementText(). To get the full content you will have to read the file yourself, grab the elements, and reconstruct the XML.
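To illustrate the limitation, here is a small sketch (the class ElementTextDemo and the method textOf are made-up names). getElementText() returns the text of a text-only element, and throws an XMLStreamException when it runs into a child element:

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of any library.
final class ElementTextDemo {

    /**
     * Returns the text of the first element with the given local name,
     * or null if there is no such element. Throws XMLStreamException
     * if that element contains child elements.
     */
    static String textOf(String xml, String name) throws XMLStreamException {
        XMLStreamReader in = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (in.hasNext()) {
                if (in.next() == XMLStreamConstants.START_ELEMENT
                        && in.getLocalName().equals(name)) {
                    return in.getElementText();   // text-only elements only
                }
            }
            return null;
        } finally {
            in.close();
        }
    }
}
```

With the input `<r><a><b>hi</b></a></r>`, asking for `b` works, while asking for `a` fails because `a` has a child element.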

But maybe what you want to do is something else. Why don't you explain why you need this?

Maurício Linhares