2

I have to compare XML data. There are two sources:

  • Web Service
  • XML files

I don't see any easy way to transform both into the same classes and use the equals method.

The classes that work with the Web Service are auto-generated, and the WSDL isn't simple at all.

So I read the response from the Web Service, read the corresponding file, transform both to a String with the same formatting (spaces, \n and \r characters, and so on removed), and then use the String.equals() method.

The issue is that the Web Service writes empty tags this way:

<EmptyTag/>

but the provided files contain empty tags like this:

<EmptyTag></EmptyTag>

OK, I could fix all the provided files manually, but I don't like that approach. Does anyone know how to transform empty tags to the same style? Any ideas on how to simplify the process are welcome ;)

UPDATE

I don't parse the XML. The file's data is just read and transformed to the expected format. The object structure from the Web Service's response is transformed to an XML string in the following way:

    marshaller.marshal(new JAXBElement<response_class_name>(new QName("response_class_name"),
       response_class_name.class, response_object), stringWriter);
hakre
StKiller
  • You could do it with a regex replacement, but I'm not confident enough in my regex skills to post an answer. – Tony Casale May 09 '11 at 15:40
  • How are you parsing the XML? Every XML parsing library I've ever worked with handles this situation opaquely (meaning you never have to deal with it yourself). If you're parsing the XML yourself (with substring, indexof etc.), then there's your problem. – MusiGenesis May 09 '11 at 15:42
  • @MusiGenesis - please look at the updated question. – StKiller May 09 '11 at 15:48
  • If I understand your edit, you're not parsing the XML response at all - you're just checking the response as an entire string. Instead, you should use a parser to extract the inner contents of the `<EmptyTag>` element; this should correctly return "" regardless of how that tag is structured (`<EmptyTag/>` or `<EmptyTag></EmptyTag>`), or it will return null if the tag isn't even there. – MusiGenesis May 09 '11 at 16:00
  • Underscore-java library can read xml to map and generate xml from map. Self-closing tags are supported. – Valentyn Kolesnikov Oct 02 '20 at 01:54

6 Answers

5

For Java I would use XMLUnit to compare the files, as it compares xml files using their structure, not as strings (it may or may not ignore whitespace, depending on settings).
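XMLUnit is an external library, so as a rough JDK-only illustration of what "comparing by structure" means, the sketch below parses both inputs into DOM trees and compares them with isEqualNode. This is a stand-in technique, not XMLUnit's API; the class and method names are made up for the example, and unlike XMLUnit it has no option to ignore whitespace between elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: compares two XML strings as trees, not as text.
class XmlStructureCompare {

    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            doc.normalize(); // merge adjacent text nodes
            return doc;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    // True when both root elements have the same names, attributes and children.
    // <EmptyTag/> and <EmptyTag></EmptyTag> both parse to an element with no
    // children, so they compare as equal.
    static boolean sameStructure(String left, String right) {
        return parse(left).getDocumentElement()
                .isEqualNode(parse(right).getDocumentElement());
    }

    public static void main(String[] args) {
        System.out.println(sameStructure(
                "<r><EmptyTag/></r>", "<r><EmptyTag></EmptyTag></r>"));
    }
}
```

Whitespace-only text between elements still counts as a difference here, so inputs with different indentation would need to be stripped first.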

Kathy Van Stone
2

The program xmllint will do the trick:

    $ echo '<EmptyTag></EmptyTag>' | xmllint -
    <?xml version="1.0"?>
    <EmptyTag/>
ceving
2

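You could use Java's regular expressions (for example String.replaceAll) to replace all occurrences of "<([^\\s<>/]+)\\s*/>" with "<$1></$1>". Note that Java replacement strings refer to groups as $1, not \1, and that the character class must exclude '<' and '>' so a match cannot span several tags. This will expand the first form ("<EmptyTag/>") to the second form ("<EmptyTag></EmptyTag>") for tags without attributes.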
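A minimal sketch of this kind of regex replacement, assuming self-closing tags carry no attributes. The class and method names are illustrative only, and the replacement string uses Java's $1 syntax for group references.

```java
import java.util.regex.Pattern;

// Hypothetical helper: rewrites <Tag/> as <Tag></Tag>.
class EmptyTagExpander {

    // Group 1 is the element name. Excluding whitespace, '<', '>' and '/'
    // keeps the match inside a single tag.
    // Assumption: self-closing tags have no attributes.
    private static final Pattern SELF_CLOSING =
            Pattern.compile("<([^\\s<>/]+)\\s*/>");

    static String expand(String xml) {
        return SELF_CLOSING.matcher(xml).replaceAll("<$1></$1>");
    }

    public static void main(String[] args) {
        // prints <root><EmptyTag></EmptyTag><b>text</b></root>
        System.out.println(expand("<root><EmptyTag/><b>text</b></root>"));
    }
}
```

A regex like this knows nothing about comments or CDATA sections, so it is only reasonable for simple, predictable documents.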

NPE
2

You can replace "<(\\w+)([^>]*)?>\\s*</\\1>" with "<$1$2 />" beforehand.

Edit: or replace "<(\\w+)( [^/>]*)?/>" with "<$1$2></$1>" for the other way around ;)

ratchet freak
  • I need only empty tags. Searching regexp isn't right in this case. But the replacement part is working, thank you ;) – StKiller May 09 '11 at 16:02
1

There are two options:

  1. You can use something like XMLUnit to compare the documents and ensure that they are semantically equivalent.
  2. You can read both XML files in using the same parser and then write them back out to a string using the same serializer. The serializer should handle self-closing tags consistently.
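The second option can be sketched with the parser and serializer that ship with the JDK. The class and method names below are hypothetical. Because both inputs pass through the same serializer, the two empty-tag styles end up written the same way, whichever style that serializer happens to choose.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: round-trips XML through one parser and one serializer.
class XmlNormalizer {

    static String normalize(String xml) {
        try {
            // Parse the text into a DOM tree.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Serialize with an identity transformer, without the XML declaration.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse or serialize XML", e);
        }
    }

    public static void main(String[] args) {
        String fromService = normalize("<r><EmptyTag/></r>");
        String fromFile = normalize("<r><EmptyTag></EmptyTag></r>");
        System.out.println(fromService.equals(fromFile));
    }
}
```

This round trip does not remove whitespace between elements, so the existing whitespace stripping would still be needed before calling String.equals().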
Karthik Ramachandran
0

I would probably use XSLT to transform both XML files into the same format, but I don't know if that is the easiest way. There are probably editors that can do the formatting for you.

Kaj