0

We have a serious problem comparing two XML files. We need to find the differences between them based on a key value, both in the XML tags and in the values of those tags. The key values may appear at different positions in each file. For example, below are two XML files in which Id is the key for each student. In StudentDetails_one.xml, Id 111 appears first, but in StudentDetails_two.xml, Id 222 appears first.

StudentDetails_one.xml

<Student>
    <Id>111</Id>
    <Name>AAA</Name>
    <City>ABCD</City>
    <Dept>CS</Dept>
    <MobileNumber>11111</MobileNumber>
</Student>
<Student>
    <Id>222</Id>
    <Name>BBB</Name>
    <City>ABCD</City>
    <Dept>IT</Dept>
    <MobileNumber>22222</MobileNumber>
</Student>

StudentDetails_two.xml

<Student>
  <Id>222</Id>
  <Name>CCC</Name>
  <City>ABCD</City>
  <DEPT>IT</DEPT>
  <MobileNumber>22222</MobileNumber>
</Student>
<Student>
  <Id>111</Id>
  <Name>AAA</Name>
  <City>ABCD</City>
  <Dept>CS</Dept>
  <MobileNumber>11111</MobileNumber>
</Student>

Two differences can be seen between the two files above:

  1. Student Id 222 has a different Name value in the two files: BBB in the first file and CCC in the second.
  2. Student Id 222 has a differently cased tag: Dept in the first file and DEPT in the second.

Is there a tool or technique that can find these two differences?

Note: The above is just an example. Our real XML files have hundreds of tags, so it is really difficult to find the differences when the student Id positions differ between the two files.

  • To match tags, lowercase them first; if two tags match in lowercase, compare them in their original form. If they then differ, one is a misspelled duplicate. It's best to read the XML through a parser API into an object/dictionary/array structure so you can iterate over the records and find IDs whose TAG or VALUE differs. Shouldn't be that hard. What script/programming language do you prefer? –  Mar 22 '13 at 11:35
  • Hi Allendar, nice info, thanks. I generally use C++. – user2090833 Mar 22 '13 at 14:12
  • You could do some research on XML parsers for C++ (example: http://stackoverflow.com/questions/170686/best-open-xml-parser-for-c). What you mainly want is to read the XML from that parser into a multidimensional array. From that point on you can loop (with inner loops) through that array and check for conflicting matches. Based on those conflicts you could write a merged output (through the XML parser again) to a new XML file, plus a report of what has been found/changed/merged (for some human verification). –  Mar 22 '13 at 14:16
  • Saw the link and found some nice parsers. It's going to make my task easy. Thanks a lot again :) – user2090833 Mar 22 '13 at 14:36
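
The parser-based approach from the comments above could look roughly like the following. This is a minimal sketch in Python using the standard library's xml.etree.ElementTree, not a finished tool. It assumes the Student records are wrapped in a single root element (called Students here, which is not in the original files), that every record has an Id, and that no tag repeats inside a record. The function names index_by_key and diff_by_key are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample data modelled on the question; the <Students> root is an assumption.
ONE = """<Students>
  <Student><Id>111</Id><Name>AAA</Name><City>ABCD</City><Dept>CS</Dept></Student>
  <Student><Id>222</Id><Name>BBB</Name><City>ABCD</City><Dept>IT</Dept></Student>
</Students>"""

TWO = """<Students>
  <Student><Id>222</Id><Name>CCC</Name><City>ABCD</City><DEPT>IT</DEPT></Student>
  <Student><Id>111</Id><Name>AAA</Name><City>ABCD</City><Dept>CS</Dept></Student>
</Students>"""


def index_by_key(xml_text, record_tag="Student", key_tag="Id"):
    """Map each record's key value to a {tag: text} dict of its child elements."""
    root = ET.fromstring(xml_text)
    records = {}
    for rec in root.iter(record_tag):
        fields = {child.tag: (child.text or "").strip() for child in rec}
        records[fields[key_tag]] = fields  # raises KeyError if the key tag is missing
    return records


def diff_by_key(xml_a, xml_b):
    """Return human-readable differences, matching records on the key value."""
    a, b = index_by_key(xml_a), index_by_key(xml_b)
    report = []
    for key in sorted(a.keys() | b.keys()):
        if key not in a or key not in b:
            side = "first" if key in a else "second"
            report.append(f"Id {key}: only in {side} file")
            continue
        ra, rb = a[key], b[key]
        # Value differences for tags spelled identically in both files.
        for tag in sorted(ra.keys() & rb.keys()):
            if ra[tag] != rb[tag]:
                report.append(f"Id {key}: <{tag}> value {ra[tag]!r} vs {rb[tag]!r}")
        # Tags found on one side only; pair them up case-insensitively.
        only_a = {t.lower(): t for t in ra.keys() - rb.keys()}
        only_b = {t.lower(): t for t in rb.keys() - ra.keys()}
        for low in sorted(only_a.keys() | only_b.keys()):
            if low in only_a and low in only_b:
                report.append(f"Id {key}: tag <{only_a[low]}> vs <{only_b[low]}>")
            elif low in only_a:
                report.append(f"Id {key}: tag <{only_a[low]}> only in first file")
            else:
                report.append(f"Id {key}: tag <{only_b[low]}> only in second file")
    return report


for line in diff_by_key(ONE, TWO):
    print(line)
# Id 222: <Name> value 'BBB' vs 'CCC'
# Id 222: tag <Dept> vs <DEPT>
```

Because records are looked up by Id rather than by position, the order of the Student elements in either file does not matter. The same structure (a map from key to a map of tag to value) should carry over to C++ with std::map and any of the parsers from the linked question.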

3 Answers

0

There are many tools available for effective file/folder comparison.

Here are some of them:

(1) Araxis Merge

(2) Beyond Compare (I personally recommend this one)

Hope it helps.

Hiren Pandya
  • I'd agree with Beyond Compare - it has a rules-driven comparison that will take into account XML syntax, amongst other things. Saved my life many a time before! – Dan Puzey Mar 22 '13 at 11:37
  • Aren't these for pure mirror-matching with some parameters? What the OP needs is an XML parser with some code that compares the entries to one another. Maybe I'm wrong, but please show the OP how these tools handle XML comparison and reordered entries. -1 for the lack of explanation though, sorry. –  Mar 22 '13 at 11:37
  • Thanks for the response. I tried Araxis Merge, but the problem is that it doesn't show differences based on the key value, as in the example above. Please let me know if this facility is available in these tools. – user2090833 Mar 22 '13 at 11:39
  • @user2090833 it's exactly what I'm trying to say. You need to match the elements with an XML parser. You can't just "compare" XML files. It will be unreliable and have unwanted results. –  Mar 22 '13 at 11:40
  • Beyond Compare is a utility for comparing things. Things like text files, folders, zip archives, FTP sites, etc. Use it to manage source code, keep folders in sync, compare program output, and validate CD copies. Although there is support for automatic functions, the main goal of Beyond Compare is to help you analyze differences in detail, and carefully reconcile them. It commands a wide range of file and text operations. – Hiren Pandya Mar 22 '13 at 11:42
  • Can you please give an XML example for the OP, Hiren Pandya? The answer is too thin right now to help the OP. –  Mar 22 '13 at 11:43
  • I have tried comparing two XML files, and it shows the basic differences in syntax like the ones you stated in the example. Key-based comparison may or may not be available in the stated tools. I'll explore further and post if I find something relevant. – Hiren Pandya Mar 22 '13 at 11:44
  • See this.. Let me know whether I am correct in understanding the requirements or not..!! – Hiren Pandya Mar 22 '13 at 11:56
  • @Allendar yes, you are correct. I may need to use some XML parsing technique for this requirement. Thanks for the response :) – user2090833 Mar 22 '13 at 13:53
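
One way to keep using a line-based comparison despite the ordering problem discussed in these comments is to normalize both files first: sort the records by key, write one field per line, and only then diff. The sketch below does this in Python with the standard xml.etree.ElementTree and difflib modules. It is an illustration under assumptions, not a feature of Araxis Merge or Beyond Compare: the Students root element and the function names canonical_lines and keyed_text_diff are hypothetical, and keys are sorted as plain strings.

```python
import difflib
import xml.etree.ElementTree as ET

# Sample data modelled on the question; the <Students> root is an assumption.
ONE = """<Students>
  <Student><Id>111</Id><Name>AAA</Name><City>ABCD</City><Dept>CS</Dept></Student>
  <Student><Id>222</Id><Name>BBB</Name><City>ABCD</City><Dept>IT</Dept></Student>
</Students>"""

TWO = """<Students>
  <Student><Id>222</Id><Name>CCC</Name><City>ABCD</City><DEPT>IT</DEPT></Student>
  <Student><Id>111</Id><Name>AAA</Name><City>ABCD</City><Dept>CS</Dept></Student>
</Students>"""


def canonical_lines(xml_text, record_tag="Student", key_tag="Id"):
    """Return one line per field, with records sorted by their key value."""
    root = ET.fromstring(xml_text)
    # Records without a key sort first because findtext falls back to "".
    records = sorted(root.iter(record_tag), key=lambda r: r.findtext(key_tag, ""))
    lines = []
    for rec in records:
        lines.append(f"<{record_tag}>")
        for child in rec:
            lines.append(f"  <{child.tag}>{(child.text or '').strip()}</{child.tag}>")
        lines.append(f"</{record_tag}>")
    return lines


def keyed_text_diff(xml_a, xml_b):
    """Changed lines only, from a unified diff of the key-sorted documents."""
    diff = difflib.unified_diff(
        canonical_lines(xml_a), canonical_lines(xml_b),
        fromfile="one", tofile="two", lineterm="", n=0,
    )
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


for line in keyed_text_diff(ONE, TWO):
    print(line)
```

For the sample data this prints only the Name and Dept/DEPT lines for Id 222, since Id 111 is identical once both files are in the same order. The normalized lines could equally be written to two temporary files and opened in whichever visual diff tool is preferred.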
0

Here is the image

http://postimg.org/image/arf785kg3/

This is what I got when I tried the given example. Is this what you are looking for?

Hiren Pandya
  • No, this is not what I am looking for, because here the Dept for student Id 222 is the same, i.e. IT. It should use the Id as the key to find the differences. Hope it's clear now. – user2090833 Mar 22 '13 at 13:42
-1

If I understood what you want, there are a lot of tools. kdiff is a good one.

Tommi
  • Why are we comparing "files"? We have to compare XML, which can give a very different outcome from normal file comparison. -1 for the lack of explanation. –  Mar 22 '13 at 11:38
  • @Tommi, I think kdiff doesn't find differences based on a key; it just compares line by line. – user2090833 Mar 22 '13 at 13:52
  • Yep, @Allendar already pointed that out, so it looks like I didn't get what you want :) – Tommi Mar 22 '13 at 14:15