I have an XML file like this:
"HTTP/1.1 100 Continue
HTTP/1.1 200 OK
Expires: 0
Buffer: false
Pragma: No-cache
Cache-Control: no-cache
Server: Transaction_Server/4.1.0(zOS)
Connection: close
Content-Type: text/html
Content-Length: 33842
Date: Sat, 02 Aug 2014 09:27:02 GMT
<?xml version=""1.0"" encoding=""UTF-8""?>
<creditBureau xmlns=""http://www.transunion.com/namespace"" xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance"">
<document>response</document>
<version>2.9</version>
<transactionControl><userRefNumber>Credit Report Example</userRefNumber>
<subscriber><industryCode>Z</industryCode></subscriber></transactionControl>
This is just a part of the entire document. I want to convert it to JSON.
The problem is how to skip or delete the HTTP header part and start parsing at the actual XML, i.e. from the <document> tag.
There are more than a million such files, so I can't do it manually. How can I do this? Any help is appreciated.
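
One possible approach, as a minimal sketch in Python using only the standard library: find the XML declaration (<?xml) in the raw bytes, discard everything before it, parse the rest, and dump it as JSON. The function names (strip_http_headers, element_to_dict, response_to_json, convert_file) are hypothetical, made up for this illustration. The sketch assumes every file contains an XML declaration, that nothing follows the closing root tag, and that the doubled quotes in the excerpt above are a quoting artifact rather than what is really in the files. Attributes are ignored and namespaces are stripped from tag names, which may or may not suit the real data.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def strip_http_headers(raw: bytes) -> bytes:
    """Return the payload starting at the XML declaration (assumed present)."""
    start = raw.find(b"<?xml")
    if start == -1:
        raise ValueError("no XML declaration found")
    return raw[start:]


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.split("}")[-1]


def element_to_dict(elem):
    """Convert an element to nested dicts; leaf elements become their text."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    result = {}
    for child in children:
        key = local_name(child.tag)
        value = element_to_dict(child)
        if key in result:
            # Repeated tag names are collected into a list.
            if not isinstance(result[key], list):
                result[key] = [result[key]]
            result[key].append(value)
        else:
            result[key] = value
    return result


def response_to_json(raw: bytes) -> str:
    """Strip the HTTP headers, parse the XML, and return a JSON string."""
    root = ET.fromstring(strip_http_headers(raw))
    return json.dumps({local_name(root.tag): element_to_dict(root)}, indent=2)


def convert_file(path: Path) -> None:
    """Write '<name>.json' next to the input file (hypothetical layout)."""
    path.with_suffix(".json").write_text(response_to_json(path.read_bytes()))
```

To process a whole directory, something like `for p in Path("responses").glob("*.xml"): convert_file(p)` would do, where the directory name and file extension are placeholders. With a million files it is probably worth wrapping the call in try/except to log the files that fail to parse instead of stopping, and spreading the work over several processes if it is too slow.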