I'm writing to find out whether any existing tools or code address the problem and solution described below.

Requirement/Expectation: Parse 30+ different XML log files, each with a different data structure, and populate one generic format from them.

Example(XML DATA):

XML1: <xml>
        <name>abcd</name>
        <mark>10</mark>
        <employee_org_name>org1</employee_org_name>
        <employee_payroll>true</employee_payroll>
      </xml>

XML2: <xml>
        <employee_name>deft</employee_name>
        <score>10</score>
        <org>
            <name>org2</name>
            <payroll>false</payroll>
        </org>
      </xml>

XML3: <xml>
        <org name="org1">
            <employee>
                <name>ryan</name>
                <score>10</score>
            </employee>
            <name>org3</name>
            <payroll>true</payroll>
        </org>
      </xml>

Configuration files for the XML formats look like this:

XML1 settings file:
    Employee_name: /name
    Employee_mark: /mark
    Employee_org: /employee_org_name
    Employee_org_Payroll: /employee_payroll
    Employee_extras: anything

XML2 settings file:
    Employee_name: /employee_name
    Employee_mark: /score
    Employee_org: /org/name
    Employee_org_Payroll: /org/payroll
    Employee_extras: anything

XML3 settings file: 
    Employee_name: /org/employee/name
    Employee_mark: /org/employee/score
    Employee_org: /org/name
    Employee_org_Payroll: /org/payroll
    Employee_extras: anything
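
To illustrate how such a settings file could drive the extraction, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. It is not taken from any existing tool. The names `XML2_SETTINGS` and `extract` are hypothetical, and the settings are written as a dict rather than loaded from a file. It also assumes every path points at an element (no attributes) and that `Employee_extras` is handled separately.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory copy of the XML2 settings: generic field -> element path.
XML2_SETTINGS = {
    "Employee_name": "/employee_name",
    "Employee_mark": "/score",
    "Employee_org": "/org/name",
    "Employee_org_Payroll": "/org/payroll",
}

XML2 = """<xml>
  <employee_name>deft</employee_name>
  <score>10</score>
  <org>
    <name>org2</name>
    <payroll>false</payroll>
  </org>
</xml>"""


def extract(xml_text, settings):
    """Return {generic_field: element text or None} for one XML document."""
    root = ET.fromstring(xml_text)
    record = {}
    for field, path in settings.items():
        # ElementTree does not accept absolute paths on an element, so the
        # leading "/" is turned into "./" (relative to the root element).
        node = root.find("." + path)
        has_text = node is not None and node.text
        record[field] = node.text.strip() if has_text else None
    return record


record = extract(XML2, XML2_SETTINGS)
print(record)
```

With this shape, supporting another XML layout would only require another settings mapping, not another parser class.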

Example(JSON):

JSON1: {"name": "abcd", "mark": "10", "employee_org_name": "org1", "employee_payroll": "true"}

JSON2: {"employee_name": "deft", "score": "10", "details": {"name": "org1", "payroll": "true"}}

JSON3: {"org1": {"employee": {"name": "ryan", "points": "10"}, "name": "org1", "payroll": "true"}}

Note: In the same way, we can have a configuration/settings file for each JSON format.
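
As a rough sketch of what a JSON settings file might look like in use, the Python below walks slash-separated key paths through a parsed JSON document. The settings in `JSON2_SETTINGS` are an assumption modelled on the XML settings files (the question does not give them), and `lookup` and `extract_json` are hypothetical helper names.

```python
import json

# Assumed settings for the JSON2 layout, in the same style as the XML settings.
JSON2_SETTINGS = {
    "Employee_name": "/employee_name",
    "Employee_mark": "/score",
    "Employee_org": "/details/name",
    "Employee_org_Payroll": "/details/payroll",
}

JSON2 = (
    '{"employee_name": "deft", "score": "10", '
    '"details": {"name": "org1", "payroll": "true"}}'
)


def lookup(document, path):
    """Follow a slash-separated key path; return None if any key is missing."""
    current = document
    for key in path.strip("/").split("/"):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def extract_json(json_text, settings):
    """Return {generic_field: value or None} for one JSON document."""
    document = json.loads(json_text)
    return {field: lookup(document, path) for field, path in settings.items()}


json_record = extract_json(JSON2, JSON2_SETTINGS)
print(json_record)
```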

Output (should be in a generic format and stored as JSON only):

Format: {Employee_name: str, Employee_mark: int, Employee_org: str, Employee_org_Payroll: boolean, Employee_extras: Object/Array}

Data:

[
        {"Employee_name": "abcd", "Employee_mark": 10, "Employee_org": "org1", "Employee_org_Payroll": true, "Employee_extras": null},
        {"Employee_name": "deft", "Employee_mark": 10, "Employee_org": "org2", "Employee_org_Payroll": false, "Employee_extras": null},
        {"Employee_name": "ryan", "Employee_mark": 10, "Employee_org": "org3", "Employee_org_Payroll": true, "Employee_extras": null}
]
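
Values read from XML arrive as text, so something has to convert them to the types in the generic format before the JSON is written. The following Python is one possible sketch of that step. `CONVERTERS`, `to_bool`, and `normalize` are hypothetical names, and it assumes payroll values are always the strings "true" or "false".

```python
import json


def to_bool(text):
    """Convert "true"/"false" (any case) to a bool; reject anything else."""
    value = text.strip().lower()
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"not a boolean: {text!r}")


# Target type of each generic field, following the Format line above.
CONVERTERS = {
    "Employee_name": str,
    "Employee_mark": int,
    "Employee_org": str,
    "Employee_org_Payroll": to_bool,
}


def normalize(raw):
    """Turn a record of raw strings into a typed generic record."""
    record = {}
    for field, convert in CONVERTERS.items():
        value = raw.get(field)
        record[field] = convert(value) if value is not None else None
    # Extras are passed through unchanged (an object, an array, or None).
    record["Employee_extras"] = raw.get("Employee_extras")
    return record


raw_records = [
    {"Employee_name": "abcd", "Employee_mark": "10",
     "Employee_org": "org1", "Employee_org_Payroll": "true"},
    {"Employee_name": "deft", "Employee_mark": "10",
     "Employee_org": "org2", "Employee_org_Payroll": "false"},
]

output = json.dumps([normalize(raw) for raw in raw_records], indent=2)
print(output)
```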

Simple Solution: Write a dedicated class or method for each XML format (1, 2, 3, ...) and hard-code the node paths in it.

(Myself) Expected Solution: Write a generic parser. When the parser is triggered, it loads an XML file, reads it and identifies its format (1/2/3/etc.), loads the configuration file for that format, processes the file, and writes the result in the generic/common output format.
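
One way the "identify the format, then load its configuration" step could work is to try each known settings mapping and pick the first one whose paths all resolve in the document. The Python sketch below shows that idea only; it is not how any particular library does it. `SETTINGS`, `detect_format`, and `parse` are hypothetical names, the settings are kept in memory instead of in files, and the XML3 paths for the organisation name and payroll are assumed to be `/org/name` and `/org/payroll`.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: format name -> {generic field -> element path}.
SETTINGS = {
    "XML1": {
        "Employee_name": "/name",
        "Employee_mark": "/mark",
        "Employee_org": "/employee_org_name",
        "Employee_org_Payroll": "/employee_payroll",
    },
    "XML2": {
        "Employee_name": "/employee_name",
        "Employee_mark": "/score",
        "Employee_org": "/org/name",
        "Employee_org_Payroll": "/org/payroll",
    },
    "XML3": {  # assumed paths, see the note above
        "Employee_name": "/org/employee/name",
        "Employee_mark": "/org/employee/score",
        "Employee_org": "/org/name",
        "Employee_org_Payroll": "/org/payroll",
    },
}


def detect_format(root):
    """Return the first format whose paths all exist in the document, else None."""
    for name, settings in SETTINGS.items():
        if all(root.find("." + path) is not None for path in settings.values()):
            return name
    return None


def parse(xml_text):
    """Detect the format of one document and extract the generic fields as text."""
    root = ET.fromstring(xml_text)
    name = detect_format(root)
    if name is None:
        raise ValueError("no settings match this document")
    return {field: root.findtext("." + path) for field, path in SETTINGS[name].items()}


XML3 = """<xml><org name="org1">
  <employee><name>ryan</name><score>10</score></employee>
  <name>org3</name><payroll>true</payroll>
</org></xml>"""

print(parse(XML3))
```

Detection by matching paths only works while no two formats satisfy the same set of paths. With 30+ formats, an explicit marker such as the file name or a distinctive root child may be a more reliable key.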

Please let me know if my question is unclear or if you need more information. I'm here!

Thank you in advance!

1 Answer

Finally, I found this code, which mostly solves my problem. The only thing left is to work out whether to use a for loop or a HashMap.

https://github.com/niteshapte/generic-xml-parser