You can do this with standard tools:
- Use the tool xjc from your JDK to generate Java classes from the schema.
  Since Java 9 you must explicitly add JAXB as a module with --add-modules java.se.ee
  See: How to resolve java.lang.NoClassDefFoundError: javax/xml/bind/JAXBException in Java 9
  Since Java 11 you have to download xjc in an extra step from https://javaee.github.io/jaxb-v2/
- Read the document in as XML and write it out as JSON using Jackson.
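If you are not sure whether JAXB is visible in your runtime at all (the Java 9+ caveat above), a quick check like the following can help. This is only a minimal sketch: the class name JaxbAvailabilityCheck and the helper isClassAvailable are made up for illustration, and it only tests for the class named in the linked NoClassDefFoundError question.

```java
public class JaxbAvailabilityCheck {

    // Hypothetical helper: true if the named class can be loaded by this class's loader.
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers
            Class.forName(className, false, JaxbAvailabilityCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String jaxbClass = "javax.xml.bind.JAXBException";
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println(jaxbClass + " available: " + isClassAvailable(jaxbClass));
    }
}
```

If this reports false, the code further down will most likely fail with the NoClassDefFoundError mentioned above until JAXB is added as a module or as a dependency.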
Example
With https://schema.datacite.org/meta/kernel-4.1/metadata.xsd
1. Use the tool xjc from your JDK
In my example I will use a fairly complex schema, based on the DataCite schemas.
/path/to/jdk/bin/xjc -d /path/to/java/project \
-p stack24174963.datacite \
https://schema.datacite.org/meta/kernel-4.1/metadata.xsd
This will print:
parsing a schema...
compiling a schema...
stack24174963/datacite/Box.java
stack24174963/datacite/ContributorType.java
stack24174963/datacite/DateType.java
stack24174963/datacite/DescriptionType.java
stack24174963/datacite/FunderIdentifierType.java
stack24174963/datacite/NameType.java
stack24174963/datacite/ObjectFactory.java
stack24174963/datacite/Point.java
stack24174963/datacite/RelatedIdentifierType.java
stack24174963/datacite/RelationType.java
stack24174963/datacite/Resource.java
stack24174963/datacite/ResourceType.java
stack24174963/datacite/TitleType.java
stack24174963/datacite/package-info.java
2. Read in as XML and write out as JSON using Jackson
import java.io.InputStream;
import java.io.StringWriter;

import javax.xml.bind.JAXB;

import org.junit.Test; // JUnit 4 assumed; the test framework is not stated here

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;

import stack24174963.datacite.Resource;

public class HowToXmlToJsonWithSchema {

    @Test
    public void readXmlAndConvertToSchema() throws Exception {
        String example = "schemas/datacite/kernel-4.1/example/datacite-example-complicated-v4.1.xml";
        // Unmarshal the XML from the classpath into the xjc-generated Resource class
        try (InputStream in = Thread.currentThread().getContextClassLoader().getResourceAsStream(example)) {
            Resource resource = JAXB.unmarshal(in, Resource.class);
            System.out.println(asJson(resource));
        }
    }

    // Serialize any object to pretty-printed JSON with Jackson
    private String asJson(Object obj) throws Exception {
        StringWriter w = new StringWriter();
        new ObjectMapper().configure(SerializationFeature.INDENT_OUTPUT, true).writeValue(w, obj);
        return w.toString();
    }
}
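One thing to watch in the test above: getResourceAsStream returns null when the resource is not on the classpath, and that null is then handed straight to JAXB.unmarshal. A small guard makes the failure easier to read. This is just a sketch; ClasspathResources and open are hypothetical names, not part of JAXB or Jackson.

```java
import java.io.FileNotFoundException;
import java.io.InputStream;

public class ClasspathResources {

    // Hypothetical helper: opens a classpath resource or fails with a clear message.
    static InputStream open(String path) throws FileNotFoundException {
        InputStream in = Thread.currentThread().getContextClassLoader().getResourceAsStream(path);
        if (in == null) {
            // getResourceAsStream signals "not found" with null instead of an exception
            throw new FileNotFoundException("Classpath resource not found: " + path);
        }
        return in;
    }

    public static void main(String[] args) throws Exception {
        // Deliberately missing resource, to show the failure mode
        try (InputStream in = open("no/such/file.xml")) {
            System.out.println("found, " + in.available() + " bytes readable");
        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the test you would then write try (InputStream in = ClasspathResources.open(example)) and leave the rest unchanged.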
Prints:
{
"identifier" : {
"value" : "10.5072/testpub",
"identifierType" : "DOI"
},
"creators" : {
"creator" : [ {
"creatorName" : {
"value" : "Smith, John",
"nameType" : "PERSONAL"
},
"givenName" : "<?xml version=\"1.0\" encoding=\"UTF-16\"?>\n<givenName xmlns=\"http://datacite.org/schema/kernel-4\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">John</givenName>",
"familyName" : "<?xml version=\"1.0\" encoding=\"UTF-16\"?>\n<familyName xmlns=\"http://datacite.org/schema/kernel-4\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">Smith</familyName>",
"nameIdentifier" : [ ],
"affiliation" : [ ]
}, {
"creatorName" : {
"value" : "つまらないものですが",
"nameType" : null
},
"givenName" : null,
"familyName" : null,
"nameIdentifier" : [ {
"value" : "0000000134596520",
"nameIdentifierScheme" : "ISNI",
"schemeURI" : "http://isni.org/isni/"
} ],
"affiliation" : [ ]
} ]
},
"titles" : {
"title" : [ {
"value" : "Właściwości rzutowań podprzestrzeniowych",
"titleType" : null,
"lang" : "pl"
}, {
"value" : "Translation of Polish titles",
"titleType" : "TRANSLATED_TITLE",
"lang" : "en"
} ]
},
"publisher" : "Springer",
"publicationYear" : "2010",
"resourceType" : {
"value" : "Monograph",
"resourceTypeGeneral" : "TEXT"
},
"subjects" : {
"subject" : [ {
"value" : "830 German & related literatures",
"subjectScheme" : "DDC",
"schemeURI" : null,
"valueURI" : null,
"lang" : "en"
}, {
"value" : "Polish Literature",
"subjectScheme" : null,
"schemeURI" : null,
"valueURI" : null,
"lang" : "en"
} ]
},
"contributors" : {
"contributor" : [ {
"contributorName" : {
"value" : "Doe, John",
"nameType" : "PERSONAL"
},
"givenName" : "<?xml version=\"1.0\" encoding=\"UTF-16\"?>\n<givenName xmlns=\"http://datacite.org/schema/kernel-4\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">John</givenName>",
"familyName" : "<?xml version=\"1.0\" encoding=\"UTF-16\"?>\n<familyName xmlns=\"http://datacite.org/schema/kernel-4\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">Doe</familyName>",
"nameIdentifier" : [ {
"value" : "0000-0001-5393-1421",
"nameIdentifierScheme" : "ORCID",
"schemeURI" : "http://orcid.org/"
} ],
"affiliation" : [ ],
"contributorType" : "DATA_COLLECTOR"
} ]
},
"dates" : null,
"language" : "de",
"alternateIdentifiers" : {
"alternateIdentifier" : [ {
"value" : "937-0-4523-12357-6",
"alternateIdentifierType" : "ISBN"
} ]
},
"relatedIdentifiers" : {
"relatedIdentifier" : [ {
"value" : "10.5272/oldertestpub",
"resourceTypeGeneral" : null,
"relatedIdentifierType" : "DOI",
"relationType" : "IS_PART_OF",
"relatedMetadataScheme" : null,
"schemeURI" : null,
"schemeType" : null
} ]
},
"sizes" : {
"size" : [ "256 pages" ]
},
"formats" : {
"format" : [ "pdf" ]
},
"version" : "2",
"rightsList" : {
"rights" : [ {
"value" : "Creative Commons Attribution-NoDerivs 2.0 Generic",
"rightsURI" : "http://creativecommons.org/licenses/by-nd/2.0/",
"lang" : null
} ]
},
"descriptions" : {
"description" : [ {
"content" : [ "\n Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea\n takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores\n et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.\n " ],
"descriptionType" : "ABSTRACT",
"lang" : "la"
} ]
},
"geoLocations" : null,
"fundingReferences" : null
}
For this example input XML:
<?xml version="1.0" encoding="UTF-8"?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
<identifier identifierType="DOI">10.5072/testpub</identifier>
<creators>
<creator>
<creatorName nameType="Personal">Smith, John</creatorName>
<givenName>John</givenName>
<familyName>Smith</familyName>
</creator>
<creator>
<creatorName>つまらないものですが</creatorName>
<nameIdentifier nameIdentifierScheme="ISNI" schemeURI="http://isni.org/isni/">0000000134596520</nameIdentifier>
</creator>
</creators>
<titles>
<title xml:lang="pl">Właściwości rzutowań podprzestrzeniowych</title>
<title xml:lang="en" titleType="TranslatedTitle">Translation of Polish titles</title>
</titles>
<publisher>Springer</publisher>
<publicationYear>2010</publicationYear>
<subjects>
<subject xml:lang="en" subjectScheme="DDC">830 German &amp; related literatures</subject>
<subject xml:lang="en">Polish Literature</subject>
</subjects>
<contributors>
<contributor contributorType="DataCollector">
<contributorName nameType="Personal">Doe, John</contributorName>
<givenName>John</givenName>
<familyName>Doe</familyName>
<nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0001-5393-1421</nameIdentifier>
</contributor>
</contributors>
<language>de</language>
<resourceType resourceTypeGeneral="Text">Monograph</resourceType>
<alternateIdentifiers>
<alternateIdentifier alternateIdentifierType="ISBN">937-0-4523-12357-6</alternateIdentifier>
</alternateIdentifiers>
<relatedIdentifiers>
<relatedIdentifier relatedIdentifierType="DOI" relationType="IsPartOf">10.5272/oldertestpub</relatedIdentifier>
</relatedIdentifiers>
<sizes>
<size>256 pages</size>
</sizes>
<formats>
<format>pdf</format>
</formats>
<version>2</version>
<rightsList>
<rights rightsURI="http://creativecommons.org/licenses/by-nd/2.0/">Creative Commons Attribution-NoDerivs 2.0 Generic</rights>
</rightsList>
<descriptions>
<description xml:lang="la" descriptionType="Abstract">
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea
takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores
et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
</description>
</descriptions>
</resource>
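You may notice that the JSON contains "PERSONAL" and "TRANSLATED_TITLE" where the XML has "Personal" and "TranslatedTitle". As far as I know this is because xjc maps enumerated schema values to Java enums whose constant names are upper-cased while the XML literal is kept behind a value() method, and Jackson by default writes the constant name. The sketch below is a simplified stand-in, not the generated code: SketchType is a hypothetical enum that mixes two literals from the example, and the JAXB annotations are left out.

```java
public class EnumNameVsValue {

    // Simplified, hypothetical stand-in for an xjc-style enum (annotations omitted).
    enum SketchType {
        PERSONAL("Personal"),
        TRANSLATED_TITLE("TranslatedTitle");

        private final String value;

        SketchType(String value) {
            this.value = value;
        }

        // The literal as it appears in the XML document
        String value() {
            return value;
        }

        // Reverse lookup from the XML literal to the enum constant
        static SketchType fromValue(String v) {
            for (SketchType c : values()) {
                if (c.value.equals(v)) {
                    return c;
                }
            }
            throw new IllegalArgumentException(v);
        }
    }

    public static void main(String[] args) {
        SketchType t = SketchType.fromValue("TranslatedTitle");
        System.out.println(t.name());  // constant name, what shows up in the JSON above
        System.out.println(t.value()); // literal used in the XML
    }
}
```

If you need the XML literals in the JSON instead, you would have to configure Jackson accordingly or post-process the generated classes; that is outside what this example covers.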