
I have several book references that must be converted to XML.
I want to write a Java application to do this.

Book reference:

 Schulz V, Hansel R, Tyler VE. Rational phytotherapy: a physician's guide to herbal   
 medicine. 3rd ed., fully rev. and expand. Berlin: Springer; c1998. 306 p.


XML:

<element-citation publication-type="book" publication-format="print">
    <name>
        <surname>Schulz</surname>
        <given-names>V</given-names>
    </name>
    <name>
        <surname>Hansel</surname>
        <given-names>R</given-names>
    </name>
    <name>
        <surname>Tyler</surname>
        <given-names>VE</given-names>
    </name>
    <source>Rational phytotherapy: a physician's guide to herbal medicine</source>
    <edition>3rd ed., fully rev. and expand</edition>
    <publisher-loc>Berlin</publisher-loc>
    <publisher-name>Springer</publisher-name>
    <year>c1998</year>
    <size units="page">306 p</size>
</element-citation>


How can I convert a book reference to this XML format?
What do you suggest?

informatik01
user1874800
  • What is the structure of such a reference? Does it have fields and an order of these fields? How can the fields be recognized? –  Jul 05 '13 at 12:45
  • It's not clear what a book reference is. – vishnu viswanath Jul 05 '13 at 12:48
  • Personally I think that the near-unstructured nature of the input references will prove your biggest challenge; converting it to XML in Java, e.g. with JAXB, is quite easy to do. – fvu Jul 05 '13 at 12:49
  • Check out this answer: [Mapping XML to an object in Java](http://stackoverflow.com/a/16755094/814702). There are also other popular Object to XML binding frameworks, like [Castor](http://castor.codehaus.org/index.html), for example. – informatik01 Jul 05 '13 at 12:53

2 Answers

For example, use JAXB.

  1. Get an XSD for your desired XML format.
  2. Generate Java classes from the XSD - see how here.
  3. Implement a simple program that will parse your input file and build a tree with the help of the generated classes. This may be trivial or very difficult depending on your input.
  4. Serialize the result - see how here.

EDIT: As hinted by Joop Eggen, you may also use annotations instead of steps 1-3. That may make things even simpler. See how here.
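
Steps 3 and 4 (build a tree, then serialize it) can be illustrated roughly as follows. Note that this is not JAXB: the `javax.xml.bind` package is no longer bundled with the JDK since Java 11, so this sketch swaps in the JDK's built-in DOM and `Transformer` APIs instead. The class name `CitationDomSketch` and the helper `textElement` are made up for the example, and the authors and title are assumed to be already parsed out of the reference.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class, for illustration only (DOM instead of JAXB).
final class CitationDomSketch {

    /** Builds the citation tree from already-parsed fields and returns it as XML text. */
    static String toXml(String[][] authors, String title) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

            Element citation = doc.createElement("element-citation");
            citation.setAttribute("publication-type", "book");
            citation.setAttribute("publication-format", "print");
            doc.appendChild(citation);

            // Each author is a {surname, given names} pair.
            for (String[] author : authors) {
                Element name = doc.createElement("name");
                name.appendChild(textElement(doc, "surname", author[0]));
                name.appendChild(textElement(doc, "given-names", author[1]));
                citation.appendChild(name);
            }
            citation.appendChild(textElement(doc, "source", title));

            // Serialize the tree to a string.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException ex) {
            throw new IllegalStateException("Could not build or serialize the citation", ex);
        }
    }

    // Creates <tag>text</tag>; the serializer escapes & and < in the text.
    private static Element textElement(Document doc, String tag, String text) {
        Element element = doc.createElement(tag);
        element.setTextContent(text);
        return element;
    }

    public static void main(String[] args) {
        String[][] authors = { {"Schulz", "V"}, {"Hansel", "R"}, {"Tyler", "VE"} };
        System.out.println(toXml(authors,
                "Rational phytotherapy: a physician's guide to herbal medicine"));
    }
}
```

The hard part, as noted in the comments on the question, is still turning the free-text reference into those fields; the tree building and serialization are mechanical by comparison.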

Lachezar Balev

Since you may not be experienced in Java, here is a dull but simple solution (Java 7):

  • writing the XML as text;
  • parsing with String.split(regex) (Scanner would do too).

Note that the special characters < > & " ' in the reference text may need to be replaced by &lt; &gt; &amp; &quot; &apos; respectively.
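
The replacement just described could be done with a small helper along these lines. This is only a sketch: `XmlEscape` and `escape` are names invented for the example, not something from the JDK.

```java
// Hypothetical helper, for illustration only: escapes the five XML special
// characters so that text can be placed inside hand-written XML.
final class XmlEscape {

    private XmlEscape() { }

    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: Rational phytotherapy: a physician&apos;s guide
        System.out.println(escape("Rational phytotherapy: a physician's guide"));
        // prints: Smith &amp; Sons &lt;2nd ed.&gt;
        System.out.println(escape("Smith & Sons <2nd ed.>"));
    }
}
```

Each value taken from the reference would be passed through `escape` before being concatenated into the XML text.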

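// Imports needed by the snippet (the statements below belong inside a method of class Test).
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;
import java.util.logging.Level;
import java.util.logging.Logger;

String bookRef = "Schulz V, Hansel R, Tyler VE. Rational phytotherapy: a physician's guide to herbal "
        + "medicine. 3rd ed., fully rev. and expand. Berlin: Springer; c1998. 306 p.";

File file = new File("D:/dev/xml-part.txt");
final String TAB = "    ";
try (PrintWriter out = new PrintWriter(new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file), "UTF-8")))) {
    out.println(TAB + "<element-citation publication-type=\"book\" publication-format=\"print\">");

    // Split on periods; note that this also splits after abbreviations such as "ed." and "rev.".
    String[] lines = bookRef.split("\\.\\s*");

    // The first part holds the comma-separated authors, e.g. "Schulz V".
    String names = lines[0];
    String[] nameArray = names.split(",\\s*");
    for (String name : nameArray) {
        // Surname first, then the given-name initials (may be absent).
        String[] nameParts = name.split(" +", 2);
        out.println(TAB + TAB + "<name>");
        out.println(TAB + TAB + TAB + "<surname>" + nameParts[0] + "</surname>");
        out.println(TAB + TAB + TAB + "<given-names>" + (nameParts.length > 1 ? nameParts[1] : "") + "</given-names>");
        out.println(TAB + TAB + "</name>");
    }
    out.println(TAB + TAB + "<source>" + lines[1] + "</source>");
    ...

    out.println(TAB + "</element-citation>");
} catch (FileNotFoundException | UnsupportedEncodingException ex) {
    Logger.getLogger(Test.class.getName()).log(Level.SEVERE, null, ex);
}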
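
Splitting on every period breaks the edition field apart ("3rd ed., fully rev. and expand" contains periods of its own), so the remaining fields are awkward to pick out of the `lines` array. One possible alternative, sketched below, is a single regular expression with named groups (available since Java 7). It assumes every reference follows the layout "Authors. Title. Edition. Place: Publisher; Year. Pages." and that the title contains no period; both are assumptions taken from the one sample in the question. `BookRefSketch` and its helper methods are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class, for illustration only.
final class BookRefSketch {

    // Assumed layout: "Authors. Title. Edition. Place: Publisher; Year. Pages."
    private static final Pattern BOOK_REF = Pattern.compile(
            "(?<authors>[^.]+)\\.\\s+"
          + "(?<title>[^.]+)\\.\\s+"
          + "(?<edition>.+)\\.\\s+"
          + "(?<place>[^:.]+):\\s*"
          + "(?<publisher>[^;]+);\\s*"
          + "(?<year>[^.]+)\\.\\s+"
          + "(?<pages>[^.]+)\\.?");

    static String toXml(String bookRef) {
        // Collapse line breaks and runs of spaces before matching.
        String normalized = bookRef.trim().replaceAll("\\s+", " ");
        Matcher m = BOOK_REF.matcher(normalized);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized reference layout: " + bookRef);
        }
        StringBuilder xml = new StringBuilder();
        xml.append("<element-citation publication-type=\"book\" publication-format=\"print\">\n");
        for (String author : m.group("authors").split(",\\s*")) {
            // Surname first, then the given-name initials (may be absent).
            String[] parts = author.trim().split(" +", 2);
            xml.append("    <name>\n");
            xml.append("    ").append(element("surname", parts[0]));
            xml.append("    ").append(element("given-names", parts.length > 1 ? parts[1] : ""));
            xml.append("    </name>\n");
        }
        xml.append(element("source", m.group("title")));
        xml.append(element("edition", m.group("edition")));
        xml.append(element("publisher-loc", m.group("place")));
        xml.append(element("publisher-name", m.group("publisher")));
        xml.append(element("year", m.group("year")));
        xml.append("    <size units=\"page\">").append(escape(m.group("pages"))).append("</size>\n");
        xml.append("</element-citation>\n");
        return xml.toString();
    }

    // Writes one indented <tag>text</tag> line with the text escaped.
    private static String element(String tag, String text) {
        return "    <" + tag + ">" + escape(text.trim()) + "</" + tag + ">\n";
    }

    // Replaces the five XML special characters; '&' must be handled first.
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                   .replace("\"", "&quot;").replace("'", "&apos;");
    }

    public static void main(String[] args) {
        System.out.println(toXml(
                "Schulz V, Hansel R, Tyler VE. Rational phytotherapy: a physician's guide to herbal "
              + "medicine. 3rd ed., fully rev. and expand. Berlin: Springer; c1998. 306 p."));
    }
}
```

References that deviate from the assumed layout are rejected with an exception rather than converted wrongly, which makes it easier to collect the odd cases and handle them separately.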
Joop Eggen