
Using xsdata and a JSON input file, I generated model classes. I pass the JSON to a method that converts it into a Models instance. The code for each file is below.

sample.xml

<?xml version="1.0"?>
<catalog>
    <books>
        <book id="bk101">
            <author>Gambardella, Matthew</author>
            <title>XML Developer's Guide</title>
            <genre>Computer</genre>
            <price>44.95</price>
            <publish_date>2000-10-01</publish_date>
            <description>An in-depth look at creating applications 
      with XML.</description>
        </book>
        <book id="bk101">
            <author>Gambardella, Matthew</author>
            <title>XML Developer's Guide</title>
            <genre>Computer</genre>
            <price>44.95</price>
            <publish_date>2000-10-01</publish_date>
            <description>An in-depth look at creating applications 
      with XML.</description>
        </book>
    </books>
</catalog>

The XML file converted to JSON:

sample.json

{
    "catalog": {
        "books": {
            "book": [
                {
                    "@id": "bk101",
                    "author": "Gambardella, Matthew",
                    "title": "XML Developer's Guide",
                    "genre": "Computer",
                    "price": "44.95",
                    "publish_date": "2000-10-01",
                    "description": "An in-depth look at creating applications \n      with XML."
                },
                {
                    "@id": "bk101",
                    "author": "Gambardella, Matthew",
                    "title": "XML Developer's Guide",
                    "genre": "Computer",
                    "price": "44.95",
                    "publish_date": "2000-10-01",
                    "description": "An in-depth look at creating applications \n      with XML."
                }
            ]
        }
    }
}

models.py

from dataclasses import dataclass, field
from typing import List, Optional
from xsdata.models.datatype import XmlDate


@dataclass
class Book:
    class Meta:
        name = "book"

    id: Optional[str] = field(
        default=None,
        metadata={
            "name": "@id",
            "type": "Element",
        }
    )
    author: Optional[str] = field(
        default=None,
        metadata={
            "type": "Element",
        }
    )
    title: Optional[str] = field(
        default=None,
        metadata={
            "type": "Element",
        }
    )
    genre: Optional[str] = field(
        default=None,
        metadata={
            "type": "Element",
        }
    )
    price: Optional[float] = field(
        default=None,
        metadata={
            "type": "Element",
        }
    )
    publish_date: Optional[XmlDate] = field(
        default=None,
        metadata={
            "type": "Element",
        }
    )
    description: Optional[str] = field(
        default=None,
        metadata={
            "type": "Element",
        }
    )


@dataclass
class Books:
    class Meta:
        name = "books"

    book: List[Book] = field(
        default_factory=list,
        metadata={
            "type": "Element",
        }
    )


@dataclass
class Catalog:
    class Meta:
        name = "catalog"

    books: Optional[Books] = field(
        default=None,
        metadata={
            "type": "Element",
        }
    )


@dataclass
class Models:
    class Meta:
        name = "models"

    catalog: Optional[Catalog] = field(
        default=None,
        metadata={
            "type": "Element",
        }
    )

model_con.py

import logging
import sys
import json
from xsdata.formats.dataclass.context import XmlContext
from xsdata.formats.dataclass.parsers import JsonParser
from xsdata.formats.dataclass.parsers.config import ParserConfig
from models.models import Models

from xml_to_json import XmlToJson


class Utilities:
    @staticmethod
    def json_to_model(metadata_json: str) -> Models:
        logger = logging.getLogger(__name__)
        logger.info('%s: Converting JSON to model', sys._getframe().f_code.co_name)
        context: XmlContext = XmlContext()
        config = ParserConfig(fail_on_unknown_attributes=False, fail_on_unknown_properties=False)
        parser: JsonParser = JsonParser(context=context, config=config)
        extract_model: Models = parser.from_string(metadata_json, Models)
        return extract_model


if __name__ == "__main__":
    extract_meta_path = r"C:\XSDATA\sample.xml"
    xtj = XmlToJson()
    xtj.xml_file_name = extract_meta_path
    # print(xtj.json_data)
    ut = Utilities()
    md = ut.json_to_model(metadata_json=xtj.json_data)
    print(md)
   

When "books" contains only one "book", the model does not pick up its value; when "books" contains more than one "book", the values are returned in the model as expected. Here is an example.

sample.json with only one book:

{
    "catalog": {
        "books": {
            "book": {
                "@id": "bk101",
                "author": "Gambardella, Matthew",
                "title": "XML Developer's Guide",
                "genre": "Computer",
                "price": "44.95",
                "publish_date": "2000-10-01",
                "description": "An in-depth look at creating applications \n      with XML."
            }
        }
    }
}

Model Generated: Models(catalog=Catalog(books=Books(book=[])))

With the earlier sample.json, which contains multiple book entries inside books, the generated model is:

Models(catalog=Catalog(books=Books(book=[Book(id='bk101', author='Gambardella, Matthew', title="XML Developer's Guide", genre='Computer', price=44.95, publish_date=XmlDate(2000, 10, 1), description='An in-depth look at creating applications \n      with XML.'), Book(id='bk101', author='Gambardella, Matthew', title="XML Developer's Guide", genre='Computer', price=44.95, publish_date=XmlDate(2000, 10, 1), description='An in-depth look at creating applications \n      with XML.')])))

I don't understand why the model cannot fetch the values when there is a single book inside books, while List[Book] works fine when there are multiple book entries. What change can I make to the Models classes to handle this case?

  • note that `books` as a dictionary can only have one value for any given key (`book`) and this structure will not work for you. `books` wants to either be a list or the keys of `books` need to be something that incorporates/is the book id. – JonSG Apr 04 '23 at 13:55
  • Can you elaborate more about it please? @JonSG – Ayush Singh Apr 04 '23 at 13:59
  • Try creating a variable in python based on what you posted in as your "sample.json". You will see that the resulting dictionary only retains the second book as the key `"book"` inside `"books"` can only point to one thing. – JonSG Apr 04 '23 at 14:08
  • But having book more than one time is working and is getting in the model as well, why? – Ayush Singh Apr 04 '23 at 14:36
  • Here is what the result was when I tried to have multiple book under books Models(catalog=Catalog(books=Books(book=[Book(id='bk101', author='Gambardella, Matthew', title="XML Developer's Guide", genre='Computer', price=44.95, publish_date=XmlDate(2000, 10, 1), description='An in-depth look at creating applications \n with XML.'), Book(id='bk101', author='Gambardella, Matthew', title="XML Developer's Guide", genre='Computer', price=44.95, publish_date=XmlDate(2000, 10, 1), description='An in-depth look at creating applications \n with XML.')]))) – Ayush Singh Apr 04 '23 at 14:37
  • This is the case when we had only one book under books Models(catalog=Catalog(books=Books(book=[]))) – Ayush Singh Apr 04 '23 at 14:37

1 Answer


There seems to be a problem in the XML-to-JSON conversion. Every book entry is mapped against the same key, book, so that key has to hold a list of entries.

You can always verify your conversion with a free online conversion tool.

This is the correct conversion of the sample XML data to JSON:

{
   "catalog":{
      "books":{
         "book":[
            {
               "@id":"bk101",
               "author":"Gambardella, Matthew",
               "title":"XML Developer's Guide",
               "genre":"Computer",
               "price":"44.95",
               "publish_date":"2000-10-01",
               "description":"An in-depth look at creating applications \n      with XML."
            },
            {
               "@id":"bk101",
               "author":"Gambardella, Matthew",
               "title":"XML Developer's Guide",
               "genre":"Computer",
               "price":"44.95",
               "publish_date":"2000-10-01",
               "description":"An in-depth look at creating applications \n      with XML."
            }
         ]
      }
   }
}

This replicates your data model, where

  • catalog is made of books
  • books is a list of book

The extended solution for your problem, which converts the XML to JSON correctly:

import xmltodict

doc_2 = """<?xml version="1.0"?>
<catalog>
    <books>
        <book id="bk101">
            <author>Gambardella, Matthew</author>
            <title>XML Developer's Guide</title>
            <genre>Computer</genre>
            <price>44.95</price>
            <publish_date>2000-10-01</publish_date>
            <description>An in-depth look at creating applications 
      with XML.</description>
        </book>
    </books>
</catalog>"""

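def parse_single_elements_to_list(obj):
    # xmltodict maps a lone <book> to a mapping; wrap it so "book" is always a list.
    # isinstance also covers the OrderedDict that older xmltodict versions return.
    book = obj["catalog"]["books"]["book"]
    if isinstance(book, dict):
        obj["catalog"]["books"]["book"] = [book]
    return obj

doc = parse_single_elements_to_list(xmltodict.parse(doc_2))
print(doc)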

This returns the following structure (shown here formatted as JSON):

{
   "catalog":{
      "books":{
         "book":[
            {
               "@id":"bk101",
               "author":"Gambardella, Matthew",
               "title":"XML Developer's Guide",
               "genre":"Computer",
               "price":"44.95",
               "publish_date":"2000-10-01",
               "description":"An in-depth look at creating applications \n      with XML."
            }
         ]
      }
   }
}
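As an alternative to post-processing the parsed dictionary, xmltodict.parse accepts a force_list argument naming the elements that should always be parsed as lists. The sketch below assumes that book is the only repeatable element and uses a shortened, hypothetical version of the sample XML.

```python
import xmltodict

# Shortened, hypothetical single-book document for illustration.
single_book_xml = """<?xml version="1.0"?>
<catalog>
    <books>
        <book id="bk101">
            <author>Gambardella, Matthew</author>
            <title>XML Developer's Guide</title>
        </book>
    </books>
</catalog>"""

# Default behaviour: a lone <book> is parsed as a single mapping.
default_doc = xmltodict.parse(single_book_xml)

# With force_list, <book> is parsed as a list even when it occurs once.
# Pass a tuple of element names, not a bare string.
forced_doc = xmltodict.parse(single_book_xml, force_list=("book",))

print(type(default_doc["catalog"]["books"]["book"]).__name__)
print(type(forced_doc["catalog"]["books"]["book"]).__name__)
```

With this option the JSON produced from the XML has the same shape for one book and for many, so the List[Book] field receives a list in both cases.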
Kayvan Shah
  • Hi Kavyan, for more than one value it is getting treated as a List of dictionary but for one record it is getting treated as a dict rather than list of dict. Please try it with the following XML ''' Gambardella, Matthew XML Developer's Guide Computer 44.95 2000-10-01 An in-depth look at creating applications with XML. ''' – Ayush Singh Apr 05 '23 at 04:49
  • That's how XML works - and conversion takes place according to that - for a single element, it is treated as a dictionary, and if more elements are present, it is treated as a list. For your particular use case, you shall write a custom parser that converts those to list – Kayvan Shah Apr 05 '23 at 10:24