-1

I have an XML file like this:

<?xml version="1.0" encoding="UTF-8"?>
<items>
  <item type="dict">
    <job_id type="str">id-1</job_id>
    <title type="str">title-1</title>
    <desc type="str">desc-1</desc>
  </item>
  <item type="dict">
    <job_id type="str">id-2</job_id>
    <title type="str">title-2</title>
    <desc type="str">desc-2</desc>
  </item>
</items>

I want to parse this into a dictionary so that I can access each job by its ID. The dictionary key will be the job_id, and the whole job definition will be the corresponding value.

Here is what I have tried:

import xmltodict

class Job:
    def __init__(self):
        self.path = "path-to-xml-file"
        self.job_dict = {}

    def load_jobs(self, env, path):
        file = read_from_s3(env, path) # reads the job file from S3 bucket
        parsed = xmltodict.parse(file) # renamed from `dict` to avoid shadowing the built-in

        for item in parsed['items']['item']:
            key = item['job_id']
            self.job_dict[key] = item # <-- I get exception on this line

I get the following exception when I try to add an element to the dictionary:

[Failure instance: Traceback: <class 'TypeError'>: unhashable type: 'collections.OrderedDict'

Also in the watch window, this is what I see for item:

enter image description here

and this is what I see for key:

enter image description here

Hooman Bahreini
  • You should debug or print the return value of xmltodict.parse; you'll then see that the dictionary structure is not what you thought it would be. – DarkKnight May 28 '22 at 08:19

1 Answer

1

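item['job_id'] is a dict, not a string: because the element has a type attribute, xmltodict returns an OrderedDict with the keys '@type' and '#text'. You can't use that as a key in self.job_dict.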

Change it to key = item['job_id']['#text'] instead.
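
As a minimal sketch of that change: the `parsed` dictionary below is written by hand to mirror what xmltodict.parse returns for the XML in the question (attributes under '@'-prefixed keys, element text under '#text'). It stands in for the real parse call, and the names `parsed` and `job_dict` are illustrative only.

```python
# Hand-written stand-in for xmltodict.parse(file) on the question's XML.
# Elements that carry attributes become dicts: '@type' holds the attribute
# and '#text' holds the element text.
parsed = {
    "items": {
        "item": [
            {
                "@type": "dict",
                "job_id": {"@type": "str", "#text": "id-1"},
                "title": {"@type": "str", "#text": "title-1"},
                "desc": {"@type": "str", "#text": "desc-1"},
            },
            {
                "@type": "dict",
                "job_id": {"@type": "str", "#text": "id-2"},
                "title": {"@type": "str", "#text": "title-2"},
                "desc": {"@type": "str", "#text": "desc-2"},
            },
        ]
    }
}

job_dict = {}
for item in parsed["items"]["item"]:
    key = item["job_id"]["#text"]  # a plain str, which is hashable
    job_dict[key] = item

print(job_dict["id-2"]["title"]["#text"])
```

Note that when the file contains only one item element, xmltodict returns a single dict rather than a list for ['items']['item'], so the loop would iterate over keys instead of jobs; the force_list argument of xmltodict.parse is meant for that case.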


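To better understand the error: an object used as a dictionary key must implement the magic method __hash__(). In other words, a key must be hashable, because the dictionary uses the hash to find entries efficiently. Mutable containers such as dict and OrderedDict are not hashable, which is why the assignment fails.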
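
The hashability rule can be reproduced without any XML at all. The snippet below is only an illustration: `job_id` imitates the value that item['job_id'] holds in the question, and `jobs` is a hypothetical target dictionary.

```python
from collections import OrderedDict

# Imitates what item['job_id'] looks like in the question.
job_id = OrderedDict([("@type", "str"), ("#text", "id-1")])
jobs = {}

try:
    jobs[job_id] = "job definition"  # OrderedDict is not hashable
    error = None
except TypeError as exc:
    error = str(exc)  # the same kind of TypeError as in the question

print(error)

# The text inside the OrderedDict is a str, which is hashable.
jobs[job_id["#text"]] = "job definition"
print(jobs)
```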

MohitC
  • Thanks... I followed [this](https://stackoverflow.com/questions/40154727/how-to-use-xmltodict-to-get-items-out-of-an-xml-file/40157811#40157811) answer, and I am not sure whether I am missing something, because according to that answer I wouldn't need `['#text']`. – Hooman Bahreini May 28 '22 at 08:24
  • @HoomanBahreini The XML in the mentioned question does not contain additional attributes like `type` in your case. That's why xmltodict adds an extra layer to accommodate the type. Print your parsed dict and you will see the complete structure. – MohitC May 28 '22 at 08:27
  • Your answer solves the problem though... just not sure if there is a better approach? – Hooman Bahreini May 28 '22 at 08:27
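
On the question of another approach raised in the last comment, one possibility (a sketch, not something proposed in the thread) is to skip xmltodict and build the dictionary with the standard library's xml.etree.ElementTree, which keeps attributes and text separate so no '#text' layer appears. The XML is inlined here as `xml_bytes` in place of the S3 read, and the type attributes are simply ignored.

```python
import xml.etree.ElementTree as ET

# Inlined copy of the question's XML; in the real code this would come
# from the S3 read instead.
xml_bytes = b"""<?xml version="1.0" encoding="UTF-8"?>
<items>
  <item type="dict">
    <job_id type="str">id-1</job_id>
    <title type="str">title-1</title>
    <desc type="str">desc-1</desc>
  </item>
  <item type="dict">
    <job_id type="str">id-2</job_id>
    <title type="str">title-2</title>
    <desc type="str">desc-2</desc>
  </item>
</items>
"""

root = ET.fromstring(xml_bytes)

job_dict = {}
for item in root.findall("item"):
    # Map each child tag to its text; attributes such as type are dropped.
    job = {child.tag: child.text for child in item}
    job_dict[job["job_id"]] = job

print(job_dict["id-1"])
```

This works the same way whether the file has one item or many, because findall always returns a list.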