
In my Django application I make a SOAP request using the suds library. The response I receive looks like this:

productdata = '''<Root>
       <Header>
          <User>User</User>
          <Password>Password</Password>
          <OperationType>Response</OperationType>
       </Header>
       <Main>
          <Hotel>
             <HotelName>HotelName1</HotelName>
             <TotalPrice>100</TotalPrice>
             <Location>My Location</Location>
          </Hotel>
          <Hotel>
             <HotelName>HotelName2</HotelName>
             <TotalPrice>100</TotalPrice>
             <Location>My Location</Location>
          </Hotel>
       </Main>
    </Root>'''

After that I deserialize this data and save to the database. This is how I deserialize data:

from collections import defaultdict

def etree_to_dict(t):
  d = {t.tag: {} if t.attrib else None}
  children = list(t)
  if children:
    dd = defaultdict(list)
    for dc in map(etree_to_dict, children):
      for k, v in dc.iteritems():
        dd[k].append(v)
    d = {t.tag: {k:v[0] if len(v) == 1 else v for k, v in dd.iteritems()}}
  if t.attrib:
    d[t.tag].update(('@' + k, v) for k, v in t.attrib.iteritems())
  if t.text:
    text = t.text.strip()
    if children or t.attrib:
      if text:
        d[t.tag]['#text'] = text
    else:
      d[t.tag] = text
  return d
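
For reference, here is a minimal sketch of what this function produces for a trimmed-down version of the response. It uses `.items()` instead of `.iteritems()` so that it runs on both Python 2 and Python 3; that swap and the sample strings are assumptions made for illustration, not part of the original code.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

def etree_to_dict(t):
    # Same logic as above, with .items() so it also runs on Python 3.
    d = {t.tag: {} if t.attrib else None}
    children = list(t)
    if children:
        dd = defaultdict(list)
        for dc in map(etree_to_dict, children):
            for k, v in dc.items():
                dd[k].append(v)
        d = {t.tag: {k: v[0] if len(v) == 1 else v for k, v in dd.items()}}
    if t.attrib:
        d[t.tag].update(('@' + k, v) for k, v in t.attrib.items())
    if t.text:
        text = t.text.strip()
        if children or t.attrib:
            if text:
                d[t.tag]['#text'] = text
        else:
            d[t.tag] = text
    return d

# Hypothetical, shortened sample with two <Hotel> elements.
sample = (
    '<Root><Main>'
    '<Hotel><HotelName>HotelName1</HotelName><TotalPrice>100</TotalPrice></Hotel>'
    '<Hotel><HotelName>HotelName2</HotelName><TotalPrice>100</TotalPrice></Hotel>'
    '</Main></Root>'
)
result = etree_to_dict(ET.fromstring(sample))

# Repeated <Hotel> elements are collected into a list of dicts.
hotels = result['Root']['Main']['Hotel']
names = [h['HotelName'] for h in hotels]

# A single <Hotel> is stored as a plain dict, not a one-item list.
single = etree_to_dict(
    ET.fromstring('<Main><Hotel><HotelName>Only</HotelName></Hotel></Main>')
)
single_hotel = single['Main']['Hotel']
```

One consequence worth keeping in mind: when the response contains only one hotel, looping over `d['Root']['Main']['Hotel']` would iterate over the keys of that dict rather than over hotels.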

And here is how I save the data to the database:

e = ET.fromstring(productdata)
d = etree_to_dict(e)
hotels = d['Root']['Main']['Hotel']

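for p in hotels:
    # p is a plain dict for one <Hotel>; the fields belong on the model instance
    product = Product()
    product.hotelname = p['HotelName']
    product.totalprice = p['TotalPrice']
    product.location = p['Location']
    product.save()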

Everything works fine. But when I receive data that contains the Ü character in the Location tag, I get this error:

`UnicodeEncodeError`, `'ascii' codec can't encode character u'\xdc' in position 20134: ordinal not in range(128)`. `Unicode error hint: The string that could not be encoded/decoded was: ARK GÜELL A`. 

The Django traceback says that the problem is in this line:

e = ET.fromstring(productdata)
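
A likely explanation, offered as an assumption rather than something confirmed in this thread: on Python 2 the parser receives a `unicode` object and implicitly encodes it with the default `ascii` codec, which cannot represent `Ü`. The sketch below reproduces the same kind of failure with an explicit `ascii` encode; the sample string is hypothetical.

```python
# Hypothetical sample text containing U+00DC (Ü).
location = u'PARK G\xdcELL'

try:
    # An explicit ascii encode fails the same way the implicit one does.
    location.encode('ascii')
    message = None
except UnicodeEncodeError as exc:
    message = str(exc)

# Encoding to UTF-8 succeeds: Ü becomes the two bytes 0xC3 0x9C.
encoded = location.encode('utf-8')
```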

Can anybody help me solve this problem? Thanks a lot!

ilse2005
Zagorodniy Olexiy

1 Answer


I think you have to encode it to UTF-8 manually before parsing:

ElementTree.fromstring(productdata.encode('utf-8'))
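
A slightly more defensive variant of the same idea, as a sketch only: `parse_xml` is a hypothetical helper name, and the sample string is made up. It encodes text to UTF-8 bytes before parsing and passes bytes through unchanged. It assumes the payload has no XML declaration naming a different encoding, as in the response shown in the question.

```python
import xml.etree.ElementTree as ET

def parse_xml(payload):
    """Parse XML given as text or bytes (hypothetical helper).

    Text is encoded to UTF-8 first so the parser never has to fall
    back to the ascii codec; bytes are passed through unchanged.
    """
    if not isinstance(payload, bytes):
        payload = payload.encode('utf-8')
    return ET.fromstring(payload)

# Hypothetical sample containing U+00DC (Ü) in the Location tag.
sample = (
    u'<Root><Main><Hotel>'
    u'<Location>PARK G\xdcELL</Location>'
    u'</Hotel></Main></Root>'
)
root = parse_xml(sample)
location = root.find('Main/Hotel/Location').text
```

The element text comes back as a Unicode string, so it can be assigned to a model field as before.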
ilse2005
  • Thank you for your answer. I changed the line `e = ET.fromstring(productdata)` to `e = ET.fromstring(productdata.decode('utf-8'))`, but I still get the same error – Zagorodniy Olexiy Mar 03 '16 at 11:07
  • Sorry, I think I mixed up `encode` and `decode`. Please check with `encode` – ilse2005 Mar 03 '16 at 11:12