I'm working on a SOAP wrapper for an API (I know REST exists, but this is a project at my job), and I'm using the suds library for it.

I found this question and the answer helped me a lot. After trying a couple of things and slightly modifying the script that question pointed me to, I got the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0x82 in position 2159: ordinal not in range(128)

This occurs when sending the request to the SOAP endpoint. I tried to normalize the data using unicodedata with the NFKD option, but I'm still getting the same error.

Since this is not plain text but actual data from an audio file, I suspect that modifying it would corrupt it, so I don't know what to do here.

This is what I'm doing; it isn't too different from the procedure in the other question:

import uuid

# Read the audio file as raw bytes; the with-block closes it afterwards.
with open('path/to/the/file.mp3', 'rb') as audio_file:
    audio_data = audio_file.read()
mime_type = 'audio/mpeg'
bin_param = (audio_data, uuid.uuid4(), mime_type)

with_soap_attachment(client.service.set_greeting, bin_param, request_data)

As far as I understand, the script turns all the input arguments into a SOAP message string, wraps it in a Request object, and sends it to the SOAP endpoint. So the problem seems to be that audio_data contains bytes that are not valid ASCII.

Any clue here?

EDIT: 04/04/2013

Here is the actual code of the wrapper

#coding: utf-8
from suds.transport import Request
import re
import uuid

def with_soap_attachment(suds_method, attachment_data, soap_location, *args, **kwargs):
    MIME_DEFAULT = 'text/plain'
    attachment_transfer_encoding = 'binary'
    soap_method = suds_method.method

    if len(attachment_data) == 3:
        data, attachment_id, attachment_mimetype = attachment_data
    elif len(attachment_data) == 2:
        data, attachment_mimetype = attachment_data
        attachment_id = uuid.uuid4()
    elif len(attachment_data) == 1:
        # Unpack the single element; assigning the tuple itself would break the join below.
        data = attachment_data[0]
        attachment_mimetype = MIME_DEFAULT
        attachment_id = uuid.uuid4()

    soap_client = suds_method.clientclass(kwargs)
    binding = soap_method.binding.input
    soap_xml = binding.get_message(soap_method, args, kwargs)

    boundary_id = 'uuid:%s' % uuid.uuid4()
    root_part_id ='uuid:%s' % uuid.uuid4()
    request_headers = {
      'Content-Type': '; '.join([
          'multipart/related',
          'type="text/xml"',
          'start="<%s>"' % root_part_id,
          'boundary="%s"' % boundary_id,
        ]),
    }
    soap_headers = '\n'.join([
      'Content-Type: text/xml; charset=UTF-8',
      'Content-Transfer-Encoding: 8bit',
      'Content-Id: <%s>' % root_part_id,
      '',
    ])
    attachment_headers = '\n'.join([
      'Content-Type: %s' % attachment_mimetype,
      'Content-Transfer-Encoding: %s' % attachment_transfer_encoding,
      'Content-Id: <%s>' % attachment_id,
      '',
    ])

    request_text = '\n'.join([
      '',
      '--%s' % boundary_id,
      soap_headers,
      unicode(soap_xml),
      '--%s' % boundary_id,
      attachment_headers,
      data,
      '--%s--' % boundary_id
    ])

    location = soap_location

    headers = suds_method.client.options.headers.copy()
    headers.update(request_headers)
    request = Request(location, request_text)
    request.headers = headers

    response = suds_method.client.options.transport.send(request)
    return response

Here is the entire traceback

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
/home/israelord/.virtualenvs/ringtu-env/local/lib/python2.7/site-packages/django/core/management/commands/shell.pyc in <module>()
----> 1 soap_attachments.with_soap_attachment(service.set_menu_prompt, bin_param, AUTOATTENDANT_ENDPOINT, request)

/home/israelord/Work/4geeks/ringtu/ringtu/services/soap_attachments.py in with_soap_attachment(suds_method, attachment_data, soap_location, *args, **kwargs)
     54       attachment_headers,
     55       data,
---> 56       '--%s--' % boundary_id
     57     ])
     58 

UnicodeDecodeError: 'ascii' codec can't decode byte 0x82 in position 148: ordinal not in range(128)
iferminm
  • try replacing `audio_data = file.read()` with `audio_data = file.read().encode('utf8')`. See if that works. – Ionut Hulub Apr 02 '13 at 00:31
  • @IonutHulub `file.read()` already returns `str`. – wRAR Apr 02 '13 at 00:36
  • @wRAR You don't say? the function `encode('utf8')` encodes the string so that 'ascii' codec can decode it. My only concern is that the data will be corrupted and will become uninterpretable for later use. – Ionut Hulub Apr 02 '13 at 00:38
  • @IonutHulub you do not understand what `encode` does. – wRAR Apr 02 '13 at 00:39

1 Answer

Well, the wrapper doesn't support binary files, as it only uses Content-Transfer-Encoding: 8bit, which of course cannot work with binary files. You will probably need to modify it, consulting the SOAP-attachments specification.
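One way to approach that modification is to build the whole request body out of byte strings, so the attachment is never decoded. The following is only a sketch under assumptions: `build_multipart_body` and `to_bytes` are hypothetical names, not part of suds, and the CRLF line endings follow the MIME multipart convention rather than the `'\n'` used in the wrapper. Whether a given transport accepts the resulting byte string as the message body would need to be checked.

```python
def build_multipart_body(soap_xml, attachment, mimetype,
                         boundary, root_id, attachment_id):
    """Return a multipart/related body as one byte string (hypothetical helper)."""

    def to_bytes(value):
        # Encode text as UTF-8; pass byte strings through untouched.
        return value if isinstance(value, bytes) else value.encode('utf-8')

    crlf = b'\r\n'
    delimiter = to_bytes(u'--' + boundary)
    root_headers = [
        b'Content-Type: text/xml; charset=UTF-8',
        b'Content-Transfer-Encoding: 8bit',
        to_bytes(u'Content-Id: <%s>' % root_id),
        b'',  # blank line separating headers from the part body
    ]
    attachment_headers = [
        to_bytes(u'Content-Type: %s' % mimetype),
        b'Content-Transfer-Encoding: binary',
        to_bytes(u'Content-Id: <%s>' % attachment_id),
        b'',
    ]
    parts = (
        [delimiter] + root_headers + [to_bytes(soap_xml)]
        + [delimiter] + attachment_headers + [attachment]
        + [delimiter + b'--', b'']
    )
    # Every element is a byte string, so no implicit decoding can happen here.
    return crlf.join(parts)


# Usage with made-up values: the raw bytes survive unchanged in the body.
body = build_multipart_body(
    u'<Envelope/>', b'\x82\xff\x00', 'audio/mpeg',
    'uuid:b', 'uuid:r', 'uuid:a',
)
```

The point of the design is that text is encoded exactly once, at the edge, and the audio bytes are only ever concatenated with other bytes.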

wRAR
  • Thank you for your suggestion. I had tried what @IonutHulub said before, in one of those "desperate attempts", but it obviously didn't work. I'm not getting a transfer error from the server, because the request is never sent due to the encoding error. – iferminm Apr 02 '13 at 15:04
  • Do you have any suggestion on how to solve the encoding issue? – iferminm Apr 02 '13 at 15:05
  • Yes, I have to implement it by hand. I changed the Content-Transfer-Encoding from 8bit to binary in the attachment header and left it as 8bit in the SOAP header, as explained here http://www.w3.org/TR/SOAP-attachments. Thank you very much for that, but I still get the encoding error. – iferminm Apr 02 '13 at 17:24
  • @israelord try adding coding: utf-8 comment to the file, as the problem is apparently caused by joining the binary data with the '\n' literal which is ascii by default. – wRAR Apr 02 '13 at 17:30
  • Hi there, the file already has the #coding: utf-8 comment header. The issue is with the data itself, I debugged it and the problem is when it just concatenates the "data" variable, which contains the bytes from the mp3 file. I updated my question with the actual code of the wrapper. Thank you very much – iferminm Apr 04 '13 at 17:19
  • @israelord the issue is not with the data, but with the encoding used to decode them into `unicode`. Are you sure the header is present in the file with the wrapper? – wRAR Apr 04 '13 at 17:22
  • `from __future__ import unicode_literals` is the third error if you added it yourself. – wRAR Apr 04 '13 at 17:24
  • Still getting the same error, i just updated to the actual state of the code, thank you very much for your help – iferminm Apr 04 '13 at 18:11
  • Uh, I just noticed that `unicode(soap_xml)` item. This code is quite bad, as it mixes `str` and `unicode` vars. You need to make all those vars and literals either `str` or `unicode`, depending on what type the result passed to `Request` should be. – wRAR Apr 04 '13 at 19:04
  • yes, it used to be str(soap_xml) and a partner suggested to change it that way. I'll check the Request parameters. – iferminm Apr 04 '13 at 19:14
  • Dude, I made it work. I had a function pre-processing some data that was returning a unicode object, and that was the problem. Thanks A LOT! – iferminm Apr 04 '13 at 19:51
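
The resolution in the last comment fits how Python 2 treats mixed string types: when a `str` and a `unicode` object are joined, the `str` is first decoded with the ASCII codec, which fails on any byte above 0x7F. The sketch below performs that decode explicitly so it behaves the same on Python 2 and Python 3; the sample bytes are made up for illustration and are not taken from the original file.

```python
# Python 2 runs this decode implicitly whenever a byte string (str) is
# combined with a unicode object, for example inside '\n'.join([...]).
audio_chunk = b'ID3\x82\x00'  # made-up bytes; 0x82 is outside the ASCII range

try:
    audio_chunk.decode('ascii')
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)
```

The message names the same codec and byte value as the traceback in the question, which suggests the binary data itself was fine and that a single `unicode` element in the joined list was enough to trigger the failure.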