
I'm trying to use voluptuous to validate JSON input from an HTTP request. However, it doesn't seem to handle unicode strings too well.

from voluptuous import Schema, Required
from pprint import pprint

schema = Schema({
    Required('name'): str,
    Required('www'): str,
})

data = {
    'name': 'Foo',
    'www': u'http://www.foo.com',
}

pprint(data)
schema(data)

The above code generates the following error:

 voluptuous.MultipleInvalid: expected str for dictionary value @ data['www']

However, if I remove the u notation from the URL, everything works fine. Is this a bug or am I doing it wrong?

P.S. I'm using Python 2.7, in case that matters.

Simeon Visser
lang2
  • Why are you using unicode for your url? – Ward Jul 07 '15 at 15:18
  • @WardC I wasn't doing that deliberately. The above is a minimal version to narrow down the problem. In the real code, the data starts out as plain strings that pass through nodetest and somehow end up as unicode by the time they reach voluptuous. – lang2 Jul 07 '15 at 15:22

1 Answer


There are two string types in Python 2.7: str and unicode. In Python 2.7 the str type is not a Unicode string, it is a byte string.

So the value u'http://www.foo.com' is indeed not an instance of type str, which is why you're getting that error. If you wish to support both str and unicode strings in Python 2.7, you'd need to change your schema to:

from voluptuous import Any, Schema, Required

schema = Schema({
    Required('name'): Any(str, unicode),
    Required('www'): Any(str, unicode),
})

Or, for simplicity, if you always receive Unicode strings then you can use:

schema = Schema({
    Required('name'): unicode,
    Required('www'): unicode,
})
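
The type mismatch described above can be checked without voluptuous. The following is a minimal sketch, not part of the original answer. It relies on `bytes` being an alias for `str` in Python 2.7, so the same lines behave the same way on Python 2.7 and Python 3. The names `value`, `raw`, `is_byte_string` and `accepted` are made up for illustration.

```python
# Illustrative sketch: why a byte-string check rejects a Unicode value.
value = u'http://www.foo.com'   # Unicode text (unicode on Py2, str on Py3)
raw = value.encode('utf-8')     # byte string (str on Py2, bytes on Py3)

# On Python 2.7 `bytes is str`, so this mirrors the failing check
# behind `Required('www'): str` in the question.
is_byte_string = isinstance(value, bytes)
print(is_byte_string)           # False

# Accepting either type mirrors `Any(str, unicode)` on Python 2.7.
# type(u'') is the Unicode text type of the running interpreter.
accepted = isinstance(value, (bytes, type(u'')))
print(accepted)                 # True
print(isinstance(raw, bytes))   # True
```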
Simeon Visser
  • Any ideas how to support both Python 2 and 3? As Python 3 does not have a `unicode` type. – zormit Mar 29 '17 at 02:58
  • @zormit: You could use `six.string_types` from the `six` library: https://pythonhosted.org/six/#six.string_types. This means you can write: `Any(*six.string_types)` which works both in Python 2 and 3. If you don't want to install a separate library then you could copy the way `string_types` gets calculated from that project's source code: https://github.com/benjaminp/six/blob/master/six.py#L41-L49 – Simeon Visser Mar 29 '17 at 09:27
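
For readers who prefer not to depend on `six`, the calculation mentioned in the last comment can be approximated as below. This is a rough sketch modelled on how `six` defines `string_types`, not a copy of its source; the helper `is_text` is hypothetical and only there to show the tuple in use.

```python
import sys

# Pick the string base type(s) for the running interpreter,
# in the same spirit as six.string_types.
if sys.version_info[0] >= 3:
    string_types = (str,)
else:
    # Python 2 only: basestring is the common base of str and unicode.
    string_types = (basestring,)  # noqa: F821


def is_text(value):
    """Hypothetical helper: True if value is a string on this Python version."""
    return isinstance(value, string_types)


print(is_text(u'http://www.foo.com'))  # True
print(is_text(42))                     # False
```

With voluptuous installed, the same tuple could be passed as `Any(*string_types)` in the schema, as the comment suggests for `six.string_types`.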