
I am writing a simple REST API in Python using Flask-RESTful, and its documentation says the maintainers plan to deprecate the built-in object serialization (reqparse) in favor of serializers like marshmallow. My API reads from and writes to a MongoDB document store using Flask-MongoEngine.

I would very much appreciate an example of a use case where I would choose an external serializer such as marshmallow over the serializers built into MongoEngine's Document objects.

2 Answers

The essential difference is that marshmallow does validation.

You don't just take any data from the Internet and stuff it into your database. Validation prevents entering wrong data (malicious or erroneous). Even if the data comes from a trusted user, it is a good idea to validate it to ensure database integrity.

Marshmallow, like flask-restplus, provides validators that check not only types but also values (min/max for numbers, min/max length for strings, min/max for dates, etc.). You can even create your own validators.

Also, an API is not always pure CRUD. There may be some business code between the API and the DB for which it is nice to have Python objects. Mongo's BSON parser won't do that.

MongoEngine provides validation too, but it runs just before the write to the DB, whereas validation should happen as data enters the API.


BTW, the internal [de|]serialization in flask-restful has been slated for deprecation for a while now, and things seem stalled (GH issue #9). I think there are people out there using flask-restplus + marshmallow, so it may be a way to go.

Here's an alternative:

  • Use Marshmallow for I/O [de|]serialization
  • Use marshmallow-mongoengine to create your marshmallow API schemas as automatically as possible from your MongoEngine schemas
  • Use webargs to parse arguments (inject flask request arguments into marshmallow schemas)
  • Use apispec to document the spec following OpenAPI standard
  • To make things easier, use flask-smorest to hide the webargs/apispec layers and provide a nice interface.

This combination of libraries is not as mature or feature-rich as the monolithic flask-restplus, but using marshmallow is nice because it is a great lib and because of the DRYness provided by marshmallow-mongoengine.


µMongo is an alternative to MongoEngine that is based on marshmallow, so it is like MongoEngine with marshmallow-mongoengine included.

Its documentation has a schema that illustrates the different stages of validation: API between client and business objects, and ODM between objects and DB.


(Disclaimer: marshmallow, webargs, apispec and flask-rest-api maintainer, µmongo, mongoengine and flask-mongoengine contributor.)

Jérôme

Mongo uses BSON, and there is a dedicated parser utility for it implemented in Python (`bson.json_util`).
From the source:

Deserialization:

```python
from bson.json_util import loads

# Parse MongoDB Extended JSON into Python/BSON types such as Code and Binary.
loads('[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$scope": {}, "$code": "function x() { return 1; }"}}, {"bin": {"$type": "80", "$binary": "AQIDBA=="}}]')

# >>> [{u'foo': [1, 2]}, {u'bar': {u'hello': u'world'}}, {u'code': Code('function x() { return 1; }', {})}, {u'bin': Binary('...', 128)}]
```

Serialization:

```python
from bson import Binary, Code
from bson.json_util import dumps

# Serialize BSON types to MongoDB Extended JSON.
# The four bytes below are what the "AQIDBA==" in the output encodes.
dumps([{'foo': [1, 2]},
       {'bar': {'hello': 'world'}},
       {'code': Code("function x() { return 1; }", {})},
       {'bin': Binary(b"\x01\x02\x03\x04")}])

# >>> '[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$code": "function x() { return 1; }", "$scope": {}}}, {"bin": {"$binary": "AQIDBA==", "$type": "00"}}]'
```

When the object you are trying to serialize or deserialize contains BSON types, you need to use mongo's `dumps` and `loads`, or it won't be parsed correctly. When it is regular JSON, you can use whichever you like.
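
To see why plain JSON tooling falls short for such values, here is a small standard-library-only sketch. The document contents and the `extended_default` hook are hypothetical; the hook only imitates the general `{"$date": ...}` shape and is not how `bson.json_util` is actually implemented.

```python
import json
from datetime import datetime, timezone

# Hypothetical document holding a date, as a MongoDB record often does.
doc = {
    "name": "report",
    "created": datetime(2018, 10, 3, 10, 55, tzinfo=timezone.utc),
}

# The standard json module has no encoding for datetime values.
try:
    json.dumps(doc)
    plain_json_ok = True
except TypeError:
    plain_json_ok = False


def extended_default(value):
    # Illustrative fallback only: wrap dates in a {"$date": ...} object.
    if isinstance(value, datetime):
        return {"$date": value.isoformat()}
    raise TypeError(f"Cannot serialize {type(value).__name__}")


encoded = json.dumps(doc, default=extended_default)
decoded = json.loads(encoded)
```

Writing and maintaining hooks like this for every BSON type is the work that mongo's `dumps` and `loads` already do.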

Maor Refaeli
  • If I understand you correctly, what you are suggesting is another way to do the same thing, i.e. serialization and deserialization. I am trying to understand the merits of one method over the other. Could you please elaborate? – Arjun Venkatraman Oct 03 '18 at 10:55
  • Ah great, edit answered! Thanks. So it's a preference based thing then! – Arjun Venkatraman Oct 03 '18 at 11:04
  • When it's a BSON with BSON properties such as ObjectId or dates it's preferred to use mongo `dumps` and `loads`. You can [customize marshmallow](https://stackoverflow.com/q/28093824/1918287) to work with BSON but I think it's an awful lot of work for something that can be achieved with no effort whatsoever. – Maor Refaeli Oct 03 '18 at 11:08