I generally use the Marshmallow project to handle JSON serialisation, deserialisation, and validation. When combined with marshmallow-dataclass or, when using SQLAlchemy database models, marshmallow-sqlalchemy, you can produce Marshmallow schemas straight from existing object definitions. You then work with instances of the models themselves: dataclass instances or SQLAlchemy ORM model instances.
Marshmallow schemas also let you define what happens with extra values in the JSON document; you can ignore these, or throw an exception for them, and vary this per model (models can be nested as needed). You can reuse schemas for subsets of the fields too.
Your small sample model, using marshmallow-dataclass, could be defined as:

    import marshmallow
    from marshmallow_dataclass import dataclass
    from typing import List

    class BaseSchema(marshmallow.Schema):
        class Meta:
            unknown = marshmallow.EXCLUDE

    @dataclass(base_schema=BaseSchema)
    class Address:
        city: str
        postcode: str

    @dataclass(base_schema=BaseSchema)
    class Person:
        name: str
        addresses: List[Address]

and apart from pip install marshmallow-dataclass before attempting to run the above, that's it. This example uses an explicit base schema to set the unknown configuration to EXCLUDE, which means: ignore extra attributes in the JSON when loading.
To deserialise from JSON data, or to serialise to JSON, create an instance of the schema; each dataclass class has a Schema attribute referencing the corresponding (generated) Marshmallow schema class:

    >>> schema = Person.Schema()
    >>> json = '{ "name": "John", "addresses": [{ "postcode": "EC2 2FA", "city": "London" }, { "city": "Paris", "postcode": "545887", "extra_attribute": "" }]}'
    >>> p = schema.loads(json)
    >>> p
    Person(name='John', addresses=[Address(city='London', postcode='EC2 2FA'), Address(city='Paris', postcode='545887')])
    >>> print(type(p))  # should print Person
    <class '__main__.Person'>
    >>> for a in p.addresses:
    ...     print(type(a))  # prints Address
    ...     print(a.city)   # should print London then Paris
    ...
    <class '__main__.Address'>
    London
    <class '__main__.Address'>
    Paris
    >>> schema.dumps(p)
    '{"name": "John", "addresses": [{"postcode": "EC2 2FA", "city": "London"}, {"postcode": "545887", "city": "Paris"}]}'

The Schema.loads() and Schema.dumps() methods accept and produce JSON strings. You can also work with plain Python dictionaries and lists (the types that would be serialisable to JSON using the standard library json module), via Schema.load() and Schema.dump().
For more complex setups you may need to configure the exact validation rules for fields, or exclude some fields from serialisation. You do this with the standard dataclasses.field() function, passing in Marshmallow field options via the metadata argument. marshmallow-dataclass can work out what exact Marshmallow field type to use, but you can always override this. You can also use the marshmallow_dataclass.NewType() function to create reusable definitions; SomeType = NewType("SomeType", python_type, field=MarshmallowField, **field_args) lets you annotate dataclass fields as field_name: SomeType throughout your project.
Marshmallow is, at least for me, the Swiss Army knife of serialisation and deserialisation, and there are lots of projects that integrate with Marshmallow. E.g. I'm looking at building several RESTful APIs for a customer at the moment, and I'll definitely be using Flask-Smorest to define the API endpoints and generate OpenAPI documentation at the same time. And all I have to do is create the SQLAlchemy models for this, really.
Here is an example Flask application based on your Person & Address schema, but defined as SQLAlchemy models and served as a RESTful API:

    # pip install Flask flask-marshmallow flask-smorest flask-sqlalchemy marshmallow-sqlalchemy
    import marshmallow
    from flask import Flask
    from flask.views import MethodView
    from flask_marshmallow import Marshmallow
    from flask_smorest import Api, Blueprint, abort
    from flask_sqlalchemy import SQLAlchemy

    app = Flask(__name__)
    app.config['API_TITLE'] = 'ContactBook'
    app.config['API_VERSION'] = 'v1'
    app.config['OPENAPI_VERSION'] = '3.0.3'
    app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///:memory:'

    api = Api(app)
    db = SQLAlchemy(app)
    ma = Marshmallow(app)

    class Address(db.Model):
        id = db.Column(db.Integer, primary_key=True)
        city = db.Column(db.String)
        postcode = db.Column(db.String)
        person_id = db.Column(db.Integer, db.ForeignKey('person.id'), nullable=False)

    class Person(db.Model):
        id = db.Column(db.Integer, primary_key=True)
        name = db.Column(db.String)
        addresses = db.relationship('Address', backref='person', lazy=True)

    # create tables in the (in-memory, temporary) database; newer
    # Flask-SQLAlchemy releases require an active application context for this
    with app.app_context():
        db.create_all()

    class BaseSQLAlchemyAutoSchema(ma.SQLAlchemyAutoSchema):
        def update(self, instance, **data):
            # copy only the fields that were actually supplied onto the instance
            for fname in self.fields:
                if fname not in data:
                    continue
                setattr(instance, fname, data.get(fname))

    class AddressSchema(BaseSQLAlchemyAutoSchema):
        class Meta:
            table = Address.__table__

    class PersonSchema(BaseSQLAlchemyAutoSchema):
        class Meta:
            table = Person.__table__

        addresses = ma.List(ma.Nested(AddressSchema(unknown=marshmallow.EXCLUDE)))

    class PersonQueryArgsSchema(ma.Schema):
        name = ma.String()
        city = ma.String()

    blp = Blueprint(
        "people", "people", url_prefix="/people", description="Operations on people"
    )

    @blp.route("/")
    class People(MethodView):
        @blp.arguments(PersonQueryArgsSchema, location="query")
        @blp.response(200, PersonSchema(many=True))
        def get(self, args):
            """List people"""
            query = Person.query
            if args.get("name"):
                query = query.filter(Person.name == args["name"])
            if args.get("city"):
                query = query.filter(Person.addresses.any(Address.city == args["city"]))
            return query

        @blp.arguments(PersonSchema(unknown=marshmallow.EXCLUDE))
        @blp.response(201, PersonSchema)
        def post(self, new_person):
            """Add a new person"""
            addresses = new_person.pop("addresses", ())
            person = Person(**new_person)
            for address in addresses:
                person.addresses.append(Address(**address))
            db.session.add(person)
            db.session.commit()
            return person

    @blp.route("/<person_id>")
    class PersonById(MethodView):
        @blp.response(200, PersonSchema)
        def get(self, person_id):
            """Get person by ID"""
            return Person.query.get_or_404(person_id)

        @blp.arguments(PersonSchema(unknown=marshmallow.EXCLUDE, exclude=('addresses',)))
        @blp.response(200, PersonSchema)
        def put(self, updated_person_data, person_id):
            """Update existing person"""
            person = Person.query.get_or_404(person_id)
            PersonSchema().update(person, **updated_person_data)
            db.session.commit()
            return person

        @blp.response(204)
        def delete(self, person_id):
            """Delete person"""
            db.session.delete(Person.query.get_or_404(person_id))
            # commit, otherwise the deletion is never persisted
            db.session.commit()

    api.register_blueprint(blp)

Voilà, a full-featured REST API that lets us list, create, update and delete Person entries.